Scraping a job listing from StackOverflow
Now let's pull some of this together to scrape information from a StackOverflow job listing. We will look at just one listing for now, so that we can learn the structure of these pages and how to pull information from them. In later chapters, we will look at aggregating results from multiple listings.
Getting ready
StackOverflow makes it quite easy to scrape data from their pages. We are going to use content from a posting at https://stackoverflow.com/jobs/122517/spacex-enterprise-software-engineer-full-stack-spacex?so=p&sec=True&pg=1&offset=22&cl=Amazon%3b+. This posting will likely not be available by the time you read this, so I've included the HTML of the page in the 07/spacex-job-listing.html file, which we will use for the examples in this chapter.
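Before going further, it can be worth confirming that the saved page is readable from your working directory. The following is a minimal sketch that uses only the Python standard library to print the page's title; it is a sanity check under stated assumptions, not the recipe's own code. The names TitleParser and read_title are hypothetical helpers introduced here, and the relative path assumes you run the script from the root of the book's code samples.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text found inside the page's <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append rather than assign.
        if self._in_title:
            self.title += data


def read_title(path):
    """Return the stripped <title> text of a saved HTML file."""
    html = Path(path).read_text(encoding="utf-8")
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


if __name__ == "__main__":
    # Path taken from the text; adjust it to wherever your copy lives.
    listing = Path("07/spacex-job-listing.html")
    if listing.exists():
        print(read_title(listing))
    else:
        print(f"{listing} not found; check your working directory")
```

If the script reports that the file is missing, adjust the path before moving on, since the rest of the recipe depends on this file.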
StackOverflow job listing pages are highly structured, probably because they are created by programmers, for programmers. The...