You're reading from Python Web Scraping Cookbook Over 90 proven recipes to get you scraping with Python, microservices, Docker, and AWS

Product type Paperback

Published in Feb 2018

Publisher Packt

ISBN-13 9781787285217

Length 364 pages

Edition 1st Edition

Languages

Python

Tools

AWS

Concepts

Data Mining

Author (1):

Michael Heydt

Table of Contents (18) Chapters

Title Page

Contributors

Packt Upsell

Preface

1. Getting Started with Scraping FREE CHAPTER

2. Data Acquisition and Extraction

3. Processing Data

4. Working with Images, Audio, and other Assets

5. Scraping - Code of Conduct

6. Scraping Challenges and Solutions

7. Text Wrangling and Analysis

8. Searching, Mining and Visualizing Data

9. Creating a Simple Data API

10. Creating Scraper Microservices with Docker

11. Making the Scraper as a Service Real

1. Other Books You May Enjoy

Leave a review - let other readers know what you think

Index

Checking Elasticsearch for a listing before scraping

Now let's use Elasticsearch as a cache: before scraping, we check whether the job listing is already stored, and if it is, we do not need to hit StackOverflow again. We extend the API for scraping a job listing so that it first searches Elasticsearch and, if the listing is found there, returns that data. This makes Elasticsearch a cache of job listings.

How to do it

We proceed with the recipe as follows:

The code for this recipe is within 09/05/api.py. The JobListing class now has the following implementation:

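# Imports assumed from the rest of api.py
from elasticsearch import Elasticsearch
from flask_restful import Resource

class JobListing(Resource):
    def get(self, job_listing_id):
        print("Request for job listing with id: " + job_listing_id)

        # With no arguments, older clients connect to localhost:9200
        es = Elasticsearch()
        # Cache check; doc_type is accepted by older client versions only
        if es.exists(index='joblistings', doc_type='job-listing', id=job_listing_id):
            print('Found the document in Elasticsearch')
            doc = es.get(index='joblistings', doc_type='job-listing', id=job_listing_id)
            return doc

        ...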
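
The listing above is cut off after the cache-hit branch. As a rough sketch of the overall check-then-scrape flow described at the start of this recipe, the following uses a small in-memory stand-in for the Elasticsearch client so that it runs without a server. The names `FakeStore`, `scrape_listing`, and `get_job_listing` are hypothetical and are not taken from api.py, and the recipe's actual code may differ. In particular, this sketch returns only the stored document body, whereas the listing above returns the whole response from `es.get`.

```python
class FakeStore:
    """In-memory stand-in with exists/get/index methods, loosely shaped
    like the Elasticsearch client calls used above (hypothetical helper)."""

    def __init__(self):
        self._docs = {}

    def exists(self, index, id):
        return (index, id) in self._docs

    def get(self, index, id):
        return {"_index": index, "_id": id, "_source": self._docs[(index, id)]}

    def index(self, index, id, body):
        self._docs[(index, id)] = body


def scrape_listing(job_listing_id):
    """Placeholder for the real scrape; returns canned data (assumption)."""
    return {"id": job_listing_id, "title": "placeholder"}


def get_job_listing(store, job_listing_id, scrape=scrape_listing):
    """Return the cached listing if present; otherwise scrape and cache it."""
    if store.exists(index="joblistings", id=job_listing_id):
        # Cache hit: no request to the remote site is needed.
        return store.get(index="joblistings", id=job_listing_id)["_source"]
    # Cache miss: scrape once, then store the result for later requests.
    listing = scrape(job_listing_id)
    store.index(index="joblistings", id=job_listing_id, body=listing)
    return listing
```

Passing the scrape function as a parameter keeps the caching logic separate from the scraper, so the flow can be exercised with a counting stub to confirm that a second request for the same id does not trigger another scrape.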
