Download web pages with Python scripts
To download web pages from a web server, you can use the urllib
module, which is part of the standard Python library. urllib
includes functions for retrieving data from URLs.
Getting ready
To learn the basics, we can use the Python interactive terminal. Type python
in your Terminal window and press Enter. This opens the Python (Python 2.x) interactive terminal.
How to do it...
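The commands for doing this differ between Python 2.x and Python 3.x, mainly in the print
statement, which is a function in Python 3, and in the location of urlopen, which is urllib.urlopen in Python 2 and urllib.request.urlopen in Python 3. Please note these differences in syntax. This will be helpful in our upcoming recipes.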
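As a rough illustration of how the download looks on Python 3, here is a minimal sketch. The helper name fetch_page and its timeout parameter are hypothetical and chosen only for this example; urllib.request.urlopen itself is part of the standard library.

```python
import urllib.request


def fetch_page(url, timeout=10):
    """Return the raw bytes served at url (hypothetical helper for illustration)."""
    # The with-block closes the response automatically,
    # so no explicit close() call is needed.
    with urllib.request.urlopen(url, timeout=timeout) as webpage:
        # In Python 3, read() returns bytes rather than str.
        return webpage.read()


# Example usage (requires network access):
#     source = fetch_page("https://www.packtpub.com/")
#     print(len(source))  # print is a function in Python 3
```

Because read() returns bytes in Python 3, call decode() on the result, for example source.decode("utf-8"), when you need text and know the page's encoding.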
With Python 2
- First, import the required module, urllib:
>>> import urllib
- With the urlopen function, you can download the web page:
>>> webpage = urllib.urlopen("https://www.packtpub.com/")
- We can read the returned file-like object with the read method:
>>> source = webpage.read()
- Close the object when you are done with it:
>>> webpage.close()
- Now...