Chapter 6. Interacting with Forms
In earlier chapters, we downloaded static web pages that always return the same content. In this chapter, we will interact with web pages that depend on user input and state to return relevant content. This chapter will cover the following topics:
- Sending a POST request to submit a form
- Using cookies and sessions to log in to a website
- Using Selenium for form submissions
To interact with these forms, you'll need a user account to log in to the website. You can register an account manually at http://example.webscraping.com/user/register. Unfortunately, we can't automate the registration form until the next chapter, which deals with CAPTCHA images.
Note
Form methods
HTML forms define two methods for submitting data to the server: GET and POST. With the GET method, data such as ?name1=value1&name2=value2 is appended to the URL; this appended portion is known as a "query string". The browser sets a limit on the URL length, so this is only useful for small amounts of data. Additionally...
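To make the query string concrete, here is a minimal sketch of how the form data from the note could be encoded and appended to a URL, as a GET submission does. It assumes Python 3 and the standard library's urllib.parse module; the base URL http://example.com/form and the field names are placeholders for illustration, not a real form on the example website.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder form fields, mirroring the names used in the note above
form_data = {'name1': 'value1', 'name2': 'value2'}

# Placeholder form address (an assumption, not a real endpoint)
base_url = 'http://example.com/form'

# Encode the fields as key=value pairs joined by '&'
query_string = urlencode(form_data)

# A GET submission appends the query string to the URL after a '?'
url = base_url + '?' + query_string
print(url)

# The server side can recover the fields by parsing the query string
recovered = parse_qs(urlsplit(url).query)
print(recovered)
```

Note that parse_qs returns a list of values for each field name, because a query string may repeat the same name more than once.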