Chapter 8. Beating CAPTCHAs with Neural Networks
Images pose interesting and difficult challenges for data miners. Until recently, only small amounts of progress were made with analyzing images for extracting information. However recently, such as with the progress made on self-driving cars, significant advances have been made in a very short time-frame. The latest research is providing algorithms that can understand images for commercial surveillance, self-driving vehicles, and person identification.
There is lots of raw data in an image, and the standard method for encoding images - pixels - isn't that informative by itself. Images and photos can be blurry, too close to the targets, too dark, too light, scaled, cropped, skewed, or any other of a variety of problems that cause havoc for a computer system trying to extract useful information. Neural networks can combine these lower level features into higher level patterns that are more able to generalize and deal with these issues.
In this...