Chapter 3. Segregating legitimate and lousy URLs
URL stands for uniform resource locator. A URL is essentially the address of a web page on the World Wide Web, and it is usually displayed in the web browser's address bar. A uniform resource locator conforms to the following address scheme:
scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]

A URL can use either the Hypertext Transfer Protocol (http) or the Hypertext Transfer Protocol Secure (https). Other types of protocols include the File Transfer Protocol (ftp), the Simple Mail Transfer Protocol (SMTP), and others such as Telnet, DNS, and so on. A URL also consists of the top-level domain, hostname, path, and port information.
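
To make the address scheme concrete, the following minimal sketch splits a URL into the components named above using Python's standard urllib.parse module. The function name describe_url, the example URL, and its credentials are hypothetical and chosen only for illustration. The top-level domain is taken naively as the last dot-separated label of the hostname, which is an assumption that does not hold for multi-part suffixes such as co.uk.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts of the generic address scheme.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname  # lowercased hostname, or None if absent
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": host,
        "port": parts.port,  # int, or None if no port is given
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
        # Naive assumption: the TLD is the last label of the hostname.
        "tld": host.rsplit(".", 1)[-1] if host and "." in host else None,
    }


# Made-up example URL that exercises every component of the scheme.
example = "https://alice:secret@www.example.com:8443/docs/index.html?lang=en#intro"
info = describe_url(example)

for name, value in info.items():
    print(f"{name:>8}: {value}")
```

Returning a plain dictionary keeps each component addressable by name, which may be convenient later when individual parts of a URL are turned into features for telling legitimate URLs from lousy ones.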