
Mastering Machine Learning for Penetration Testing: Develop an extensive skill set to break self-learning systems using Python

Chiheb Chebbi
Rating: 4 out of 5 (4 ratings)
Paperback Jun 2018 276 pages 1st Edition
  • eBook: $35.99
  • Paperback: $43.99
  • Subscription: free trial, renews at $12.99 p/m

What do you get with a Packt Subscription?

Free for first 7 days. $15.99 p/m after that. Cancel any time!
  • Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
  • 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
  • Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
  • Thousands of reference materials covering every tech concept you need to stay up to date.

Chapter 2. Phishing Domain Detection

Social engineering is one of the most dangerous threats facing individuals and modern organizations alike. Phishing is a well-known, computer-based social engineering technique in which attackers use disguised email addresses as a weapon to target large companies. Because of the huge number of phishing emails received every day, companies are not able to detect all of them, which is why new techniques and safeguards are needed to defend against phishing. This chapter presents the steps required to build three different machine learning-based projects to detect phishing attempts, using cutting-edge Python machine learning libraries.

In this chapter, we will cover:

  • A social engineering overview
  • The steps for social engineering penetration testing
  • Building a real-time phishing attack detector using different machine learning models:
    • Phishing detection with logistic regression
    • Phishing detection with decision trees
    • Spam email detection with natural language processing (NLP...

Technical requirements


In this chapter, we are going to use the following Python libraries:

  • scikit-learn (requires Python ≥ 2.7 or ≥ 3.3)
  • NumPy  (≥ 1.8.2)
  • NLTK

If you have not installed them yet, please make sure that they are installed before moving forward with this chapter. You can find the code files at https://github.com/PacktPublishing/Mastering-Machine-Learning-for-Penetration-Testing/tree/master/Chapter02.

Social engineering overview


Social engineering, by definition, is the psychological manipulation of a person to get useful and sensitive information from them, which can later be used to compromise a system. In other words, criminals use social engineering to gain confidential information from people, by taking advantage of human behavior.

Social Engineering Engagement Framework

The Social Engineering Engagement Framework (SEEF) is a framework developed by Dominique C. Brack and Alexander Bahmram. It summarizes years of experience in information security and defending against social engineering. The stakeholders of the framework are organizations, governments, and individuals (personals). Social engineering engagement management goes through three steps: 

  1. Pre-engagement process: Preparing the social engineering operation
  2. During-engagement process: The engagement occurs
  3. Post-engagement process: Delivering a report

There are many social engineering techniques used by criminals:

  • Baiting: Convincing...

Steps of social engineering penetration testing


Penetration testing simulates a black hat hacker attack in order to evaluate the security posture of a company, so that the required safeguards can be deployed. Penetration testing is a methodical process that goes through well-defined steps. There are many types of penetration testing:

  • White box pentesting
  • Black box pentesting 
  • Grey box pentesting 

To perform a social engineering penetration test, you need to follow these steps:

Building real-time phishing attack detectors using different machine learning models


In the next sections, we are going to learn how to build machine learning phishing detectors. We will cover the following two methods:

  • Phishing detection with logistic regression
  • Phishing detection with decision trees

Phishing detection with logistic regression

In this section, we are going to build a phishing detector from scratch with a logistic regression algorithm. Logistic regression is a well-known statistical technique used to make binary predictions, that is, to assign each sample to one of two classes.
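To make the idea concrete, here is a minimal sketch of logistic regression written with the Python standard library only. It is not the chapter's code (which relies on scikit-learn); the function names, the learning rate, and the toy rows are assumptions made purely for illustration, and the toy rows are not taken from the UCI dataset.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(rows, labels, lr=0.1, epochs=500):
    """Fit weights and a bias with batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # derivative of the log loss w.r.t. the score
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (phishing) if the predicted probability is at least 0.5, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0

# Hypothetical toy rows: three features encoded as -1 or 1.
# In this made-up data, only the first feature separates the two classes.
TOY_ROWS = [
    [-1, -1, 1], [-1, 1, -1], [-1, 1, 1], [-1, -1, -1],
    [1, -1, 1], [1, 1, -1], [1, 1, 1], [1, -1, -1],
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = phishing, 0 = legitimate

weights, bias = train_logistic_regression(TOY_ROWS, TOY_LABELS)
predictions = [predict(weights, bias, row) for row in TOY_ROWS]
```

The two classes correspond to the two possible outputs of `predict`. On real data you would hold out a test set and measure accuracy on it rather than on the training rows.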

Like in every machine learning project, we will need data to feed our machine learning model. For our model, we are going to use the UCI Machine Learning Repository (Phishing Websites Data Set). You can check it out at https://archive.ics.uci.edu/ml/datasets/Phishing+Websites:

The dataset is provided as an arff file:

The following is a snapshot from the dataset:

For better manipulation, we have organized the dataset into a csv file:
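One possible way to perform such a conversion is sketched below with the standard library only. The helper names and the three-attribute sample are hypothetical; the sample is not an excerpt of the real dataset, and the parser only handles simple, dense ARFF files whose attribute names contain no spaces or quotes.

```python
import csv
import io

def arff_to_rows(arff_text):
    """Return (attribute_names, data_rows) from a simple, dense ARFF document."""
    names, rows, in_data = [], [], False
    for raw in arff_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@attribute"):
            names.append(line.split()[1])  # second token is the attribute name
        elif lower.startswith("@data"):
            in_data = True
    return names, rows

def rows_to_csv(names, rows):
    """Serialize a header row plus data rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical sample in ARFF syntax (not the real dataset).
SAMPLE_ARFF = """\
% Hypothetical three-attribute excerpt
@relation phishing_sample
@attribute has_ip { -1,1 }
@attribute url_length { 1,0,-1 }
@attribute result { -1,1 }
@data
-1,1,-1
1,0,1
"""

names, rows = arff_to_rows(SAMPLE_ARFF)
csv_text = rows_to_csv(names, rows)
```

For a real file, you would read the ARFF text from disk and write `csv_text` to a `.csv` file instead of keeping it in memory.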

As you probably...

NLP in-depth overview


NLP is the art of analyzing and understanding human languages by machines. According to many studies, more than 75% of the data in use is unstructured. Unstructured data does not have a predefined data model and is not organized in a predefined manner. Emails, tweets, daily messages, and even our recorded speeches are forms of unstructured data. NLP is a way for machines to analyze, understand, and derive meaning from natural language. NLP is widely used in many fields and applications, such as:

  • Real-time translation
  • Automatic summarization
  • Sentiment analysis
  • Speech recognition
  • Chatbots

Generally, there are two different components of NLP:

  • Natural Language Understanding (NLU): This refers to mapping input into a useful representation.
  • Natural Language Generation (NLG): This refers to transforming internal representations into useful representations. In other words, it is transforming data into written or spoken narrative. Written analysis for business intelligence dashboards...

Summary


In this chapter, we learned to detect phishing attempts by building three different projects from scratch. First, we discovered how to develop a phishing detector using two different machine learning techniques, thanks to cutting-edge Python machine learning libraries. The third project was a spam filter, based on NLP and Naive Bayes classification. In the next chapter, we will build various projects to detect malware, using different techniques and Python machine learning libraries.

Questions


We hope it was easy to go through this chapter. Now, as usual, it is practice time. Your job is to try building your own spam detection system. We will guide you through the questions.

In this chapter's GitHub repository, you will find a dataset collected from research done by I. Androutsopoulos, J. Koutsias, K.V. Chandrinos, George Paliouras, and C.D. Spyropoulos: An Evaluation of Naive Bayesian Anti-Spam Filtering. Proceedings of the workshop on Machine Learning in the New Information Age, G. Potamias, V. Moustakis and M. van Someren (eds.), 11th European Conference on Machine Learning, Barcelona, Spain, pp. 9-17, 2000.

You can now prepare the data:

  1. The following are some text-cleaning tasks to perform:
    • Clean your texts of stopwords, digits, and punctuation marks.
    • Perform lemmatization.
  2. Create a word dictionary, including their frequencies.
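As a starting point, the preparation steps above could be sketched as follows, using only the standard library. The stopword list is a deliberately tiny, hypothetical stand-in for a full list such as the one shipped with NLTK, the two emails are made up, and lemmatization is left out because it needs a lemmatizer such as the one NLTK provides.

```python
import re
import string
from collections import Counter

# Hypothetical, deliberately tiny stopword list (a real one is far larger).
STOPWORDS = {"a", "an", "the", "to", "is", "and", "of", "your", "you", "for", "in"}

def clean_text(text):
    """Lowercase, drop punctuation and digits, split on whitespace, remove stopwords."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)
    return [word for word in text.split() if word not in STOPWORDS]

def build_dictionary(emails):
    """Count how often each cleaned word occurs across all emails."""
    counts = Counter()
    for email in emails:
        counts.update(clean_text(email))
    return counts

# Made-up example emails, not taken from the dataset in the repository.
EMAILS = [
    "Win a FREE prize!!! Claim your prize in 24 hours.",
    "Meeting agenda for the project is attached.",
]

word_counts = build_dictionary(EMAILS)
top_word = word_counts.most_common(1)
```

A `Counter` is used here because it behaves like a dictionary of word frequencies and can report the most frequent words directly, which is handy when you later select features for a classifier.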

Note

In email texts, you will notice that the first line is the subject of the email and the third line is the body of the email (we only need the...


Key benefits

  • Identify ambiguities and breach intelligent security systems
  • Perform unique cyber attacks to breach robust systems
  • Learn to leverage machine learning algorithms

Description

Cyber security is crucial for both businesses and individuals. As systems are getting smarter, we now see machine learning disrupting computer security. With the adoption of machine learning in upcoming security products, it’s important for pentesters and security researchers to understand how these systems work, and to breach them for testing purposes. This book begins with the basics of machine learning and the algorithms used to build robust systems. Once you’ve gained a fair understanding of how security products leverage machine learning, you'll dive into the core concepts of breaching such systems. Through practical use cases, you’ll see how to find loopholes and bypass a self-learning security system. As you make your way through the chapters, you’ll focus on topics such as network intrusion detection and AV and IDS evasion. We’ll also cover the best practices when identifying ambiguities, and extensive techniques to breach an intelligent system. By the end of this book, you will be well versed in identifying loopholes in a self-learning security system and will be able to efficiently breach a machine learning system.

Who is this book for?

This book is for pen testers and security professionals who are interested in learning techniques to break an intelligent security system. Basic knowledge of Python is needed, but no prior knowledge of machine learning is necessary.

What you will learn

  • Take an in-depth look at machine learning
  • Get to know natural language processing (NLP)
  • Understand malware feature engineering
  • Build generative adversarial networks using Python libraries
  • Work on threat hunting with machine learning and the ELK stack
  • Explore the best practices for machine learning

Product Details

Publication date: Jun 27, 2018
Length: 276 pages
Edition: 1st
Language: English
ISBN-13: 9781788997409


Packt Subscriptions

See our plans and pricing

$12.99 billed monthly

  • Unlimited access to Packt's library of 6,500+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

$129.99 billed annually

  • Unlimited access to Packt's library of 6,500+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

$179.99 billed in 18 months

  • Unlimited access to Packt's library of 6,500+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
  • Exclusive print discounts

Frequently bought together


  • Mastering Machine Learning for Penetration Testing: $43.99
  • Learning Malware Analysis: $54.99
  • Hands-On Machine Learning for Cybersecurity: $48.99

Total: $147.97

Table of Contents

10 Chapters

  1. Introduction to Machine Learning in Pentesting
  2. Phishing Domain Detection
  3. Malware Detection with API Calls and PE Headers
  4. Malware Detection with Deep Learning
  5. Botnet Detection with Machine Learning
  6. Machine Learning in Anomaly Detection Systems
  7. Detecting Advanced Persistent Threats
  8. Evading Intrusion Detection Systems
  9. Bypassing Machine Learning Malware Detectors
  10. Best Practices for Machine Learning and Feature Engineering

Customer reviews

Rating distribution

Overall: 4 out of 5 (4 ratings)

  • 5 star: 50%
  • 4 star: 25%
  • 3 star: 0%
  • 2 star: 25%
  • 1 star: 0%

Houssem Dellai, Jul 23, 2018
Rating: 5 out of 5
Before reading a book, I search for the qualifications of the author. And clearly here he shows up his motivation and willing to share through international conferences and his second book!
Amazon verified review

priyabrata mohanty, Jan 13, 2019
Rating: 5 out of 5
Gives you a good guidance on where to start.
Amazon verified review

Alex, Oct 31, 2018
Rating: 4 out of 5
For me, this was a good introduction to Machine Learning with regards to Pentesting. Some interesting concepts and just enough information to whet your appetite to look further into ML. Recommend it for anyone with an interest to get a good base to start at.
Amazon verified review

sipy, Aug 14, 2018
Rating: 2 out of 5
While this book does an excellent job of describing many aspects of Machine Learning, it doesn't live up to its title of showing you how to apply M.L. techniques to Penetration Testing.

**HOWEVER** If this book was retitled: "M.L. for Dummies", I would give it 5 stars!!! Note that I'm not mocking the author or potential reader - it really is an **excellent** introduction for newcomers to the M.L. field - exactly what the "...for Dummies" book series is intended to be!

The vast majority of this book is spent describing concepts and algorithms for M.L., which I feel the author is very technically qualified to teach. His writing style is very approachable, and he is very, very adept at relating these complicated topics to newcomers in the M.L. field. In fact, I don't recall another source that can convey, in so few words, these deeply technical thoughts as easily as this book.

The closest thing in this book describing M.L. "for Pentesting" is probably the 40 pages of chapters 8 and 9, which mostly name-drop various tools that can be used to investigate perturbing training data to influence an M.L. model's classification capabilities. No mention is made in this section on how to use these tools and techniques for *Penetration* *Testing* - only for perturbing training data, which the author doesn't describe how to utilize for an actual Pentest-type of attack!

Chapter 10 may be the most undersold chapter in this book. It covers best practices for Feature Engineering - something that you *must* master in order to use M.L. The lucid descriptions in this chapter, alone, almost justified my purchase price!

Again, I think this is a great book written by a very competent author who has a gift for conveying very detailed technical matter in an easy-to-grasp way. His writing style is very approachable, and this book is an easy read for the newcomer to M.L.

In fact, if you are new to M.L., I recommend reading this book, first, to see if it is even something that you may want to partake of.

But - again - this book is *not* a "Machine Learning for Pentesting" book. It doesn't come close to hitting this mark, in my opinion.
Amazon verified review

FAQs

What is included in a Packt subscription?

A subscription provides you with full access to view all Packt and licensed content online, including exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription, simply go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. From there, you will see the ‘cancel subscription’ button in the grey box that contains your subscription information.

What are credits?

Credits can be earned by reading 40 sections of any title within the payment cycle (a month starting from the day of subscription payment). You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. Your credits can be found on the subscription homepage (subscription.packtpub.com) by clicking on the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need to have a paid-for or active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult: there are always new versions, new frameworks, and new techniques. This feature gives you a head start on our content as it's being created. With Early Access, you'll receive each chapter as it's written and get regular updates throughout the product's development, as well as the final course as soon as it's ready.

We created Early Access as a means of giving you the information you need as soon as it's available. As we go through the process of developing a course, 99% of it can be ready, but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.