Mastering Machine Learning for Penetration Testing: Develop an extensive skill set to break self-learning systems using Python

Product type Paperback

Published in Jun 2018

Publisher Packt

ISBN-13 9781788997409

Length 276 pages

Edition 1st Edition

Languages

Python

Concepts

Machine Learning

Author (1):

Chiheb Chebbi

Chapter 10 – Best Practices for Machine Learning and Feature Engineering

What is the difference between feature engineering and feature selection?

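Feature engineering is the broader process of creating, transforming, and preparing features from raw data. Feature selection is one part of it: choosing the subset of features that is most useful to the model.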

What is the difference between principal component analysis (PCA) and feature selection?

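Feature selection keeps a subset of the original features, chosen because they are the most informative, and leaves their values unchanged. PCA is a dimensionality reduction method: it builds new features (the principal components) as linear combinations of the original ones.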
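The contrast can be sketched in plain Python. This is only an illustration under stated assumptions: the data is made up, highest variance stands in for a real selection criterion, and the PCA helper handles exactly two features using the closed-form eigenvector of a 2x2 covariance matrix. All function names are hypothetical.

```python
import math

def column(rows, j):
    """Return feature j of every sample."""
    return [row[j] for row in rows]

def variance(values):
    """Sample variance (n - 1 denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def select_highest_variance(rows):
    """Feature selection: keep one ORIGINAL column, values unchanged."""
    best = max(range(len(rows[0])), key=lambda j: variance(column(rows, j)))
    return best, column(rows, best)

def first_principal_component(rows):
    """PCA for two features: build ONE NEW feature from both columns."""
    xs, ys = column(rows, 0), column(rows, 1)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cx = [x - mean_x for x in xs]
    cy = [y - mean_y for y in ys]
    n = len(rows) - 1
    # Entries of the covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x in cx) / n
    c = sum(y * y for y in cy) / n
    b = sum(x * y for x, y in zip(cx, cy)) / n
    # Largest eigenvalue and a matching eigenvector of the 2x2 matrix.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b != 0:
        vx, vy = b, lam - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # Each score is a linear combination of both centered features.
    scores = [x * vx + y * vy for x, y in zip(cx, cy)]
    return (vx, vy), scores

# Made-up samples with two correlated features.
rows = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0)]

kept_index, kept_values = select_highest_variance(rows)
direction, scores = first_principal_component(rows)

print("selected original feature:", kept_index, kept_values)
print("principal direction:", direction)
print("new PCA feature:", scores)
```

The selected column is copied straight from the dataset, while the PCA scores are values that appear nowhere in the original data.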

How can we encode features like dates and hours?

One technique is to add the (sine, cosine) transformation of the time-of-day variable. Because hours are cyclical, this encoding places 23:00 next to 00:00, which the raw hour value does not.
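A minimal sketch of the sine and cosine encoding, assuming the hour is given as a number from 0 to 23. The function names and the `period` parameter are illustrative choices, not taken from the book.

```python
import math

def encode_hour(hour, period=24):
    """Map an hour onto the unit circle as a (sine, cosine) pair."""
    angle = 2 * math.pi * (hour % period) / period
    return math.sin(angle), math.cos(angle)

def cyclic_distance(hour_a, hour_b):
    """Euclidean distance between two encoded hours."""
    sin_a, cos_a = encode_hour(hour_a)
    sin_b, cos_b = encode_hour(hour_b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)

# On the raw scale, 23 and 0 are 23 units apart.
# After encoding, they are as close as any other pair of adjacent hours.
print(cyclic_distance(23, 0))
print(cyclic_distance(0, 1))
print(cyclic_distance(0, 12))
```

The same idea applies to other cyclical values, such as the day of the week with `period=7`.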

Why is it useful to print out training and testing accuracy?

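Comparing the two metrics helps detect overfitting: a training accuracy that is much higher than the testing accuracy suggests that the model has memorized the training data instead of generalizing.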
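The comparison can be illustrated with a deliberately overfitting toy model. Everything here is an assumption made for the sketch: the `MemorizingClassifier` class, the made-up (URL length, number of dots) samples, and the labels (1 for phishing, 0 for legitimate).

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

class MemorizingClassifier:
    """Toy model that memorizes training samples.

    Unseen samples receive the most frequent training label.
    """

    def fit(self, X, y):
        self.table = dict(zip(X, y))
        self.default = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.table.get(x, self.default) for x in X]

# Made-up samples: (url_length, num_dots).
X_train = [(54, 3), (61, 4), (70, 5), (18, 1), (22, 1)]
y_train = [1, 1, 1, 0, 0]
X_test = [(20, 2), (66, 4), (25, 1), (30, 2)]
y_test = [0, 1, 0, 0]

model = MemorizingClassifier().fit(X_train, y_train)
train_acc = accuracy(y_train, model.predict(X_train))
test_acc = accuracy(y_test, model.predict(X_test))

print(f"training accuracy: {train_acc:.2f}")
print(f"testing accuracy:  {test_acc:.2f}")
print(f"gap:               {train_acc - test_acc:.2f}")
```

A large gap between the two numbers is the signal to look for; with a real model, the usual responses are more data, a simpler model, or regularization.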

How can we deploy a machine learning model and use it in a product?

There are many ways to take a machine learning model to production, such as exposing it as a web service or packaging it in a container. The right choice depends on your model: is it used online or offline, and is it a deep learning model, an SVM, or a Naive Bayes classifier?
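As a rough sketch of the web service option, the function below shows the kind of request handler that a web framework would wrap. It is not the book's deployment method. The `predict` function is a hypothetical stand-in for a trained model, and the feature names, weights, and JSON layout are assumptions made for the example.

```python
import json

def predict(features):
    """Hypothetical stand-in for a trained model's prediction."""
    score = 0.8 * features["num_dots"] + 0.02 * features["url_length"]
    return "phishing" if score > 3.0 else "legitimate"

def handle_request(body):
    """Turn a JSON request body into an (HTTP status, JSON response) pair."""
    try:
        payload = json.loads(body)
        label = predict(payload["features"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing fields, or wrongly typed values.
        return 400, json.dumps({"error": "invalid request"})
    return 200, json.dumps({"label": label})

request_body = json.dumps({"features": {"num_dots": 5, "url_length": 80}})
status, response = handle_request(request_body)
print(status, response)
```

Keeping the handler independent of any framework makes it easy to test on its own and to place behind whichever server or container setup the product uses.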

Why does feature engineering take much more time than other steps?

Because analyzing, cleaning, and processing features takes more time than building the model.

What is the role of a dummy variable?

A dummy variable is a numerical variable, usually taking the value 0 or 1, used in regression analysis to represent subgroups of the sample in your study. In research design, a dummy variable is often used to distinguish between different treatment groups.
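A short sketch of how a categorical feature can be turned into dummy variables. The `dummy_encode` helper and the protocol values are hypothetical. Dropping the first category is one common convention: that category becomes the baseline, represented by all zeros.

```python
def dummy_encode(values, drop_first=True):
    """Encode categories as 0/1 columns, one column per kept category."""
    categories = sorted(set(values))
    kept = categories[1:] if drop_first else categories
    rows = [[1 if value == category else 0 for category in kept]
            for value in values]
    return kept, rows

# Made-up categorical feature: network protocol of each sample.
protocols = ["tcp", "udp", "icmp", "tcp"]
columns, encoded = dummy_encode(protocols)

print(columns)
for protocol, row in zip(protocols, encoded):
    print(protocol, row)
```

With `drop_first=False`, every category gets its own column, which is the usual one-hot layout.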