Machine learning in penetration testing - promises and challenges
Machine learning now plays a part in many modern software projects. Combining mathematics with modern optimization techniques and tools can produce impressive results. Applying machine learning and analytics to information security is a step forward in defending against advanced real-world attacks and threats.
Attackers constantly adopt new, sophisticated techniques against modern organizations. As security professionals, we therefore need to stay up to date and deploy the safeguards required to protect assets. Researchers have published many proposals for defensive systems based on machine learning techniques. For example, the following are some information security applications, grouped by learning model:
- Supervised learning:
  - Network traffic profiling
  - Spam filtering
  - Malware detection
- Semi-supervised learning:
  - Network anomaly detection
  - C2 detection
- Unsupervised learning:
  - User behavior analytics
  - Insider threat detection
  - Malware family identification
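To make the supervised case concrete, the following is a minimal sketch of a spam filter built as a naive Bayes classifier in plain Python. The training messages, the labels `spam` and `ham`, and the function names `train` and `classify` are all made up for illustration; a real filter would need a far larger labeled dataset and proper text preprocessing.

```python
import math
from collections import Counter

# Tiny, made-up labeled dataset: (message, label). Illustrative only.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free reward", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(samples):
    """Count words per label and the number of messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability for the message."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_msgs = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        # Start from the prior probability of the label.
        score = math.log(label_counts[label] / total_msgs)
        for word in text.lower().split():
            # Laplace (add-one) smoothing avoids log(0) for unseen words.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    model = train(TRAINING)
    print(classify("free prize now", *model))
    print(classify("project meeting on monday", *model))
```

The point of the sketch is the workflow rather than the model: labeled examples go in, and the classifier generalizes to messages it has not seen. The semi-supervised and unsupervised applications differ mainly in how much labeled data is available.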
These applications help protect the valuable assets of modern organizations. However, attackers no longer rely only on classic techniques: machine learning is increasingly being applied to offensive systems as well as defensive ones. Building defensive layers with artificial intelligence and machine learning alone is therefore not enough; penetration testers also need to understand how these techniques can be leveraged to carry out attacks. Adding offensive machine learning tools to your pentesting arsenal is very useful for simulating cutting-edge attacks. While many of these offensive applications are still research projects, we will try to build our own, to get a glimpse of how attackers build offensive tools to target modern companies. You may be able to use them later in your own penetration testing engagements.
Deep Exploit
Several publicly available tools have appeared recently that use machine learning to take penetration testing to another level. One of these tools is Deep Exploit, which was presented at the Black Hat 2018 conference. It is a fully automated penetration testing tool linked with Metasploit, and it uses reinforcement learning (self-learning).

Deep Exploit can perform the following tasks:
- Intelligence gathering
- Threat modeling
- Vulnerability analysis
- Exploitation
- Post-exploitation
- Reporting
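The self-learning idea behind such a tool can be illustrated with a toy example. The sketch below is not Deep Exploit's A3C model and does not talk to Metasploit; it is a much simpler epsilon-greedy bandit, swapped in to show how an agent can learn by trial and error which action tends to succeed. The module names (`exploit_a`, `exploit_b`, `exploit_c`) and their success rates are invented, and every "attempt" is simulated with a random number.

```python
import random

# Hypothetical modules with made-up success probabilities.
# The agent never reads these values; it only observes success or failure.
SIMULATED_SUCCESS_RATE = {
    "exploit_a": 0.10,
    "exploit_b": 0.60,
    "exploit_c": 0.30,
}


def simulated_attempt(action, rng):
    """Stand-in for launching a module: reward 1 on success, otherwise 0."""
    return 1 if rng.random() < SIMULATED_SUCCESS_RATE[action] else 0


def train_bandit(episodes=2000, epsilon=0.1, seed=0):
    """Estimate the success rate of each action by trial and error."""
    rng = random.Random(seed)
    actions = list(SIMULATED_SUCCESS_RATE)
    value = {a: 0.0 for a in actions}  # running mean reward per action
    count = {a: 0 for a in actions}    # number of times each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)          # explore a random action
        else:
            action = max(actions, key=value.get)  # use the best estimate so far
        reward = simulated_attempt(action, rng)
        count[action] += 1
        # Incremental mean update: new = old + (reward - old) / n
        value[action] += (reward - value[action]) / count[action]
    return value, count


if __name__ == "__main__":
    value, count = train_bandit()
    for action in value:
        print(action, round(value[action], 2), count[action])
```

Over many episodes the agent concentrates its attempts on the action with the highest observed reward, while the `epsilon` parameter keeps it occasionally trying the others. A full reinforcement learning system adds state (for example, what is known about the target) and a learned policy on top of this basic loop.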
To download Deep Exploit, visit its official GitHub repository: https://github.com/13o-bbr-bbq/machine_learning_security/tree/master/DeepExploit
It consists of a machine learning model (A3C, the Asynchronous Advantage Actor-Critic algorithm) and Metasploit. The following is a high-level overview of the Deep Exploit architecture:

The following environment is required for Deep Exploit to work properly:
- Kali Linux 2017.3 (guest OS on VMware)
  - Memory: 8.0GB
  - Metasploit framework 4.16.15-dev
- Windows 10 Home 64-bit (Host OS)
  - CPU: Intel(R) Core(TM) i7-6500U 2.50GHz
  - Memory: 16.0GB
- Python 3.6.1 (Anaconda3)
- TensorFlow 1.4.0
- Keras 2.1.2
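Before running the tool, it can help to confirm that the Python packages match the versions listed above. The following is a minimal sketch of such a check; the `REQUIRED` dictionary and the helper names `installed_version` and `compare` are illustrative and not part of Deep Exploit. It assumes `importlib.metadata` (Python 3.8 and later) and falls back to the `importlib_metadata` backport, which would need to be installed separately on Python 3.6.

```python
import sys

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is installed on older Pythons.
    from importlib_metadata import PackageNotFoundError, version

# Versions taken from the environment list above.
REQUIRED = {"tensorflow": "1.4.0", "keras": "2.1.2"}
REQUIRED_PYTHON = (3, 6)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare(required, lookup=installed_version):
    """Return {package: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for package, wanted in required.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    if sys.version_info[:2] != REQUIRED_PYTHON:
        print("Python", sys.version.split()[0], "differs from the listed 3.6.x")
    for package, (wanted, found) in compare(REQUIRED).items():
        print(f"{package}: wanted {wanted}, found {found}")
```

The `lookup` parameter exists only so the comparison logic can be exercised without the packages installed. Newer package versions may also work, but that is not something the listed environment confirms.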