Bypassing machine learning with reinforcement learning
In the previous technique, we saw that generating adversarial samples raises several issues, especially when the outputs are binaries; one of them is that the generated samples can be invalid. Information security researchers have come up with a newer technique that uses reinforcement learning to bypass machine learning anti-malware systems.
Reinforcement learning
Previously (especially in the first chapter), we explored the different types of machine learning models: supervised, semi-supervised, unsupervised, and reinforcement models. Reinforcement learning is an important approach to building intelligent machines. In reinforcement learning, an agent learns through experience by interacting with an environment; at each step, it chooses the action it expects to be best, given the current state and a reward function.
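To make the agent-environment loop concrete, here is a minimal sketch. It assumes a toy environment (a five-cell corridor with the goal in the last cell) and tabular Q-learning as the learning rule; neither comes from the text above, and the names `step`, `train`, and `greedy_policy`, as well as the reward values and hyperparameters, are illustrative choices only.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Toy environment: apply an action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small penalty for every non-final move
    return next_state, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values (Q) by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise exploit what is known.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-goal cell."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

After training, the greedy policy moves right in every cell, which is the shortest path to the goal. The agent is never told this; it infers it from the rewards it collects while interacting with the environment.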

A famous example of reinforcement learning is an AI agent that learns to play Atari Breakout. In this case, the environment includes the following:
- The ball and the bricks
- The moving...