Introducing the Naïve Bayes classifier
Naïve Bayes is probably one of the most elegant machine learning algorithms of practical use. And despite its name, it is not that naïve when you look at its classification performance. It proves to be quite robust to irrelevant features, which it kindly ignores. It learns fast, predicts equally fast, and does not require much storage. So, why is it called naïve?
The "naïve" was added to account for one assumption that is required for Naïve Bayes to work optimally: that the features are independent of each other given the class. This, however, is rarely the case for real-world applications. Nevertheless, it still returns very good accuracy in practice, even when the independence assumption does not hold.
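The independence assumption can be made concrete with a minimal sketch. The toy weather data, feature names, and smoothing details below are illustrative assumptions, not part of any particular library; the point is only that the class-conditional likelihood is approximated as a product of per-feature likelihoods:

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (features, class label) pairs.
train = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
]

class_counts = Counter(label for _, label in train)
feature_counts = defaultdict(Counter)  # (label, feature name) -> value counts
for features, label in train:
    for name, value in features.items():
        feature_counts[(label, name)][value] += 1

def posterior(features, label):
    # Naïve assumption: P(features | class) factors into a product of
    # per-feature terms, so the score is P(class) * prod_i P(x_i | class).
    # Add-one (Laplace) smoothing keeps unseen values from zeroing the product.
    prob = class_counts[label] / len(train)
    for name, value in features.items():
        counts = feature_counts[(label, name)]
        prob *= (counts[value] + 1) / (sum(counts.values()) + len(counts) + 1)
    return prob

scores = {label: posterior({"outlook": "sunny", "windy": "no"}, label)
          for label in class_counts}
print(max(scores, key=scores.get))  # the class with the highest score wins
```

Note that treating each feature separately is exactly what makes the counting cheap: the model never has to estimate the joint distribution over all feature combinations.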
Getting to know the Bayes theorem
At its core, the Naïve Bayes classification is nothing more than keeping track of which feature gives evidence to which class. The way the features are designed determines the model that is used to learn. The so...