Decision trees and random forests
Tree-based models are very different from the types of models that we have discussed so far, but they are widely used and very powerful. You can think of a decision tree model as a series of if-then statements applied to your data. When you train this type of model, you are constructing a series of control flow statements that eventually allow you to classify records.
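To make the if-then view concrete, here is a minimal sketch of what a trained decision tree amounts to once it has been learned. The record type, the feature names x1 and x2, the 0.5 threshold, and the category values "a1" and "a2" are all hypothetical and chosen only for illustration. In practice, the training algorithm picks the splits from the data rather than having them written by hand.

```go
package main

import "fmt"

// record is a hypothetical observation with one numeric feature
// and one categorical feature.
type record struct {
	x1 float64 // numeric feature
	x2 string  // categorical feature
}

// classify is a hand-written stand-in for a trained decision tree.
// The threshold (0.5) and the category value ("a1") are assumptions
// made up for this example, not values learned from real data.
func classify(r record) string {
	if r.x1 < 0.5 {
		// Left branch: split again on the categorical feature.
		if r.x2 == "a1" {
			return "A"
		}
		return "B"
	}
	// Right branch: a leaf that always predicts class B.
	return "B"
}

func main() {
	records := []record{
		{x1: 0.2, x2: "a1"},
		{x1: 0.2, x2: "a2"},
		{x1: 0.9, x2: "a1"},
	}
	for _, r := range records {
		fmt.Printf("x1=%.1f x2=%s -> class %s\n", r.x1, r.x2, classify(r))
	}
}
```

Each nested if corresponds to one split in the tree, and each return corresponds to a leaf. A random forest would combine the votes of many such trees, each built from a different sample of the data.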
Decision trees are implemented in github.com/sjwhitworth/golearn and github.com/xlvector/hector, among others, and random forests are implemented in github.com/sjwhitworth/golearn, github.com/xlvector/hector, and github.com/ryanbressler/CloudForest, among others. We will utilize github.com/sjwhitworth/golearn again in our examples shown in the following section.
Overview of decision trees and random forests
Again, consider our classes A and B. In this case, suppose that we have one feature, x1, that ranges from 0.0 to 1.0, and we have another feature, x2, that is categorical and can...