Performing feature selection with FSelector
The FSelector package provides two approaches to selecting the most influential features from the original feature set. The first ranks features by a chosen criterion and keeps those that score above a defined threshold. The second searches the space of feature subsets for an optimal subset. In this recipe, we will introduce how to perform feature selection with the FSelector package.
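As a rough illustration of the two approaches, the following sketch ranks features and applies a cutoff, then runs a subset search. It assumes the FSelector package is already installed, and it uses the built-in iris data set purely as a stand-in for the churn data; the variable names (weights, top_two, subset_found) are illustrative only.

```r
library(FSelector)

# Assumption: iris stands in for the churn data so the sketch is self-contained.
data(iris)

# Approach 1: rank every feature by a criterion (here, information gain),
# then keep only the highest-ranked ones.
weights <- information.gain(Species ~ ., data = iris)
top_two <- cutoff.k(weights, 2)  # names of the two highest-weighted features
ranked_formula <- as.simple.formula(top_two, "Species")

# Approach 2: search the space of feature subsets.
# cfs() performs a correlation-based search and returns the chosen subset.
subset_found <- cfs(Species ~ ., data = iris)
subset_formula <- as.simple.formula(subset_found, "Species")

print(weights)
print(ranked_formula)
print(subset_formula)
```

Both approaches end with a formula that can be passed to a model-fitting function. The ranking approach leaves the choice of threshold to you, while the search approach decides the subset size itself.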
Getting ready
In this recipe, we will continue to use the telecom churn dataset as the input data source to train the support vector machine. For those who have not prepared the dataset, please refer to Chapter 7, Classification 1 - Tree, Lazy, and Probabilistic, for detailed information.
How to do it...
Perform the following steps to select features from the churn dataset:
- First, install and load the FSelector package:
> install.packages("FSelector")
> library(FSelector)
- Then, we can use random.forest.importance to calculate the weight for each attribute...