Using kernel PCA for nonlinear dimensionality reduction
Most statistical techniques are linear by nature, so capturing nonlinear structure requires some transformation. PCA is, of course, a linear transformation. In this recipe, we'll look at applying a nonlinear transformation first, and then applying PCA for dimensionality reduction.
Getting ready
Life would be so easy if data were always linearly separable, but unfortunately, it's not. Kernel PCA can help to circumvent this issue. The data is first run through a kernel function that projects it into a different (typically higher-dimensional) space; PCA is then performed in that space.
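To make this concrete, here is a minimal sketch of the idea on a classic toy dataset of two concentric rings. The RBF kernel, the gamma value, and the make_circles data are illustrative assumptions of ours, not this recipe's settings:

from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA, PCA

# Toy data: two concentric rings that no linear projection can separate.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Plain PCA is linear, so the rings remain entangled after projection.
X_pca = PCA(n_components=2).fit_transform(X)

# Kernel PCA runs the data through an RBF kernel first, then performs PCA;
# in the kernel-induced space the rings become separable. The kernel and
# gamma here are illustrative choices, not this recipe's settings.
X_kpca = KernelPCA(n_components=2, kernel='rbf', gamma=10).fit_transform(X)
print(X_kpca.shape)  # (200, 2)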
To familiarize yourself with kernel functions, it is a good exercise to think about how to generate data that is separable by each of the kernels available in kernel PCA. Here, we'll do that with the cosine kernel. This recipe involves a bit more theory than the previous ones.
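As one possible construction (our own toy example; the recipe itself doesn't prescribe this data): the cosine kernel compares the angle between two samples and ignores their magnitude, so two classes pointing in different directions are separable by it even when their lengths overlap:

import numpy as np

rng = np.random.default_rng(0)
n = 100

# Magnitudes overlap across the two classes, so a linear view struggles,
# but the angular directions differ, which is exactly what cosine sees.
radii = rng.uniform(1, 10, size=n)
angles_a = rng.normal(np.pi / 4, 0.15, size=n)    # class A near +45 degrees
angles_b = rng.normal(-np.pi / 4, 0.15, size=n)   # class B near -45 degrees

A = np.column_stack([radii * np.cos(angles_a), radii * np.sin(angles_a)])
B = np.column_stack([radii * np.cos(angles_b), radii * np.sin(angles_b)])
X = np.vstack([A, B])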
Before starting, load the iris dataset:
from sklearn import datasets, decomposition
iris = datasets.load_iris()
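As a quick look ahead (a minimal sketch; the kernel='cosine' and n_components=2 settings are illustrative choices we make here, and the recipe walks through this in detail):

# Cosine-kernel PCA down to two components; parameters are illustrative.
kpca = decomposition.KernelPCA(kernel='cosine', n_components=2)
iris_reduced = kpca.fit_transform(iris.data)
print(iris_reduced.shape)  # (150, 2)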