Using truncated SVD to reduce dimensionality
Truncated SVD is a matrix factorization technique that factors a matrix M into the three matrices U, Σ, and V. It is closely related to PCA, except that SVD factorizes the data matrix itself, whereas PCA factorizes the covariance matrix. In practice, SVD is often used under the hood to find the principal components of a matrix.
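The relationship between SVD and PCA can be seen numerically. The following sketch (using a small made-up data matrix) computes the SVD of the centered data and checks that the squared singular values, divided by n - 1, match the eigenvalues of the covariance matrix that PCA would diagonalize:

```python
import numpy as np

# A small made-up data matrix: 4 samples, 3 features
M = np.array([[1.0, 2.0, 0.5],
              [3.0, 1.0, 2.0],
              [0.0, 4.0, 1.5],
              [2.0, 2.5, 3.0]])

# Center the data, then take the SVD of the data matrix itself
Mc = M - M.mean(axis=0)
U, S, Vt = np.linalg.svd(Mc, full_matrices=False)

# PCA instead diagonalizes the covariance matrix
cov = np.cov(Mc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# The squared singular values / (n - 1) equal the covariance eigenvalues
print(S**2 / (M.shape[0] - 1))
print(eigvals[::-1])
```

The rows of Vt are the same directions (up to sign) as the principal components PCA recovers, which is why SVD can serve as the engine behind PCA.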
Getting ready
Truncated SVD differs from regular SVD in that it produces a factorization where the number of columns is equal to the specified truncation. For example, given an n x n matrix, regular SVD will produce matrices with n columns, whereas truncated SVD will produce matrices with only the number of columns you specify. This is how the dimensionality is reduced. Here, we'll again use the iris dataset so that you can compare this outcome against the PCA outcome:
from sklearn.datasets import load_iris
iris = load_iris()
iris_X = iris.data
y = iris.target
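The column-count difference described above can be checked directly on the iris data. This sketch compares the shapes produced by a full SVD against those produced by scikit-learn's TruncatedSVD with two components:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import TruncatedSVD

iris_X = load_iris().data  # shape (150, 4)

# Full SVD: the V matrix spans all four feature dimensions
U, S, Vt = np.linalg.svd(iris_X, full_matrices=False)
print(Vt.shape)  # (4, 4)

# Truncated SVD keeps only the requested number of components
svd = TruncatedSVD(n_components=2)
reduced = svd.fit_transform(iris_X)
print(svd.components_.shape)  # (2, 4)
print(reduced.shape)         # (150, 2)
```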
How to do it...
This object follows the same...
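Assuming the object referred to here is scikit-learn's TruncatedSVD transformer, a minimal sketch of the usual fit/transform workflow on the iris data loaded above:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import TruncatedSVD

iris_X = load_iris().data

# Fit learns the top components; transform projects the data onto them
svd = TruncatedSVD(n_components=2)
svd.fit(iris_X)
iris_transformed = svd.transform(iris_X)
print(iris_transformed.shape)  # (150, 2)
```

Like PCA and other scikit-learn transformers, the two steps can also be combined into a single fit_transform call.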