Using ridge regression to overcome linear regression's shortfalls
In this recipe, we'll learn about ridge regression. It differs from vanilla linear regression in that it introduces a regularization parameter that shrinks the coefficients. This is useful when the dataset has collinear features.
Note
Ridge regression is actually so powerful in the presence of collinearity that you can model polynomial features: the vectors x, x^2, x^3, ..., which are highly collinear.
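For instance, here is a minimal sketch (not from the recipe itself) of ridge regression fit on the collinear columns x, x^2, x^3; the data-generating details are illustrative assumptions:

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
x = rng.uniform(0, 1, size=100)
X = np.column_stack([x, x**2, x**3])  # highly collinear polynomial features
y = 2 * x + 0.5 * x**3 + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
print(ridge.coef_)  # coefficients stay moderate despite the collinearity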
Getting ready
Let's load a dataset that has a low effective rank and compare ridge regression with linear regression by way of the coefficients. If you're not familiar with rank, it's the smaller of the number of linearly independent columns and the number of linearly independent rows of a matrix. One of the assumptions of linear regression is that the data matrix is of full rank.
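As a quick aside (not part of the recipe), NumPy can show how a linearly dependent row lowers the rank:

import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],   # exactly twice the first row
              [1., 0., 1.]])

# Only two rows (and columns) are linearly independent, so the rank is 2.
print(np.linalg.matrix_rank(A))  # 2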
How to do it...
- First, use make_regression to create a simple dataset with three predictors, but an effective_rank of 2. Effective rank means that, although technically the matrix is full rank, many of the columns are nearly linearly dependent, as sketched below.
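A minimal sketch of this step (the exact n_samples, noise, and random_state values are illustrative assumptions, not the recipe's):

from sklearn.datasets import make_regression

# Three predictors, but an effective rank of 2: the columns are almost
# linearly dependent even though the matrix is technically full rank.
X, y = make_regression(n_samples=100, n_features=3,
                       effective_rank=2, noise=5, random_state=0)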