Regularization and classification
The regularization techniques applied above also work for classification problems, both binomial and multinomial. Therefore, before concluding this chapter, let's apply some sample code to a logistic regression problem, specifically the breast cancer data from the prior chapter. As in regression with a quantitative response, regularization can be an important technique when working with high-dimensional data sets.
Logistic regression example
Recall that, in the breast cancer data we analyzed, the probability of a tumor being malignant can be expressed with the logistic function as follows:
$$P(\text{malignant}) = \frac{1}{1 + e^{-(B_0 + B_1X_1 + \dots + B_nX_n)}}$$
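As a minimal sketch of the logistic function above, the following R snippet computes the probability from a linear component and then shows the two quantities that L1 and L2 regularization would penalize. The coefficient and feature values are hypothetical and chosen only for illustration; they are not estimates from the breast cancer data.

```r
# Hypothetical values, for illustration only (not fitted to the biopsy data)
b0 <- -1.5            # intercept (B0)
b  <- c(0.8, 0.3)     # slope coefficients (B1, B2)
x  <- c(2, 1)         # feature values (X1, X2)

# Linear component of the model
eta <- b0 + sum(b * x)

# Logistic transform of the linear component
p_manual  <- 1 / (1 + exp(-eta))
p_builtin <- plogis(eta)   # base R (stats) equivalent of the line above

# Quantities penalized by regularization; the intercept is left out
l1_penalty <- sum(abs(b))  # L1 (lasso): sum of absolute coefficients
l2_penalty <- sum(b^2)     # L2 (ridge): sum of squared coefficients
```

Because both penalties depend only on the coefficients of the linear component, they carry over from the quantitative-response case without modification.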
Since the function contains a linear component in the coefficients, L1 and L2 regularization can be applied to it. To demonstrate this, let's load and prepare the breast cancer data as we did in the previous chapter:
> library(MASS)
> biopsy$ID = NULL
> names(biopsy) = c("thick", "u.size", "u.shape", "adhsn", "s.size", "nucl", "chrom...