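R-squared (or the coefficient of determination) is a statistical measure that assesses the goodness of fit of a regression model. It helps data analysts explain how well a model performs compared to a baseline model that always predicts the mean of the data. Its value usually lies between 0 and 1. A value near 0 indicates a poor fit, while a value near 1 indicates a close fit, and exactly 1 is a perfect fit. Sometimes R-squared is negative, for example when it is computed on held-out data or for a model fitted without an intercept. This means the model performs worse than the mean baseline. We can express R-squared using the following formula:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

The two forms agree when $SST = SSR + SSE$, which holds for a least-squares fit with an intercept. Only the second form can produce a negative value.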
Let's understand all the components one by one:
- Sum of Squares Regression (SSR): This is the sum of the squared differences between each forecasted value and the mean of the data.
- Sum of Squared Errors (SSE): This is the sum of the squared differences between each original (actual) value and its forecasted value.
- Total Sum of Squares (SST): This is the sum of the squared differences between each original (actual) value and the mean of the data.
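
The three components can be computed directly from paired actual and forecasted values. The following is a minimal sketch in plain Python, not a library implementation; the function name `r_squared` and the sample values are assumptions made up for illustration. It uses the `1 - SSE / SST` form, which is the form that can go negative.

```python
def r_squared(actual, predicted):
    """Return (ssr, sse, sst, r2) for paired actual and predicted values.

    Hypothetical helper for illustration; r2 is computed as 1 - SSE / SST.
    """
    mean_actual = sum(actual) / len(actual)
    # SSR: squared distance of each forecast from the mean of the data
    ssr = sum((p - mean_actual) ** 2 for p in predicted)
    # SSE: squared distance of each actual value from its forecast
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # SST: squared distance of each actual value from the mean of the data
    sst = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - sse / sst
    return ssr, sse, sst, r2


# Made-up sample data (an assumption, not taken from any real dataset)
actual = [3.0, 5.0, 7.0, 9.0]
good_forecast = [2.5, 5.5, 7.5, 8.5]
bad_forecast = [9.0, 7.0, 5.0, 3.0]  # reversed trend, worse than the mean

ssr, sse, sst, r2_good = r_squared(actual, good_forecast)
_, _, _, r2_bad = r_squared(actual, bad_forecast)

print(f"SSR={ssr}, SSE={sse}, SST={sst}, R-squared={r2_good:.2f}")
print(f"R-squared for the bad forecast: {r2_bad:.2f}")
```

Here the first forecast gives an R-squared of about 0.95, while the reversed forecast gives a negative value because its errors are larger than those of simply predicting the mean. Note that SSR + SSE does not equal SST for these hand-picked forecasts, since they are not a least-squares fit.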