Plotting correlation between price and other features
Now that the initial exploratory analysis is done, we have a better idea of how the different variables are contributing to the price of each house. However, we have no idea of the importance of each variable when it comes to predicting prices. Since we have 21 variables, it becomes difficult to build models by incorporating all variables in one single model. Therefore, some variables may need to be discarded or neglected if they have lesser significance than other variables.
Getting ready
Correlation coefficients are used in statistics to measure how strong the relationship is between two variables. In particular, Pearson's correlation coefficient is the most commonly used coefficient while performing linear regression. The correlation coefficient usually takes on a value between -1 and +1:
- A correlation coefficient of 1 means that for every positive increase in one variable, there is a positive increase of a fixed proportion in the other...