Multicollinearity can be detected using the following:
- The correlation coefficient (or correlation matrix) between independent variables
- Variance Inflation Factor (VIF)
- Eigenvalues
Correlation coefficients or correlation matrices will help us to identify a high correlation between independent variables. Using the correlation coefficient, we can easily detect the multicollinearity by checking the correlation coefficient magnitude:
# Import pandas import pandas as pd
# Read the blood pressure dataset data = pd.read_csv("bloodpress.txt",sep='\t')
# See the top records in the data data.head()
This results in the following output:
In the preceding code block, we read the bloodpress.txt data using the read_csv() function. We also checked the initial records of the dataset. This dataset has BP, Age, Weight, BSA, Dur, Pulse, and Stress fields. Let's check the multicollinearity in the dataset using the correlation matrix:
# Import seaborn...