Data exploration and linear regression in practice
In this section, we will take one of the best-known toy datasets, explore it, and select one of its dimensions to learn how to build a linear regression model for its values.
Let's start by importing all the libraries (NumPy, scikit-learn, seaborn, and matplotlib). One of the excellent features of seaborn is its ability to define very professional-looking style settings; in this case, we will use the whitegrid style:
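import numpy as np
from sklearn import datasets
import seaborn as sns
# IPython/Jupyter magic: render plots inside the notebook
%matplotlib inline
import matplotlib.pyplot as plt
# Apply the whitegrid style with notebook-sized fonts and lines
sns.set(style='whitegrid', context='notebook')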
The Iris dataset
It’s time to load the Iris dataset. This is one of the most well-known historical datasets. You will find it in many books and publications. Given the good properties of the data, it is useful for classification and regression examples. The Iris dataset (https://archive.ics.uci.edu/ml/datasets/Iris) contains 50 records for each of the...
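As a minimal sketch of the loading step, assuming the copy of Iris bundled with scikit-learn is an acceptable stand-in for the UCI file linked above, the dataset could be loaded and inspected as follows. The variable names X and y are illustrative choices, not names taken from the text.

```python
import numpy as np
from sklearn import datasets

# Load the copy of Iris that ships with scikit-learn (no download needed)
iris = datasets.load_iris()

X = iris.data    # feature matrix: one row per flower, one column per measurement
y = iris.target  # integer class label for each row

print(X.shape)             # number of rows and columns
print(iris.feature_names)  # names of the measured dimensions
print(iris.target_names)   # names of the classes
print(np.bincount(y))      # number of records in each class
```

Printing the per-class counts is a quick way to confirm that the records are evenly split across the classes before picking a dimension to model.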