Least-squares curves with NumPy and SciPy
We will now learn how to fit curves to a dataset. In this section, we will investigate the relationship between a vehicle's horsepower and its mpg. From Figure 10.1, we know that the relationship between these two variables is not linear; hence, we will add the square of our feature variable X as an input to the model. This is called polynomial regression. The model is still linear in its coefficients, so we are using a linear model to fit a non-linear relationship.
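To make the idea concrete before we turn to the vehicle data, the following is a minimal sketch of a degree-2 least-squares fit using NumPy alone. The data here is synthetic and noise-free (the values of x and y and the coefficients 2, -3, and 0.5 are made up for illustration and are not taken from the auto dataset). It shows that adding a squared column to the inputs turns curve fitting into an ordinary linear least-squares problem:

```python
import numpy as np

# Synthetic, noise-free data (hypothetical values, not the auto dataset):
# y = 2 - 3x + 0.5x^2
x = np.linspace(0.0, 10.0, 50)
y = 2.0 - 3.0 * x + 0.5 * x**2

# Design matrix with columns [1, x, x^2].
# The model is non-linear in x but linear in its coefficients.
A = np.column_stack([np.ones_like(x), x, x**2])

# Solve the ordinary least-squares problem: minimize ||A @ coeffs - y||^2
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)

# np.polyfit solves the same problem but returns the
# highest-degree coefficient first
poly_coeffs = np.polyfit(x, y, deg=2)

print("lstsq coefficients (intercept, x, x^2):", coeffs)
print("polyfit coefficients (x^2, x, intercept):", poly_coeffs)
```

Because the synthetic data contains no noise, both approaches should recover the generating coefficients up to floating-point error. With real measurements, such as the horsepower and mpg columns, the fitted coefficients would only approximate the underlying trend.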
Here's how we will import the required Python packages and select the feature, X, and the target, y, from the pandas DataFrame, df:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
# Load the dataset into a pandas DataFrame
df = pd.read_csv("auto_dataset.csv")
# Select the variables of interest
X = df["horsepower"]
y = df["mpg"]
#Converting the series...