Tuning a decision tree
We will continue exploring the iris dataset, this time using only the first two features (sepal length and sepal width). Along the way we will optimize the decision tree and create some visualizations.
Getting ready
- Load the iris dataset, focusing on the first two features. Additionally, split the data into training and testing sets:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y)
- View the data with pandas:
import pandas as pd

pd.DataFrame(X, columns=iris.feature_names[:2])
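The stratified split in the first step is meant to keep the class proportions the same in the training and testing sets. A minimal sketch of how that could be checked is shown here; `random_state=0` and the variable names `train_counts` and `test_counts` are assumptions added for this illustration and are not part of the recipe.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]  # sepal length and sepal width only
y = iris.target

# random_state=0 is an assumption, added only so the check is repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Count how many samples of each class landed in each split
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)

print("train:", train_counts)  # 35 samples per class
print("test: ", test_counts)   # 15 samples per class
```

Because iris has 50 samples per class and 30% of the data is held out, stratifying places 15 samples of each class in the test set and 35 in the training set.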

- Before optimizing the decision tree, let's try a single decision tree with default parameters. Instantiate and train a decision tree:
from sklearn.tree import DecisionTreeClassifier

dtc = DecisionTreeClassifier()  # Instantiate the tree with default parameters
dtc.fit(X_train, y_train)
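Before tuning, it can help to see how large the default tree grows, since a default `DecisionTreeClassifier` has no depth limit. The following is a sketch under stated assumptions rather than part of the recipe: `random_state=0` is added for repeatability, and the names `depth`, `n_leaves`, and `train_accuracy` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, :2]
y = iris.target

# random_state=0 is an assumption, used here only for repeatability
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

dtc = DecisionTreeClassifier(random_state=0)
dtc.fit(X_train, y_train)

# Size of the fitted tree: these are the quantities that tuning
# parameters such as max_depth would later constrain
depth = dtc.get_depth()
n_leaves = dtc.get_n_leaves()

# Accuracy on the data the tree was trained on
train_accuracy = dtc.score(X_train, y_train)

print("depth:", depth)
print("leaves:", n_leaves)
print("training accuracy:", train_accuracy)
```

A deep tree with many leaves and a high training accuracy is a hint that the default model may be fitting the training data very closely, which is one motivation for tuning.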
- Measure the accuracy score:
from sklearn.metrics import...