Decision trees - Predicting hiring decisions using Python
It turns out that decision trees are easy to make; in fact, it's crazy just how easy it is, with only a few lines of Python code. So let's give it a try.
I've included a PastHires.csv file with your book materials. It contains some data I made up about candidates who either got a job offer or didn't, along with the attributes of those candidates.
import numpy as np
import pandas as pd
from sklearn import tree

input_file = "c:/spark/DataScience/PastHires.csv"
df = pd.read_csv(input_file, header=0)
You'll want to immediately change the path I used here for my own system (c:/spark/DataScience/PastHires.csv) to wherever you have installed the materials for this book. I'm not sure where you put them, but it's almost certainly not there.
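If you'd rather change the location in just one place, one option is to build the path from a single folder variable and fail with a clear message when the file isn't there. This is only a sketch: the helper name load_past_hires is hypothetical, and the demo writes a throwaway stand-in CSV with invented columns so that it runs anywhere. It is not the real PastHires.csv.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_past_hires(materials_dir):
    """Read PastHires.csv from the given folder into a DataFrame.

    Hypothetical helper for illustration; not part of the book's code.
    """
    csv_path = Path(materials_dir) / "PastHires.csv"
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"Could not find {csv_path}; check where you unpacked the book materials."
        )
    # header=0 tells pandas that the first row holds the column names.
    return pd.read_csv(csv_path, header=0)


# Demo with a stand-in file. These two columns and rows are invented
# purely so the example runs; they are NOT the real file's contents.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "PastHires.csv").write_text("Years Experience,Hired\n10,Y\n0,N\n")
    df = load_past_hires(tmp)

print(df.shape)
```

On your own machine you would skip the temporary folder and pass in the folder where you actually unpacked the materials.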
We will use pandas to read our CSV in and create a DataFrame object out of it. Let's go ahead and run our code, and we can use the head() function on the DataFrame to print...