Loading data from the UCI repository
The first dataset we will load is the Pima Indians diabetes dataset, which requires internet access. The dataset is available thanks to Sigillito, V. (1990), UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data), Johns Hopkins University Applied Physics Laboratory, Laurel, MD.
Note
If you are an open source veteran, your first question is probably: what is the license or permission for this database? This is a very important issue. The UCI repository has a use policy that requires a citation whenever the database is used. We are allowed to use it, but we must give proper credit and provide a citation.
How to do it...
- Go to IPython and import pandas:
import pandas as pd
- Type the web location of the Pima Indians diabetes dataset as a string as follows:
data_web_address = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima...
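As a rough sketch of where these steps are heading, a comma-separated file without a header row can be read into a DataFrame with pd.read_csv. The helper name load_headerless_csv, the column names, and the two sample rows below are all hypothetical, made up for illustration; they are not taken from the Pima dataset. Because pd.read_csv also accepts a URL string, a variable such as data_web_address could be passed as the source in place of the in-memory sample.

```python
import io

import pandas as pd


def load_headerless_csv(source, column_names):
    """Read a comma-separated source that has no header row.

    `source` may be a path, a URL string, or a file-like object;
    `column_names` supplies the labels the file itself lacks.
    """
    return pd.read_csv(source, header=None, names=column_names)


# Made-up rows standing in for a remote file, so the sketch runs offline.
sample = io.StringIO("1,2.5,0\n3,4.5,1\n")

# Hypothetical column names, chosen only for this illustration.
df = load_headerless_csv(sample, ["a", "b", "label"])
print(df.head())
```

Passing header=None keeps pandas from treating the first data row as column labels, which matters for files that ship without a header.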