Introduction to pandas
Almost all of the features we've discussed so far are features of base Python; that is, they require no external packages or libraries. In practice, most of the code we write in this book will rely on one of several external Python packages commonly used for analytics. The pandas library (http://pandas.pydata.org) is an integral part of the later programming chapters. For machine learning, pandas serves three functions:
- Import data from flat files into your Python session
- Wrangle, manipulate, format, and cleanse data using the pandas DataFrame and its library of functions
- Export data from your Python session to flat files
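As a rough preview, the three functions above can be sketched in a few lines. This is a minimal illustration only: the column names (patient_id, age, sbp) and their values are made up for the example, and an in-memory text buffer stands in for a flat file on disk so that the snippet runs without any external files.

```python
import io

import pandas as pd

# Hypothetical flat-file contents (invented for illustration);
# a real session would pass a file path to pd.read_csv instead.
csv_text = """patient_id,age,sbp
1,34,118
2,,135
3,61,142
"""

# 1. Import: parse the comma-separated text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# 2. Wrangle: fill the missing age with the column median,
#    then add a derived Boolean column (the 130 cutoff is arbitrary).
df["age"] = df["age"].fillna(df["age"].median())
df["high_sbp"] = df["sbp"] >= 130

# 3. Export: write the cleaned table back out as comma-separated text.
out = io.StringIO()
df.to_csv(out, index=False)
exported_text = out.getvalue()
```

With real data, the same read_csv and to_csv calls accept a file path in place of the in-memory buffer.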
Let's review each of these functions in turn.
Flat files are a popular way of storing healthcare-related data (along with HL7 formats, which are not covered in this book). A flat file is a text-file representation of data. In a flat file, data can be represented as rows and columns, similar to databases, except that...