Chapter 4. Representing Tabular and Multivariate Data with the DataFrame
The pandas DataFrame object extends the capabilities of the Series object into two dimensions. Instead of a single series of values, each row of a data frame can hold multiple values, each of which is represented as a column. Each row can then model multiple related properties of a subject under observation, and each column can represent a different type of data.
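As a minimal sketch of this idea, the following builds a small data frame in which each row describes one subject and each column holds a different type of data. The column names and values are invented purely for illustration.

```python
import pandas as pd

# Hypothetical observations: one row per subject, one column per property.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cara"],   # text
        "age": [34, 29, 41],              # integers
        "height_m": [1.68, 1.80, 1.75],   # floating-point numbers
    }
)

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # each column keeps its own data type

# Selecting a single column returns a pandas Series.
age_column = df["age"]
print(type(age_column))
```

Printing `df.dtypes` is a quick way to check that each column carries its own type rather than being forced into a single common one.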
Each column of a data frame is a pandas Series, and a data frame can be thought of as resembling a spreadsheet or a database table. But these comparisons do not do the DataFrame justice, because a data frame has distinct qualities specific to pandas, such as automatic data alignment of the Series objects that represent the columns.
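Automatic alignment can be illustrated with a short sketch. The two Series below are hypothetical and deliberately use different, partially overlapping index labels. When they are combined, pandas matches values by label rather than by position and fills in missing values where a label is absent from one side.

```python
import pandas as pd

# Two hypothetical Series whose index labels only partially overlap
# and appear in a different order.
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["c", "b", "d"])

# Building a data frame from the two Series aligns them on their labels.
# Labels missing from one Series become NaN in that column.
df = pd.DataFrame({"first": s1, "second": s2})
print(df)

# Arithmetic aligns by label as well: "b" pairs 2 with 20, "c" pairs 3 with 10.
total = s1 + s2
print(total)
```

Note that the value 10 from `s2` ends up in the row labeled "c", even though it is the first element of `s2`. Position plays no part in how the values are matched.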
This automatic alignment makes a data frame much more capable of exploratory data analysis than spreadsheets or databases. Combined with the ability to slice data simultaneously across both rows...