Python Data Analysis, Second Edition

You're reading from Python Data Analysis, Second Edition Data manipulation and complex data analysis with Python

Product type Paperback

Published in Mar 2017

Publisher Packt

ISBN-13 9781787127487

Length 330 pages

Edition 2nd Edition

Languages

Python

Tools

Scikit-learn

Concepts

Data Analysis

Author (1):

Armando Fandango

Pivot tables

A pivot table, as used in Excel, summarizes data. So far, the data we have seen in this chapter has come from flat CSV files. A pivot table aggregates data from such a flat table for certain columns and rows. The aggregating operation can be a sum, mean, standard deviation, and so on. We will reuse the data-generating code from ch-03.ipynb. The Pandas API has a top-level pivot_table() function and a corresponding DataFrame method. With the aggfunc parameter, we can specify the aggregation function to, say, use the NumPy sum() function. The columns parameter (named cols in early Pandas releases) tells Pandas which column supplies the groups that become the columns of the pivot table; the remaining columns are aggregated within each group. Create a pivot table on the Food column as follows:

print(pd.pivot_table(df, columns=['Food'], aggfunc=np.sum))

The pivot table we get contains totals for each food item:

Food    chocolate   icecream      soup
Number   8.000000  15.000000  19.00000
Price    5.986585  10.440071  13.83338

[2 rows x 3 columns]

The preceding code can be found in ch-03.ipynb in...
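
As a minimal, self-contained sketch of the same idea, the following builds a small DataFrame and pivots it on the Food column. The sample values are hypothetical and are not the data generated in ch-03.ipynb; the values argument is an added assumption that restricts aggregation to the two numeric columns. The string name "sum" is used in place of np.sum; pandas accepts both forms.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df = pd.DataFrame({
    "Food": ["soup", "soup", "icecream", "chocolate",
             "icecream", "icecream", "soup"],
    "Number": [8, 5, 4, 8, 8, 3, 6],
    "Price": [3.5, 4.0, 2.5, 6.0, 3.0, 5.0, 6.5],
})

# Top-level function: each distinct Food value becomes a column,
# and each aggregated column (Number, Price) becomes a row.
totals = pd.pivot_table(df, values=["Number", "Price"],
                        columns=["Food"], aggfunc="sum")

# Equivalent DataFrame method, this time averaging instead of summing.
means = df.pivot_table(values=["Number", "Price"],
                       columns=["Food"], aggfunc="mean")

print(totals)
print(means)
```

Swapping the aggfunc argument is all it takes to move from totals to means or another summary statistic; the shape of the resulting table stays the same.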

Authors (1)

Armando Fandango

Armando Fandango is an accomplished technologist with hands-on capabilities and senior executive level experience with startups and large companies globally. Armando is spearheading Epic Engineering and Consulting Group as Chief Data Scientist. His work spans diverse industries including FinTech, Banking, BioInformatics, Genomics, AdTech, Utilities and Infrastructure, Traffic and Transportation, Energy, Human Resources, and Entertainment. Armando has worked for more than ten years on projects involving Predictive Analytics, Data Science, Machine Learning, Big Data, Product Engineering, and High-Performance Computing. His research interests span machine learning, deep learning, algorithmic game theory, and scientific computing. Armando has authored the book “Python Data Analysis - Second Edition” and published research in international journals and conferences.
