Performing out-of-core computations on large arrays with Dask
Dask is a parallel computing library that offers not only a general framework for distributing complex computations on many nodes, but also a set of convenient high-level APIs to deal with out-of-core computations on large arrays. Dask provides data structures resembling NumPy arrays (dask.array) and pandas DataFrames (dask.dataframe) that efficiently scale to huge datasets. The core idea of Dask is to split a large array into smaller arrays (chunks).
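As a quick illustration of the chunking idea (the array and chunk shapes here are chosen for illustration, not taken from this recipe), a Dask array can be declared with an explicit chunk shape, and reductions are then computed one chunk at a time:

>>> import dask.array as da

    # A large logical array backed by a 4 x 4 grid of small NumPy chunks.
    x = da.ones((10000, 10000), chunks=(2500, 2500))
    x.numblocks
(4, 4)
>>> # Operations are lazy: sum() builds a task graph that compute()
    # executes chunk by chunk, so the full array never needs to fit
    # in memory at once.
    x.sum().compute()
100000000.0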
In this recipe, we illustrate the basic principles of dask.array.
Getting ready
Dask should already be installed in Anaconda, but you can always install it manually with conda install dask. You also need memory_profiler, which you can install with conda install memory_profiler.
How to do it...
Let's import the libraries:
>>> import numpy as np
    import dask.array as da
    import memory_profiler
>>> %load_ext memory_profiler
We initialize a large 10,000 x 10,000 array...
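As a sketch of what such an initialization might look like (the chunk shape of 1,000 x 1,000 is an assumption on our part), dask.array's random module accepts a chunks argument alongside the array size:

>>> # Assumed chunk shape for illustration: a 10 x 10 grid of
    # 1000 x 1000 blocks. Nothing is allocated yet; y is a lazy
    # task graph over 100 chunks.
    y = da.random.normal(size=(10000, 10000), chunks=(1000, 1000))
    y.numblocks
(10, 10)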