Computing the summary statistics of a dataset
Summary statistics give a quick view of several attributes of a dataset:
- Number of observations in the dataset
- Minimum value and maximum value
- Variance of the dataset
- Mean values in the dataset
- Skewness of the dataset
- Kurtosis of the dataset
How to do it...
The summary statistics of a dataset can be obtained using the describe function within scipy.stats.
The process to obtain the summary statistics of a dataset is as follows:
- Import the relevant packages:
import numpy as np
from scipy import stats
- Initialize an array:
a = np.arange(10)
The preceding code initializes a one-dimensional array containing the integers 0 through 9.
- Fetch the summary statistics of the dataset (array):
stats.describe(a)
DescribeResult(nobs=10, minmax=(0, 9), mean=4.5, variance=9.1666666666666661, skewness=0.0, kurtosis=-1.2242424242424244)
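Note that the describe function returns all of the summary statistics in a single result: the number of observations, the minimum and maximum, the mean, the variance, the skewness, and the kurtosis. By default, the variance is the sample variance (computed with ddof=1) and the kurtosis is the excess (Fisher) kurtosis, which is 0 for a normal distribution.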
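The individual statistics can also be read from the result by name rather than from the printed output. The following is a minimal sketch that reuses the array from the steps above; the variable names (result, sample_variance, and so on) are chosen here for illustration only.

```python
import numpy as np
from scipy import stats

a = np.arange(10)
result = stats.describe(a)

# Each statistic in the result can be accessed by its field name.
nobs = result.nobs
minimum, maximum = result.minmax
mean = result.mean
variance = result.variance
skewness = result.skewness
kurtosis = result.kurtosis

# Cross-check with NumPy: describe uses ddof=1 by default,
# so its variance should match the sample variance.
sample_variance = np.var(a, ddof=1)

print(nobs, minimum, maximum, mean)
print(variance, sample_variance)
print(skewness, kurtosis)
```

Accessing fields by name avoids relying on the position of each value in the result and makes it easier to pass a single statistic to later calculations.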
The preceding output is for a one-dimensional dataset...
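For comparison, here is a hedged sketch of how describe might be applied to a two-dimensional array. The array b is a hypothetical example that is not part of the recipe; the axis parameter controls whether the statistics are computed per column or over all values.

```python
import numpy as np
from scipy import stats

# Hypothetical 2-D dataset: 5 observations (rows) of 2 variables (columns).
b = np.arange(10).reshape(5, 2)

# axis=0 (the default) summarizes each column separately.
per_column = stats.describe(b, axis=0)

# axis=None flattens the array and summarizes all values together.
overall = stats.describe(b, axis=None)

print(per_column.nobs, per_column.mean)
print(overall.nobs, overall.mean)
```

With axis=0, each field other than nobs holds one value per column, whereas axis=None yields the same scalar statistics as the one-dimensional case.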