Thursday, October 8, 2020

Array-Oriented Programming with NumPy - 6 (NumPy Calculation Methods)


An array has various methods that perform calculations using its contents. By default, these methods ignore the array’s shape and use all the elements in the calculations. For example, calculating the mean of an array totals all of its elements regardless of its shape, then divides by the total number of elements. You can perform these calculations on each dimension as well. For example, in a two-dimensional array, you can calculate each row’s mean and each column’s mean. Consider an array representing four students’ grades on three exams:

In [1]: import numpy as np

In [2]: grades = np.array([[87, 96, 70], [100, 87, 90],
   ...:                    [94, 77, 90], [100, 81, 82]])
   ...:

In [3]: grades
Out[3]:
array([[ 87,  96,  70],
       [100,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])

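We can use methods to calculate sum, min, max, mean, std (standard deviation) and var (variance). Each is a functional-style programming reduction that collapses the array's elements into a single value. Note that std and var compute the population statistics by default (they divide by the number of elements, not by one less):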

In [4]: grades.sum()
Out[4]: 1054

In [5]: grades.min()
Out[5]: 70

In [6]: grades.max()
Out[6]: 100

In [7]: grades.mean()
Out[7]: 87.83333333333333

In [8]: grades.std()
Out[8]: 8.792357792739987

In [9]: grades.var()
Out[9]: 77.30555555555556
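
To make the reductions above less of a black box, here is a minimal sketch that recomputes the mean, variance and standard deviation by hand and compares them with the array methods. The names manual_mean, manual_var, manual_std and sample_var are ours, introduced only for illustration. The last line assumes you want the sample variance instead, which NumPy provides through the ddof keyword argument.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])

# mean: total of all elements divided by the number of elements
manual_mean = grades.sum() / grades.size

# var: average squared deviation from the mean (population variance, ddof=0)
manual_var = ((grades - manual_mean) ** 2).sum() / grades.size

# std: square root of the variance
manual_std = np.sqrt(manual_var)

print(np.isclose(manual_mean, grades.mean()))  # True
print(np.isclose(manual_var, grades.var()))    # True
print(np.isclose(manual_std, grades.std()))    # True

# ddof=1 divides by (size - 1) instead of size, giving the sample variance
sample_var = grades.var(ddof=1)
```

All three comparisons use the whole array, so the shape of grades plays no part in the result.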

Calculations by Row or Column

Many calculation methods can be performed on specific array dimensions, known as the array’s axes. These methods receive an axis keyword argument that specifies which dimension to use in the calculation, giving you a quick way to perform calculations by row or column in a two-dimensional array.

Assume that you want to calculate the average grade on each exam, represented by the columns of grades. Specifying axis=0 performs the calculation on all the row values within each column:

In [10]: grades.mean(axis=0)
Out[10]: array([95.25, 85.25, 83.  ])

So 95.25 above is the average of the first column’s grades (87, 100, 94 and 100), 85.25 is the average of the second column’s grades (96, 87, 77 and 81) and 83 is the average of the third column’s grades (70, 90, 90 and 82). Again, NumPy does not display trailing 0s to the right of the decimal point in '83.'. Also note that it does display all element values in the same field width, which is why '83.' is followed by two spaces.
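
One way to picture axis=0 is as a loop over the columns, averaging the four row values in each one. The sketch below spells that loop out and checks it against the method call. The name by_hand is ours and not part of NumPy.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])

# Slice out each column (all rows, one column) and average its values
by_hand = np.array([grades[:, col].mean() for col in range(grades.shape[1])])

exam_means = grades.mean(axis=0)

print(np.array_equal(by_hand, exam_means))  # True
print(exam_means.shape)                     # (3,)
```

The calculation collapses axis 0 (the rows), so the four-by-three array yields three values, one per exam.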

Similarly, specifying axis=1 performs the calculation on all the column values within each individual row. To calculate each student’s average grade for all exams, we can use:

In [11]: grades.mean(axis=1)
Out[11]: array([84.33333333, 92.33333333, 87.        , 87.66666667])

This produces four averages, one for the values in each row. So 84.33333333 is the average of row 0’s grades (87, 96 and 70), and the other averages are for the remaining rows. NumPy arrays have many more calculation methods. For the complete list, see https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
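
As a brief illustration of those other methods, the sketch below applies a few of them to grades with the same axis argument. The variable names are ours, chosen only to describe what each result represents.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])

# Highest grade each student earned (one value per row)
best_per_student = grades.max(axis=1)

# Column index of each student's highest grade (first match on ties)
best_exam_index = grades.argmax(axis=1)

# Lowest grade on each exam (one value per column)
lowest_per_exam = grades.min(axis=0)

# Running total of each student's grades across the exams
running_total = grades.cumsum(axis=1)

print(best_per_student.tolist())  # [96, 100, 94, 100]
print(best_exam_index.tolist())   # [1, 0, 0, 0]
print(lowest_per_exam.tolist())   # [87, 77, 70]
print(running_total[0].tolist())  # [87, 183, 253]
```

Unlike the reductions, cumsum keeps the array's shape, so running_total is still four rows by three columns.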
