NumPy broadcasting describes how arithmetic works between arrays of different shapes. It is a powerful feature, but one that is easily misunderstood, even by experienced users. In this article, I will introduce you to NumPy broadcasting in Python.
The simplest example of NumPy broadcasting occurs when combining a scalar value with an array:
import numpy as np

arr = np.arange(5)
print(arr)
print(arr * 4)
[0 1 2 3 4]
[ 0 4 8 12 16]
Here we say that the scalar value 4 has been broadcast to all of the other elements in the multiplication operation. As another example, we can demean each column of an array by subtracting the column means. In this case, it is very simple:
arr = np.random.randn(4, 3)
print(arr.mean(0))
demeaned = arr - arr.mean(0)
print(demeaned)
[-0.45050876 -0.10209558 -0.72489256]
[[-0.54139218 -0.50629598 -0.72621479]
[-0.30355515 0.06636598 -0.00474954]
[-0.68277834 0.09940342 0.99876666]
[ 1.52772568 0.34052658 -0.26780233]]
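To make the column demeaning more concrete, the sketch below spells out what broadcasting does conceptually: the row of column means behaves as if it were repeated once per row of the array. The seeded generator and the variable names `col_means`, `stretched`, and `explicit` are assumptions made for illustration, not part of the example above.

```python
import numpy as np

# Seeded generator so the run is reproducible (an assumption for this sketch).
rng = np.random.default_rng(0)
arr = rng.standard_normal((4, 3))

col_means = arr.mean(0)                            # shape (3,)
stretched = np.broadcast_to(col_means, arr.shape)  # read-only view, shape (4, 3)

# Broadcasting gives the same result as subtracting the explicitly repeated means.
demeaned = arr - col_means
explicit = arr - stretched

print(stretched.shape)                     # (4, 3)
print(np.allclose(demeaned, explicit))     # True
print(np.allclose(demeaned.mean(0), 0.0))  # True: every column now has mean ~0
```

NumPy does not actually copy the means four times; `np.broadcast_to` returns a view, which is part of why broadcasting is cheap.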
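See the image below for an illustration of this operation. Demeaning the rows as a broadcast operation requires a little more care. Fortunately, lower-dimensional values can be broadcast across any dimension of an array as long as you follow the rules. This brings us to the broadcasting rule: two arrays are compatible for broadcasting if, for each trailing dimension (that is, starting from the end), the axis lengths match or one of the lengths is 1. Broadcasting is then performed over the missing or length-1 dimensions.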
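Compatibility depends only on shapes: comparing from the last axis backward, each pair of lengths must be equal or contain a 1. As a minimal sketch, the hypothetical helper `compatible` below wraps `np.broadcast_shapes`, which is assumed to be available (NumPy 1.20 or later); the shapes are chosen only for illustration.

```python
import numpy as np

def compatible(shape_a, shape_b):
    """Return the broadcast result shape, or None if the shapes are incompatible."""
    try:
        return np.broadcast_shapes(shape_a, shape_b)
    except ValueError:
        return None

print(compatible((4, 3), (3,)))       # (4, 3): trailing lengths 3 and 3 match
print(compatible((4, 3), (4, 1)))     # (4, 3): the length-1 axis is stretched
print(compatible((4, 3), (4,)))       # None: trailing lengths 3 and 4 differ
print(compatible((3, 4, 2), (4, 1)))  # (3, 4, 2): missing leading axis is added
```

Checking shapes this way is handy when drawing a picture feels like too much work, since no array data is needed.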
Even experienced NumPy users often have to stop, draw a picture, and think through the broadcasting rule. Take the last example and suppose we want to subtract the mean of each row instead.
Since arr.mean(0) has length 3, it is compatible for broadcasting across axis 0 because the trailing dimension in arr is 3 and therefore matches. According to the rule, to subtract over axis 1 (that is, subtract the row mean from each row), the smaller array must have shape (4, 1):
print(arr)
row_means = arr.mean(1)
print(row_means.reshape((4, 1)))
[[-0.99190094 -0.60839157 -1.45110735]
[-0.7540639 -0.0357296 -0.72964211]
[-1.1332871 -0.00269216 0.2738741 ]
[ 1.07721692 0.238431 -0.9926949 ]]
Has your head exploded again? See the image below for an illustration of this operation.
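Reshaping is not the only way to get the row means into shape (4, 1). The sketch below compares three spellings that should give the same column; the stand-in data and the names `a`, `b`, and `c` are assumptions for illustration.

```python
import numpy as np

# Stand-in data with easy-to-check row means (an assumption for this sketch).
arr = np.arange(12, dtype=float).reshape((4, 3))

row_means = arr.mean(1)          # shape (4,)

a = row_means.reshape((4, 1))    # explicit reshape, as in the article
b = row_means[:, np.newaxis]     # insert a length-1 axis by indexing
c = arr.mean(1, keepdims=True)   # keep the reduced axis with length 1

print(a.shape, b.shape, c.shape)               # (4, 1) (4, 1) (4, 1)
print(np.allclose(a, b) and np.allclose(b, c)) # True
```

The `keepdims=True` form avoids hard-coding the number of rows, which can be convenient when the array size is not known in advance.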
NumPy Broadcasting Over Other Axes
Broadcasting with higher-dimensional arrays can seem even more confusing, but it is really just a matter of following the rules. If you don't, you will get an error like this:
arr - arr.mean(1)
ValueError Traceback (most recent call last)
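The failure can be reproduced and inspected without reading a full traceback. This is a minimal sketch with placeholder data; the `message` variable is introduced here only for illustration.

```python
import numpy as np

arr = np.zeros((4, 3))    # placeholder data; only the shapes matter here
row_means = arr.mean(1)   # shape (4,)

message = None
try:
    arr - row_means       # trailing lengths 3 and 4 differ, so this fails
except ValueError as exc:
    message = str(exc)
    print(arr.shape, row_means.shape)
    print(message)
```

Note that the error appears only because 4 and 3 differ. With a square array such as shape (4, 4), the same expression would run without complaint and subtract along the wrong axis, so it is worth checking shapes even when no error is raised.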
It is quite common to want to perform an arithmetic operation with a lower-dimensional array across axes other than axis 0. According to the broadcasting rule, the “broadcast dimensions” must be 1 in the smaller array. In the row demeaning example above, this meant reshaping the row means to have shape (4, 1) instead of (4,):
demeaned = arr - row_means.reshape((4, 1))
print(demeaned.mean(1))
print(arr - arr.mean(1).reshape((4, 1)))
[ 3.70074342e-17 -7.40148683e-17 0.00000000e+00 0.00000000e+00]
[[ 0.02523235 0.40874172 -0.43397407]
[-0.24758537 0.47074894 -0.22316357]
[-0.84591871 0.28467623 0.56124249]
[ 0.96956591 0.13077999 -1.1003459 ]]
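The same idea carries over to three dimensions. As a sketch under assumed data (the array `arr3d` and its shape are made up for illustration), demeaning over axis 1 means putting a length-1 axis back in position 1 before subtracting:

```python
import numpy as np

# Hypothetical three-dimensional data with shape (2, 3, 4).
arr3d = np.arange(24, dtype=float).reshape((2, 3, 4))

means = arr3d.mean(1)                        # shape (2, 4): axis 1 was reduced away
demeaned = arr3d - means[:, np.newaxis, :]   # (2, 3, 4) minus (2, 1, 4)

print(means.shape)                           # (2, 4)
print(demeaned.shape)                        # (2, 3, 4)
print(np.allclose(demeaned.mean(1), 0.0))    # True
```

Subtracting `means` directly would fail here, because the shapes (2, 3, 4) and (2, 4) disagree on the second-to-last axis (3 versus 2).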
I hope you liked this article on NumPy broadcasting in Python. Feel free to ask questions in the comments section below.