NumPy Broadcasting in Python

In Python, NumPy Broadcasting describes how arithmetic works between arrays of different shapes. This is a very powerful feature, but one that can be easily misunderstood, even by experienced users. In this article, I will introduce you to NumPy Broadcasting in Python.

NumPy Broadcasting

The simplest example of Numpy Broadcasting occurs when combining a scalar value with an array:

Also, Read – Image Features Extraction with Machine Learning.

import numpy as np
arr = np.arange(5)
print(arr)
print(arr * 4)Code language: PHP (php)
[0 1 2 3 4]
[ 0 4 8 12 16]

We say here that the scalar value 4 has been broadcast to all other elements of the multiplication operation. For example, we can demean each column in a table by subtracting the column means. In this case, it’s very simple:

arr = np.random.randn(4, 3)
print(arr.mean(0))

demeaned = arr - arr.mean(0)
print(demeaned)Code language: PHP (php)

[-0.45050876 -0.10209558 -0.72489256]

[[-0.54139218 -0.50629598 -0.72621479]
[-0.30355515 0.06636598 -0.00474954]
[-0.68277834 0.09940342 0.99876666]
[ 1.52772568 0.34052658 -0.26780233]]

See the image below for an illustration of this operation. Debasing the lines as a broadcast operation requires a little more care. Fortunately, it is possible to broadcast potentially lower-dimensional values ​​to any dimension in an array as long as you follow the rules. This brings us to:

NumPy Broadcasting

Even as an experienced NumPy user, you often have to stop to draw pictures and think about the broadcast rule. Let’s take the last example and suppose we want to subtract the mean value of each row instead.

As arr.mean (0) has length 3, it is compatible for scattering through axis 0 because the end dimension in arr is 3 and therefore matches. According to the rules, to subtract on axis 1 (i.e. subtract the mean of each row), the smallest array must have a form (4, 1):

print(arr)
row_means = arr.mean(1)
print(row_means.reshape((4,1)))Code language: PHP (php)

[[-0.99190094 -0.60839157 -1.45110735]

[-0.7540639 -0.0357296 -0.72964211]
[-1.1332871 -0.00269216 0.2738741 ]
[ 1.07721692 0.238431 -0.9926949 ]]

[[-1.01713329]
[-0.50647854]
[-0.28736839]
[ 0.10765101]]

Has your head exploded again? See the image below for an illustration of this operation.

image for post

NumPy Broadcasting Over Other Axes

The NumPy Broadcasting arrays with high dimensional arrays may look even more complex but you are required to just follow the rules. Otherwise, you will get an error:

 arr - arr.mean(1)Code language: CSS (css)
ValueError Traceback (most recent call last)

It is quite common to want to perform an arithmetic operation with an array of smaller dimension on axes other than the 0 axis. According to the NumPy broadcasting rule, “Broadcasting dimensions” should be 1 in the smallest array. In the row degradation example above, this meant reshaping the row means to format (4, 1) instead of (4,):

demeaned = arr - row_means.reshape((4,1))
print(demeaned.mean(1))

print(arr - arr.mean(1).reshape((4,1)))
Code language: PHP (php)

[ 3.70074342e-17 -7.40148683e-17 0.00000000e+00 0.00000000e+00]

[[ 0.02523235 0.40874172 -0.43397407]
[-0.24758537 0.47074894 -0.22316357]
[-0.84591871 0.28467623 0.56124249]
[ 0.96956591 0.13077999 -1.1003459 ]]

I hope you liked this article on NumPy Broadcasting in Python. Feel free to ask your valuable questions in the comments section below.

Also, Read – A/B Testing in Machine Learning.

Follow Us:

Aman Kharwal
Aman Kharwal

I'm a writer and data scientist on a mission to educate others about the incredible power of data📈.

Articles: 1537

Leave a Reply