Parallelization in Python – Getting the most out of your CPU

Parallelization means distributing tasks across several workers (CPUs). The workers execute the code at the same time and can thus speed up the algorithm.

Take, for example, a for loop from 1 to 5 run with 3 CPUs. Each CPU runs the loop body, but each on a different iteration.

The first CPU takes iteration 1, the second iteration 2 and the third iteration 3.

As soon as a CPU has finished its work, it takes the next iteration.

The task is thus parallelized and, when each iteration involves enough work, the algorithm can be much faster!

In this article we will parallelize our Python code with the multiprocessing module from the standard library.
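The loop example above can be sketched with multiprocessing as follows. This is only an illustrative sketch: run_iteration and distribute are hypothetical names, and the loop body simply reports which worker process (identified by its PID) handled each iteration.

```python
import os
from multiprocessing import Pool


def run_iteration(i):
    # Hypothetical stand-in for the loop body: return the iteration
    # number together with the PID of the process that ran it.
    return i, os.getpid()


def distribute(iterations, workers=3):
    # chunksize=1 hands out one iteration at a time, so a worker that
    # finishes early picks up the next pending iteration.
    with Pool(processes=workers) as pool:
        return pool.map(run_iteration, iterations, chunksize=1)


if __name__ == "__main__":
    for i, pid in distribute(range(1, 6)):
        print(f"iteration {i} ran in process {pid}")
```

The PIDs differ from one run to the next, and nothing guarantees that all 3 workers end up being used for such a tiny task.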

Number of CPUs available

Before we can start parallelizing our execution, we need to know how many CPUs we can use.

To do so, we use the cpu_count() function:

import multiprocessing

print(multiprocessing.cpu_count())

Output: the number of CPUs in your machine.
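Note that cpu_count() reports the CPUs in the machine, which is not always the number the current process is allowed to use (for example inside a container or under CPU pinning). The sketch below shows one way to check this. usable_cpu_count is a hypothetical helper name, and os.sched_getaffinity is only available on some Unix systems such as Linux, hence the fallback.

```python
import multiprocessing
import os


def usable_cpu_count():
    # os.sched_getaffinity is not available on every platform,
    # so check for it before calling it.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: the number of CPUs in the machine.
    return multiprocessing.cpu_count()


if __name__ == "__main__":
    print("CPUs in the machine:", multiprocessing.cpu_count())
    print("CPUs this process may use:", usable_cpu_count())
```

On a typical desktop both numbers are the same. They may differ on shared servers or in containers.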


Once we know our number of CPUs, we can finally parallelize.

We will use Pool. A Pool is a group of worker processes in which our code will be executed.

Each worker process in the Pool can run on one of our CPUs.

import multiprocessing
from multiprocessing import Pool

def f(x):
  return x

if __name__ == "__main__":
  # One worker process per available CPU
  with Pool(processes=multiprocessing.cpu_count()) as pool:
    for i in pool.imap(f, range(10)):
      print(i, end=" ")

Output: 0 1 2 3 4 5 6 7 8 9

In the above code, the processes argument sets the number of worker processes in the Pool. Here we give it the number of available CPUs.

Then, with the imap() function, we give the Pool the task of applying the f function to the numbers from 0 to 9.

The work is distributed among the worker processes.

The first worker applies the function f to one of the numbers while, at the same time, the second one applies f to another number, and so on.

As soon as a worker finishes a task, it applies the function to the next available number!

Is it really working?

But then how do we know that parallelization really works?

After all, the numbers in the code output are in ascending order. Maybe the execution is simply not parallelized?

Well, to find out we’ll use the imap_unordered() function.

This function returns each result as soon as it is ready, rather than in the order of the inputs. This lets us see that the CPUs are working in parallel:

from multiprocessing import Pool

def f(x):
  return x

if __name__ == "__main__":
  with Pool(processes=4) as pool:
    for i in pool.imap_unordered(f, range(10)):
      print(i, end=" ")

Output (one possible order, it changes from run to run): 0 2 3 1 6 7 8 9 5 4

Here, we can see that the output is not in ascending order. This means that the code has not been executed sequentially but in parallel!
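With a function as quick as f, the results may still come back in ascending order on some runs. To make the effect easier to observe, here is a small sketch in which the smaller inputs deliberately take longer. slow_identity and completion_order are hypothetical names, and the sleep durations are arbitrary.

```python
import time
from multiprocessing import Pool


def slow_identity(x):
    # Hypothetical variant of f: smaller inputs sleep longer,
    # so they tend to finish after the larger ones.
    time.sleep(0.05 * (5 - x))
    return x


def completion_order(workers=4):
    # imap_unordered yields each result as soon as a worker finishes it.
    with Pool(processes=workers) as pool:
        return list(pool.imap_unordered(slow_identity, range(5)))


if __name__ == "__main__":
    print(completion_order())
```

The exact order depends on timing and on the machine, but every input appears exactly once in the result.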

Parallelization is a technique to be used only in specific cases.

Indeed, for a task as simple as our f function, parallelization is not useful.

On the contrary, it takes more time than a sequential execution, because creating the worker processes and passing data to them takes considerable time.

However, it can be efficient in many cases where the calculation to be performed is complex.
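One way to check whether parallelization pays off is to time both versions on your own workload. The sketch below assumes a hypothetical CPU-bound function, heavy, and an arbitrary input size. The timings it prints depend entirely on your machine, so none are quoted here.

```python
import time
from multiprocessing import Pool


def heavy(n):
    # Hypothetical CPU-bound task: sum of the squares below n.
    return sum(i * i for i in range(n))


def run_sequential(inputs):
    return [heavy(n) for n in inputs]


def run_parallel(inputs, workers=4):
    with Pool(processes=workers) as pool:
        return pool.map(heavy, inputs)


if __name__ == "__main__":
    inputs = [1_000_000] * 8

    start = time.perf_counter()
    sequential = run_sequential(inputs)
    print(f"sequential: {time.perf_counter() - start:.2f} s")

    start = time.perf_counter()
    parallel = run_parallel(inputs)
    print(f"parallel:   {time.perf_counter() - start:.2f} s")

    print("same results:", sequential == parallel)
```

If you replace heavy with a trivial function such as f, the parallel version will usually be the slower one, since the overhead outweighs the work.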

To determine whether you need parallelization, don’t hesitate to refer to our article on the execution time of your algos!

Tom Keldenich
Data Engineer & passionate about Artificial Intelligence !

Founder of the website Inside Machine Learning
