Parallelization in Python – Getting the most out of your CPU

Parallelization means distributing tasks across several workers (CPUs). These workers execute the code simultaneously and thus speed up the algorithm.

Take, for example, a for loop from 1 to 5 running on 3 CPUs. Each CPU runs the loop body, but each one on a different iteration.

The first CPU takes iteration 1, the second iteration 2, and the third iteration 3.

As soon as a CPU has finished its work, it takes the next pending iteration.

The task is thus parallelized, and the algorithm can run much faster!
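To make this scheduling concrete, here is a minimal sketch that simulates it in plain Python, without any real parallelism. It assumes that every iteration takes one unit of time; the schedule function and its arguments are names chosen for this illustration only.

```python
import heapq

def schedule(durations, n_workers):
    """Give each task, in order, to whichever worker becomes free first."""
    # Heap of (time_when_free, worker_id); every worker is free at t = 0.
    free_at = [(0.0, worker) for worker in range(n_workers)]
    heapq.heapify(free_at)
    assignment = []
    for task, duration in enumerate(durations, start=1):
        start, worker = heapq.heappop(free_at)
        assignment.append((task, worker, start))
        heapq.heappush(free_at, (start + duration, worker))
    # The whole job ends when the last worker becomes free.
    total = max(t for t, _ in free_at)
    return assignment, total

# Assumption: 5 iterations of 1 time unit each, shared by 3 workers.
assignment, total = schedule([1, 1, 1, 1, 1], 3)
for task, worker, start in assignment:
    print(f"iteration {task} -> worker {worker} at t={start}")
print("total time:", total)
```

Under these assumptions, the five iterations finish in 2 time units instead of 5. The gain depends on how evenly the work divides among the workers.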

In this article, we will parallelize our Python code using the multiprocessing module from the standard library.

Number of CPUs available

Before we can start parallelizing our execution, we need to know how many CPUs we can use.

To do so, we use the cpu_count() function:

import multiprocessing

print(multiprocessing.cpu_count())

Output: the number of logical CPUs on your machine.
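Note that cpu_count() reports the logical CPUs of the whole machine, which can be more than your process is allowed to use, for instance when a CPU affinity mask restricts it. The sketch below compares the two numbers. The usable_cpus helper is a hypothetical name introduced here, and os.sched_getaffinity() only exists on some Unix platforms such as Linux, hence the fallback.

```python
import multiprocessing
import os

def usable_cpus():
    """Return how many CPUs the current process may run on (hypothetical helper)."""
    # sched_getaffinity is only available on some Unix platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: os.cpu_count() may return None, so default to 1.
    return os.cpu_count() or 1

print("logical CPUs:", multiprocessing.cpu_count())
print("usable CPUs: ", usable_cpus())
```

When the two values differ, the smaller one is usually the better choice for the number of worker processes.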

Parallelization

Once we know our number of CPUs, we can finally parallelize.

We will use Pool. A Pool is a group of worker processes that execute our code.

Each worker process in the Pool can run on one of our CPUs.

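import multiprocessing
from multiprocessing import Pool

def f(x):
    # Trivial task: return the input unchanged.
    return x

# The guard is needed on platforms that start workers by re-importing
# this file (for example Windows and macOS).
if __name__ == "__main__":
    # One worker process per available CPU.
    with Pool(processes=multiprocessing.cpu_count()) as pool:
        # imap() yields the results in the same order as the inputs.
        for i in pool.imap(f, range(10)):
            print(i)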

Output: 0 1 2 3 4 5 6 7 8 9

In the code above, the processes argument sets the number of worker processes in the Pool. Here we give it the number of available CPUs.

Then, with the imap() function, we hand the Pool the task of applying the f function to each number from 0 to 9.

The work is distributed among the worker processes.

The first worker applies the function f to one of the numbers while, at the same time, the second worker applies f to another one, and so on.

As soon as a worker finishes a task, it applies the function to the next iteration!
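One way to observe this distribution is to have each task report the ID of the process that ran it. The following is a minimal sketch: tag_with_pid is a hypothetical function written for this illustration, and the short sleep is an assumption that simulates real work so that no single worker grabs every task.

```python
import os
import time
from multiprocessing import Pool

def tag_with_pid(x):
    """Return the input together with the ID of the process that handled it."""
    time.sleep(0.1)  # assumption: simulate a task that takes some time
    return x, os.getpid()

if __name__ == "__main__":
    # 3 workers for 5 iterations, as in the example from the introduction.
    with Pool(processes=3) as pool:
        for x, pid in pool.imap(tag_with_pid, range(1, 6)):
            print(f"iteration {x} handled by process {pid}")
```

The process IDs differ from one run to the next, but the same few IDs should come back several times, since each worker picks up a new iteration once it is free.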

Is it really working?

But how do we know that the parallelization really works?

After all, the numbers in the output appear in ascending order. Maybe the execution is simply not parallelized?

To find out, we'll use the imap_unordered() function.

This function yields each result as soon as it is ready instead of preserving the input order, so we can see that the CPUs are working in parallel:

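from multiprocessing import Pool

def f(x):
    # Trivial task: return the input unchanged.
    return x

# The guard is needed on platforms that start workers by re-importing
# this file (for example Windows and macOS).
if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # imap_unordered() yields each result as soon as it is ready.
        for i in pool.imap_unordered(f, range(10)):
            print(i)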

Output (one possible order, it varies between runs): 0 2 3 1 6 7 8 9 5 4

Here, we can see that the output is not in ascending order. This means that the code was not executed sequentially but in parallel!
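With a function as fast as f, the results may still come back in ascending order on some runs. The sketch below makes the difference easier to see by giving each task a different duration. The wait_then_return function and its sleep times are assumptions made for this illustration.

```python
import time
from multiprocessing import Pool

def wait_then_return(x):
    """Hypothetical task: smaller inputs sleep longer, so they finish later."""
    time.sleep((4 - x) * 0.1)
    return x

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # imap() always yields results in input order: [0, 1, 2, 3].
        ordered = list(pool.imap(wait_then_return, range(4)))
        # imap_unordered() yields results in completion order,
        # likely [3, 2, 1, 0] here, though that order is not guaranteed.
        unordered = list(pool.imap_unordered(wait_then_return, range(4)))
    print("imap:          ", ordered)
    print("imap_unordered:", unordered)
```

Both calls run the tasks in parallel. Only the order in which the results are handed back differs.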

Parallelization is a technique to use only in specific cases.

Indeed, for a task as simple as our f function, parallelization is not useful.

On the contrary, it takes more time than a sequential execution, because creating the worker processes and passing data between them has a significant cost.

However, it is efficient in many cases where the computation to perform is expensive.
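A rough way to check this on your own machine is to time both versions. The following is a minimal sketch, not a rigorous benchmark: heavy, timed and run_parallel are hypothetical names, and the input size and the 4 worker processes are arbitrary assumptions. The parallel timing includes the cost of creating the Pool.

```python
import time
from multiprocessing import Pool

def heavy(n):
    """Hypothetical CPU-bound task: sum of the squares below n."""
    return sum(i * i for i in range(n))

def timed(label, fn):
    """Run fn(), print how long it took, and return its result."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.3f} s")
    return result

def run_parallel(inputs, processes):
    # Creating the Pool happens inside the timed call on purpose.
    with Pool(processes=processes) as pool:
        return pool.map(heavy, inputs)

if __name__ == "__main__":
    inputs = [1_000_000] * 8  # assumption: 8 identical, fairly heavy tasks
    serial = timed("sequential", lambda: [heavy(n) for n in inputs])
    parallel = timed("parallel", lambda: run_parallel(inputs, 4))
    assert serial == parallel  # same results, only the timing differs
```

Replacing heavy with a trivial function such as f should show the opposite trend, with the sequential version winning. The exact timings depend on your machine.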

To determine whether you need parallelization, don't hesitate to refer to our article on the execution time of your algos!

Tom Keldenich

Artificial Intelligence engineer and data enthusiast!

Founder of the website Inside Machine Learning
