Parallelization in Python – Getting the most out of your CPU

Parallelization means distributing tasks across several workers (CPUs). These workers execute the code at the same time, which speeds up the algorithm.

For example, take a for loop from 1 to 5 running on 3 CPUs. Each CPU runs the loop body, but each one on a different iteration.

The first CPU takes iteration 1, the second iteration 2 and the third iteration 3.

Once a CPU has finished its work, it immediately takes the next iteration.

The task is thus parallelized and the algorithm can run considerably faster!

In this article, we will parallelize our Python code using the multiprocessing library.
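The loop example above can be sketched as follows. This is only an illustration: the names work and run_loop are hypothetical helpers, and each iteration reports the ID of the process that ran it. On a task this fast, one process may end up handling several or even all of the iterations.

```python
import os
from multiprocessing import Pool


def work(iteration):
    # Return the iteration together with the ID of the process that ran it.
    return iteration, os.getpid()


def run_loop(n_workers=3):
    # Three worker processes share the five iterations of the loop.
    with Pool(processes=n_workers) as pool:
        return pool.map(work, range(1, 6))


if __name__ == "__main__":
    for iteration, pid in run_loop():
        print(f"iteration {iteration} ran in process {pid}")
```

Running it a few times should show the iterations spread over at most three different process IDs.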

Number of CPUs available

Before we can start parallelizing our execution, we need to know how many CPUs we can use.

To do so, we use the cpu_count() function:

import multiprocessing

multiprocessing.cpu_count()

Output: the number of logical CPUs on your machine.
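Note that cpu_count() reports the CPUs present in the system, which is not always the number your process is allowed to use (for example inside a container or under a job scheduler). A minimal sketch of how one might check both, assuming a platform where os.sched_getaffinity() exists (it is not available everywhere, hence the fallback):

```python
import multiprocessing
import os

# Logical CPUs present on the machine.
total = multiprocessing.cpu_count()

# CPUs this process is allowed to run on, when the platform exposes it.
if hasattr(os, "sched_getaffinity"):
    usable = len(os.sched_getaffinity(0))
else:
    # Fallback (assumption): treat every CPU as usable.
    usable = total

print(f"{total} CPUs in the system, {usable} usable by this process")
```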

Parallelization

Once we know our number of CPUs, we can finally parallelize.

We will use Pool. A Pool is a group of worker processes that will execute our code.

Each worker process in the Pool can run on one of our CPUs.

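import multiprocessing
from multiprocessing import Pool

def f(x):
  # Trivial task: return the input unchanged
  return x

# Guard needed on platforms that start workers by importing this file, such as Windows
if __name__ == "__main__":
  # Start one worker process per available CPU
  with Pool(processes=multiprocessing.cpu_count()) as pool:
    # imap() yields the results in input order
    for i in pool.imap(f, range(10)):
      print(i)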

Output: 0 1 2 3 4 5 6 7 8 9

In the above code, we first set the number of worker processes in the Pool. Here we set it to the number of available CPUs.

Then, with the imap() function, we give the Pool the task of applying the f function to the numbers from 0 to 9.

The work is then distributed among the worker processes.

The first worker applies the function f to one of the numbers while, at the same time, the second one applies f to another, and so on.

As soon as a worker finishes a task, it applies the function to the next item!

Is it really working?

But then how do we know that parallelization really works?

After all, the numbers in the code output are in ascending order. Maybe the execution is simply not parallelized?

Well, to find out, we'll use the imap_unordered() function.

This function yields the results as soon as they are ready, in whatever order the workers finish, rather than in input order. This lets us see that the CPUs are working in parallel:

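from multiprocessing import Pool

def f(x):
  # Trivial task: return the input unchanged
  return x

# Guard needed on platforms that start workers by importing this file, such as Windows
if __name__ == "__main__":
  # Start four worker processes
  with Pool(processes=4) as pool:
    # imap_unordered() yields each result as soon as it is ready
    for i in pool.imap_unordered(f, range(10)):
      print(i)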

Output (one possible run, the order varies): 0 2 3 1 6 7 8 9 5 4

Here, we can see that the output is not in ascending order. This means that the code was not executed sequentially but in parallel!
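With a function as fast as f, the output of imap_unordered() can still come back in ascending order on some runs. A sketch that makes the effect easier to observe, assuming an artificial delay added with time.sleep() (slow_identity and completion_order are hypothetical names introduced for the illustration):

```python
import time
from multiprocessing import Pool


def slow_identity(x):
    # Smaller inputs sleep longer, so they tend to finish later.
    time.sleep(0.05 * (5 - x))
    return x


def completion_order(n_workers=3):
    with Pool(processes=n_workers) as pool:
        # Each result is yielded as soon as it is ready.
        return list(pool.imap_unordered(slow_identity, range(6)))


if __name__ == "__main__":
    print(completion_order())
```

The printed list always contains the numbers 0 to 5, but usually not in ascending order, and the exact order can change from one run to the next.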

Parallelization is a technique to be used only in specific cases.

Indeed, for a task as simple as our f function, parallelization is not useful.

On the contrary, it takes more time than a sequential execution, because creating the worker processes and passing data to and from them has a considerable cost.

However, it can be efficient in the many cases where the computation to be performed is heavy.
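One way to check this on your own machine is to time both versions. The sketch below is an illustration under assumptions: heavy is a made-up CPU-bound function, and the input size and number of workers are arbitrary. The timings depend entirely on your hardware, so none are shown here.

```python
import time
from multiprocessing import Pool


def heavy(n):
    # A deliberately CPU-bound task: sum of the squares below n.
    return sum(i * i for i in range(n))


def run_serial(inputs):
    return [heavy(n) for n in inputs]


def run_parallel(inputs, n_workers=4):
    with Pool(processes=n_workers) as pool:
        return pool.map(heavy, inputs)


if __name__ == "__main__":
    inputs = [2_000_000] * 8

    start = time.perf_counter()
    serial = run_serial(inputs)
    print(f"serial:   {time.perf_counter() - start:.2f} s")

    start = time.perf_counter()
    parallel = run_parallel(inputs)
    print(f"parallel: {time.perf_counter() - start:.2f} s")

    # Both approaches must produce the same results.
    assert serial == parallel
```

Shrinking the inputs to very small values should make the parallel version the slower one, since the cost of starting the workers then dominates.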

To determine whether you need parallelization, don't hesitate to refer to our article on the execution time of your algorithms!

Tom Keldenich

Data Engineer and passionate about Artificial Intelligence!

Founder of the website Inside Machine Learning
