⚡️ Blazing Python 🐍 Scripts with Concurrency ⚡️

CED - May 27 '19 - Dev Community

To ensure the high performance of an application, programmers need to write efficient code. Code efficiency is directly linked with algorithmic efficiency and the speed of runtime execution for software.

The 🐍 Python language is a 🐌 slow language compared to C or FORTRAN, and to combat this, various methods have been developed to help speed it up.

One method is CONCURRENCY.

CONCURRENCY 🤷‍♂️

Concurrency is when two tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant (that would be parallelism). Still confused 😕? Let's paint a scenario:

We are planning a dream wedding for a particular couple. At our disposal, we have Mary, Susan, Max, and Simon. A cake, band, decorations, and invitation cards are needed for the dream wedding. We assign the baking of the cake to Susan, the hiring of a band to Simon, the setting up of decorations to Mary and the sending of invitations to Max.

These four friends (or processors) all perform their tasks (or processes) at the same time, without task switching or interruptions, until they are done. This is, in layman's terms, what is referred to as CONCURRENCY.

TYPES OF CONCURRENCY IN PYTHON 🐍

The basic types of concurrency in Python include:-

  • multi-threading 🧵
  • multiprocessing 🧩(yeah, I know that's a jigsaw piece 😏)
  • asyncio ⏰ (more on this in another tutorial)

MULTI-THREADING 🧵

A Thread is the smallest unit of execution in an operating system. Threads are a way for a program to split itself into two or more simultaneously running tasks. A thread is not itself a program; it only runs within a particular program or process.

[Image: multithreading example]

Multi-threading is game-changing and is mainly used for I/O-bound operations. It is the ability of a central processing unit (CPU) (or a single core in a multi-core processor) to provide multiple threads of execution concurrently, supported by the operating system. Each thread shares the resources supplied by the process it lives in.
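Before we get to a real workload, here is the smallest possible illustration of threads in Python - a minimal sketch using the built-in threading module (the worker function and thread names are made up for demonstration):

import threading
import time


def worker(name):
    # Each thread runs this function independently while
    # sharing the memory of the parent process
    print(f"{name} starting")
    time.sleep(1)  # simulate waiting on I/O
    print(f"{name} done")


threads = [threading.Thread(target=worker, args=(f"thread-{i}",)) for i in range(2)]
for thread in threads:
    thread.start()  # begin running worker concurrently
for thread in threads:
    thread.join()  # wait for both threads to finish

Both threads sleep concurrently, so the script finishes in roughly one second instead of two.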

Let's walk through an example of a multi-threaded operation. First, we take a look at the synchronous process.

Synchronous Process

# A simple Python script that queries a list of sites
import requests
import time


def get_single_site(url):
    with requests.get(url) as response:
        print(f"Read {len(response.content)} from {url}")


def get_all_sites(sites):
    for url in sites:
        get_single_site(url)


if __name__ == "__main__":
    start_time = time.time()
    sites = [
        "https://www.google.com",
        "https://www.facebook.com",
        "https://www.twitter.com/theghostyced"
    ] * 30

    get_all_sites(sites)
    end_time = time.time() - start_time

    print(f"Downloaded {len(sites)} in {end_time} seconds")

This is a simple Python program that downloads the content of the provided websites. After downloading the content, it prints out the number of sites visited and the time it took.
The script makes use of the requests library and the built-in Python standard time module.

The output of running the code is:

...
Read 107786 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107786 from https://www.facebook.com
Read 608077 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107787 from https://www.facebook.com
Read 608077 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107351 from https://www.facebook.com
Read 608311 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107507 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107918 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107149 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Read 11365 from https://www.google.com
Read 107445 from https://www.facebook.com
Read 608077 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107351 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Read 11369 from https://www.google.com
Read 107482 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Downloaded 90 in 17.5553081035614 seconds

Here, it takes the script 17.5 seconds to complete the task. Now let's try it again and see if we can speed it up using a multi-threaded approach.

Multi Threaded Process

# A simple Python script that queries a list of sites
import requests
import time
import concurrent.futures


def get_single_site(url):
    with requests.get(url) as response:
        print(f"Read {len(response.content)} from {url}")


def get_all_sites(sites):
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        executor.map(get_single_site, sites)


if __name__ == "__main__":
    start_time = time.time()  # our script's start time
    sites = [
        "https://www.google.com",
        "https://www.facebook.com",
        "https://www.twitter.com/theghostyced"
    ] * 30

    get_all_sites(sites)
    end_time = time.time() - start_time

    print(f"Downloaded {len(sites)} in {end_time} seconds")

In the code above, we import the concurrent.futures module from the Python standard library. The module provides an Executor base class, from which the ThreadPoolExecutor subclass is derived. Let's break down the ThreadPoolExecutor.

The ThreadPool part simply creates a pool of threads, while the Executor part controls how and when each thread in the pool will run.
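The executor's map() call hides the individual tasks from us. When you need per-task control, the same executor also offers submit(), which returns a Future for each call. A minimal sketch reusing get_single_site from above (the shortened URL list is just for illustration):

import concurrent.futures

import requests


def get_single_site(url):
    with requests.get(url) as response:
        return len(response.content)


sites = [
    "https://www.google.com",
    "https://www.facebook.com",
] * 2

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # submit() schedules a single call and immediately returns a Future
    futures = {executor.submit(get_single_site, url): url for url in sites}
    # as_completed() yields each Future as soon as its thread finishes
    for future in concurrent.futures.as_completed(futures):
        print(f"Read {future.result()} from {futures[future]}")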

The output of the threaded script is shown below:-

...
Read 608312 from https://www.twitter.com/theghostyced
Read 11354 from https://www.google.com
Read 107810 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Read 11343 from https://www.google.com
Read 107823 from https://www.facebook.com
Read 608312 from https://www.twitter.com/theghostyced
Read 11326 from https://www.google.com
Read 107388 from https://www.facebook.com
Read 11350 from https://www.google.com
Read 608312 from https://www.twitter.com/theghostyced
Read 107787 from https://www.facebook.com
Read 608311 from https://www.twitter.com/theghostyced
Read 608077 from https://www.twitter.com/theghostyced
Read 11299 from https://www.google.com
Read 11367 from https://www.google.com
Read 608312 from https://www.twitter.com/theghostyced
Read 107785 from https://www.facebook.com
Read 11321 from https://www.google.com
Read 107800 from https://www.facebook.com
Read 107350 from https://www.facebook.com
Read 608076 from https://www.twitter.com/theghostyced
Read 608312 from https://www.twitter.com/theghostyced
Read 608312 from https://www.twitter.com/theghostyced
Read 608311 from https://www.twitter.com/theghostyced
Downloaded 90 in 6.443061351776123 seconds

Here, it takes the script 6.4 seconds to complete the task, compared to the 17.5 seconds it took when running the code synchronously. You could be saying to yourself - it's only an 11-second difference, I can live with that. But imagine we had a larger amount of data, say 1,000 sites; then you would really see the difference between the two approaches.
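One caveat before moving on: every requests.get() call above opens a fresh connection. A common extra speed-up, not reflected in the timings here, is to give each thread its own requests.Session via threading.local so TCP connections get reused; a sketch of that pattern, assuming the same sites list as above:

import concurrent.futures
import threading

import requests

thread_local = threading.local()


def get_session():
    # Each thread lazily creates and then reuses its own Session,
    # because a single Session is not guaranteed to be thread-safe
    if not hasattr(thread_local, "session"):
        thread_local.session = requests.Session()
    return thread_local.session


def get_single_site(url):
    with get_session().get(url) as response:
        print(f"Read {len(response.content)} from {url}")


def get_all_sites(sites):
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        executor.map(get_single_site, sites)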


MULTIPROCESSING 🧩

A Process is simply an instance of a running program. To put it in layman's terms, when we write a computer program/script in a text file and execute it, it becomes a process that performs all the tasks mentioned in the program/script. Unlike a Thread, a Process does not share memory space with other processes.

Multiprocessing involves the use of two or more cores within a single computer system. By default, running Python 🐍 threads in parallel across those cores is not possible due to the GIL, or Global Interpreter Lock.

The GIL Hindrance

Python was developed by Guido van Rossum in the 1980s, when computers only made use of a single CPU. To simplify memory management, Guido implemented the GIL, which allows only one thread to hold control of the Python interpreter at any moment. This means that making use of more than one CPU core, or separate CPUs, to run threads in parallel is not possible.

The multiprocessing module was introduced in order to bypass this.

NB -- The GIL does not prevent the creation of multiple threads. All the GIL does is make sure only one thread is executing Python code at a time; control still switches between threads. If you're still confused, this article will definitely help you out.
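You can watch the GIL at work by timing a purely CPU-bound function sequentially and then with two threads; a minimal sketch (the countdown workload and the size of N are arbitrary):

import concurrent.futures
import time


def count_down(n):
    # Pure Python number crunching - the GIL serializes this
    while n > 0:
        n -= 1


if __name__ == "__main__":
    N = 10_000_000

    start = time.time()
    count_down(N)
    count_down(N)
    print(f"Sequential: {time.time() - start} seconds")

    start = time.time()
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        executor.map(count_down, [N, N])
    # Roughly the same time (often worse), because only one thread
    # can execute Python bytecode at any instant
    print(f"Two threads: {time.time() - start} seconds")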

Now let's rewrite the synchronous script from above using multiprocessing.


Multiprocessing Method

# A simple Python script that queries a list of sites
import requests
import time
import multiprocessing


def get_single_site(url):
    with requests.get(url) as response:
        print(f"Read {len(response.content)} from {url}")


def get_all_sites(sites):
    with multiprocessing.Pool(5) as pool:
        pool.map(get_single_site, sites)


if __name__ == "__main__":
    start_time = time.time()  # our script's start time
    sites = [
        "https://www.google.com",
        "https://www.facebook.com",
        "https://www.twitter.com/theghostyced"
    ] * 30

    get_all_sites(sites)
    end_time = time.time() - start_time

    print(f"Downloaded {len(sites)} in {end_time} seconds")

Here we import the multiprocessing package from Python's standard library. The multiprocessing module provides, among other things, the Process and Pool classes.

Here we are making use of the Pool class. Pool takes the number of worker processes it needs to spawn as its first argument, which is what is happening on the with multiprocessing.Pool(5) as pool: line. On pool.map(get_single_site, sites) we use the map method provided by Pool. It takes the function to be called as its first argument and an iterable, our sites list, as its second. It then chops the iterable into a number of chunks which it submits to the process pool as separate tasks.
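Two details of map() worth knowing: it returns the results in input order, and it accepts an optional chunksize argument that controls how the iterable is split. A small sketch (the square function is made up for illustration):

import multiprocessing


def square(n):
    return n * n


if __name__ == "__main__":
    with multiprocessing.Pool(5) as pool:
        # chunksize=4 hands the numbers to the workers in batches of 4
        results = pool.map(square, range(20), chunksize=4)
    print(results)  # results come back in the same order as the input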

The output of the multiprocessing script is:-

...
Read 608423 from https://www.twitter.com/theghostyced
Read 108078 from https://www.facebook.com
Read 11386 from https://www.google.com
Read 11387 from https://www.google.com
Read 11304 from https://www.google.com
Read 11353 from https://www.google.com
Read 108021 from https://www.facebook.com
Read 107985 from https://www.facebook.com
Read 108022 from https://www.facebook.com
Read 608423 from https://www.twitter.com/theghostyced
Read 108079 from https://www.facebook.com
Read 608423 from https://www.twitter.com/theghostyced
Read 11340 from https://www.google.com
Read 608423 from https://www.twitter.com/theghostyced
Read 11321 from https://www.google.com
Read 608423 from https://www.twitter.com/theghostyced
Read 107985 from https://www.facebook.com
Read 608423 from https://www.twitter.com/theghostyced
Read 11384 from https://www.google.com
Read 107549 from https://www.facebook.com
Read 608423 from https://www.twitter.com/theghostyced
Read 11294 from https://www.google.com
Read 608423 from https://www.twitter.com/theghostyced
Read 107985 from https://www.facebook.com
Read 11360 from https://www.google.com
Read 609124 from https://www.twitter.com/theghostyced
Downloaded 90 in 6.056399154663086 seconds

Here, it takes the script 6 seconds to complete the task, just slightly faster than the threading solution. This makes sense because the operation being performed is an I/O-bound one; multiprocessing really shines on CPU-bound operations, like crunching huge amounts of numbers.
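To see multiprocessing truly pull ahead, swap the downloads for number crunching; a minimal sketch where sum-of-squares stands in for real CPU-bound work (the numbers are arbitrary):

import multiprocessing
import time


def sum_of_squares(n):
    # CPU-bound: pure Python arithmetic with no I/O to wait on
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    numbers = [5_000_000] * 8

    start = time.time()
    for n in numbers:
        sum_of_squares(n)
    print(f"Sequential: {time.time() - start} seconds")

    start = time.time()
    with multiprocessing.Pool() as pool:  # defaults to one worker per core
        pool.map(sum_of_squares, numbers)
    print(f"Pool: {time.time() - start} seconds")

On a multi-core machine the Pool version should finish several times faster, while a threaded equivalent would not, because of the GIL.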

CONCLUSION

I know right now you are itching to try it out for yourself, so before you go, a few notes on when to use concurrency.

You need to figure out if your program is CPU-bound or I/O-bound first. Remember that I/O-bound programs are those that spend most of their time waiting for something to happen (making external calls or requests), while CPU-bound programs spend their time processing data or crunching numbers as fast as they can.

So for I/O-bound operations, multithreading is the best way to go, while for CPU-bound operations, multiprocessing is the right way to go.


👋 👋 👋 👋
I'm CED and I hope you enjoyed the tutorial on how to speed up the runtime of your Python 🐍 scripts.
