In this talk we will discuss the 'multiprocessing' module, which enables parallel programming on multicore processors. We will compare it with the 'threading' module, then cover its core functionality, and finally discuss applications and limitations.
When trying to improve the performance of a Python application on a multicore machine through parallelization, one typically employs either the threading or the multiprocessing module. The former spawns threads, while the latter creates subprocesses. Although threads are useful for, e.g., I/O-bound problems, they do not allow true parallelism for Python code because of CPython's Global Interpreter Lock (GIL), which lets only one thread execute Python bytecode at a time. In contrast, the multiprocessing module spawns subprocesses, each with its own interpreter instance, thereby sidestepping the GIL and allowing more than one core to be used at the same time.
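As a rough illustration of the difference between the two modules, the sketch below runs the same function once in a thread and once in a subprocess. It is not taken from the talk itself; the names `counter`, `increment`, `run_in_thread` and `run_in_process` are made up for this example. It does not measure speed. It only shows that a thread shares the caller's memory, while a subprocess works on its own copy of the program state.

```python
import threading
from multiprocessing import Process

# Module-level state, used only to make memory sharing visible.
counter = 0


def increment():
    """Add one to the module-level counter of whichever process runs this."""
    global counter
    counter += 1


def run_in_thread():
    """Run increment() in a thread and return the caller's counter afterwards."""
    worker = threading.Thread(target=increment)
    worker.start()
    worker.join()
    return counter


def run_in_process():
    """Run increment() in a subprocess and return the caller's counter afterwards."""
    worker = Process(target=increment)
    worker.start()
    worker.join()
    return counter


if __name__ == "__main__":
    # The thread shares memory with the caller, so its update is visible here.
    print("after thread: ", run_in_thread())
    # The subprocess updates only its own copy; the caller's value is unchanged.
    print("after process:", run_in_process())
```

The `if __name__ == "__main__":` guard matters on platforms where child processes are started by re-importing the main module: without it, the demo code would run again in every child.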
In many cases, the multiprocessing package makes it possible to gain a significant speed-up by parallelizing computationally intensive code sections. We will cover most of the functionality of this module, such as parallel mapping, asynchronous function evaluation, and interprocess communication using pipes and queues. In particular, a parallel map can often be used as a drop-in replacement for a sequential map. We will go through some examples and discuss typical applications and pitfalls.
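To give a flavour of the features listed above, here is a minimal sketch using `Pool.map`, `Pool.apply_async` and `Queue` from the standard library. The function names (`square`, `parallel_squares`, `async_square`, `squares_via_queue`) and the choice of two worker processes are assumptions made for illustration, and `square` merely stands in for a computationally intensive function. Pipes are not shown.

```python
from multiprocessing import Pool, Process, Queue


def square(x):
    """Stand-in for a computationally intensive function."""
    return x * x


def sequential_squares(numbers):
    """Ordinary sequential mapping, for comparison."""
    return list(map(square, numbers))


def parallel_squares(numbers, processes=2):
    """Same result as sequential_squares, computed by a pool of workers."""
    with Pool(processes=processes) as pool:
        # Pool.map blocks until all results are ready and keeps input order.
        return pool.map(square, numbers)


def async_square(x):
    """Submit one call without blocking, then collect its result."""
    with Pool(processes=1) as pool:
        pending = pool.apply_async(square, (x,))
        # The caller could do other work here before asking for the result.
        return pending.get(timeout=30)


def producer(queue, numbers):
    """Run in a subprocess: send results back through the queue."""
    for n in numbers:
        queue.put(square(n))
    queue.put(None)  # sentinel marking the end of the stream


def squares_via_queue(numbers):
    """Collect results from a single subprocess through a Queue."""
    queue = Queue()
    worker = Process(target=producer, args=(queue, list(numbers)))
    worker.start()
    # Drain the queue until the sentinel arrives, then join the worker.
    results = list(iter(queue.get, None))
    worker.join()
    return results


if __name__ == "__main__":
    data = range(10)
    print(parallel_squares(data) == sequential_squares(data))
    print(async_square(7))
    print(squares_via_queue(data))
```

One pitfall is already visible here: the function handed to the pool has to be defined at module level so that it can be pickled and found by the worker processes. For a function as cheap as `square`, the cost of starting processes and transferring data will usually outweigh any gain; a speed-up is to be expected only when each call does substantial work.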