Friday, February 17, 2017

Sharing a Python generator across multiple multiprocessing processes.

Sometimes I need to do some work on a seemingly endless set of data, and I often use generators to produce that endless data.  Say, for example, I were trying to brute-force crack a zip file's password.  I'd create a generator that methodically and continually produces new candidate passwords for the cracking code to use in attempting to unzip the file.

Once I have my generator in place, I have a problem.  What if I want to spread this brute-force attack across all the cores in my system to speed up this slow, endless effort?  One would think that multiprocessing would solve the problem.

Well, no.  Generators aren't natively shared across processes in Python; they're copied to each process.  This means that if you try to use a generator with multiprocessing, each process gets its own copy of the generator, and each copy yields the same sequence of values.
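You can see the duplication with a toy generator.  This is a minimal sketch of the problem, not code from the original post; the `counter` generator and the process count are my own, and it assumes the fork start method (the Linux default), where children inherit the parent's generator state:

```python
import multiprocessing as mp

def counter():
    # Stand-in for the endless password generator.
    n = 0
    while True:
        yield n
        n += 1

g = counter()

def take_three(q):
    # Each child process advances its *own copy* of g.
    q.put([next(g) for _ in range(3)])

def run_demo():
    q = mp.Queue()
    procs = [mp.Process(target=take_three, args=(q,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return q.get(), q.get()

if __name__ == "__main__":
    a, b = run_demo()
    print(a, b)  # [0, 1, 2] [0, 1, 2] -- duplicate work, not one shared stream
```

Both workers report `[0, 1, 2]`: they each advanced a private copy of the generator instead of splitting the stream between them.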

To solve that, I devised a quick and dirty solution:  place the generator in its own process, and have each worker process request the next value from it via inter-process communication using the multiprocessing Pipe class.

To that end, here's an example.  I post this here mainly to jog my memory the next time I need to do this, but if you find it useful, great!
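A minimal sketch of that approach follows.  The `passwords` generator, the `"next"` request protocol, and the helper names are my own illustration, not necessarily what the original example used; the key idea from the post is that one server process owns the generator and workers ask it for values over a Pipe:

```python
import multiprocessing as mp

def passwords():
    # Toy stand-in for the endless password generator.
    n = 0
    while True:
        yield f"guess-{n}"
        n += 1

def generator_server(conn):
    # Owns the one true generator; hands out one value per request.
    gen = passwords()
    try:
        while conn.recv() == "next":
            conn.send(next(gen))
    except EOFError:
        pass  # all workers have gone away

def worker(conn, lock, results, count):
    for _ in range(count):
        with lock:  # serialize request/response pairs on the shared pipe
            conn.send("next")
            results.append(conn.recv())

def crack(num_workers=3, per_worker=5):
    server_end, worker_end = mp.Pipe()
    lock = mp.Lock()
    results = mp.Manager().list()

    server = mp.Process(target=generator_server, args=(server_end,), daemon=True)
    server.start()

    workers = [mp.Process(target=worker,
                          args=(worker_end, lock, results, per_worker))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return list(results)

if __name__ == "__main__":
    vals = crack()
    print(len(vals), len(set(vals)))  # 15 15 -- no duplicates across workers
```

Because only the server process ever touches the generator, every value is produced exactly once; the lock just keeps one worker's request/response pair from interleaving with another's on the shared pipe.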


  1. Functionally sounds similar to
    Any advantages to the above?
    I'm wary of projects with one committer and no stars, but it does seem to encapsulate much of this.

  2. # For this sample, I wanted just one of the consumers to break out of
    # its loop before consuming the whole generator output.

    Why did you want to do this? Is this something I have to do?
