I'm using a Pool of workers and want each of them to be initialized with a specific object. More precisely, the initialization cannot be parallelized, so I plan to prepare the objects in the main process before creating the workers and then give each worker one of these objects.
Here is my attempt:
import multiprocessing
import random
import time


class Foo:
    def __init__(self, param):
        # NO WAY TO PARALLELIZE THIS !!
        print(f"Creating Foo with {param}")
        self.param = param

    def __call__(self, x):
        time.sleep(1)
        print("Do the computation", self)
        return self.param + str(x)


def initializer():
    global myfoo
    param = random.choice(["a", "b", "c", "d", "e"])
    myfoo = Foo(param)


def compute(x):
    return myfoo(x)


multiple_results = []
with multiprocessing.Pool(2, initializer, ()) as pool:
    for i in range(1, 10):
        work = pool.apply_async(compute, (i,))
        multiple_results.append(work)
    print([res.get(timeout=2) for res in multiple_results])
Here is a possible output:
Creating Foo with b
Creating Foo with a
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
Do the computation <__main__.Foo object at 0x7f8d70aa7fd0>
['b1', 'a2', 'b3', 'a4', 'b5', 'a6', 'b7', 'a8', 'b9']
What is puzzling me is that the address of the Foo object is always the same, while the actual Foo objects are different, as can be seen from the output: "b1", "a2".
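To see which worker process actually handles each call, the prints could also include the process id (a variation of the Foo class above with os.getpid() added; nothing else changes):

import os
import time


class Foo:
    def __init__(self, param):
        print(f"Creating Foo with {param} in PID {os.getpid()}")
        self.param = param

    def __call__(self, x):
        time.sleep(1)
        # Including the PID makes it clear which worker process ran the call.
        print(f"Do the computation in PID {os.getpid()} with {self}")
        return self.param + str(x)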
My problem is that the two calls to initializer are parallelized, while I do not want to parallelize the construction of Foo.
I want some magical method add_worker to do something like this:

pool = multiprocessing.Pool()
for i in range(0, 2):
    foo = Foo()
    pool.add_worker(initializer, (foo,))
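The closest workaround I can think of (only a sketch, assuming Foo instances are picklable and that a multiprocessing.Queue passed through initargs reaches every worker) is to build the objects serially in the main process, put them on a queue, and have each worker's initializer take exactly one, since the initializer runs once per worker:

import multiprocessing
import random
import time


class Foo:
    def __init__(self, param):
        # Still constructed serially, in the main process only.
        print(f"Creating Foo with {param}")
        self.param = param

    def __call__(self, x):
        time.sleep(1)
        return self.param + str(x)


def initializer(queue):
    # Runs exactly once per worker process: each worker takes one pre-built Foo.
    global myfoo
    myfoo = queue.get()


def compute(x):
    return myfoo(x)


if __name__ == "__main__":
    n_workers = 2
    queue = multiprocessing.Queue()
    # The non-parallelizable construction happens here, before the workers exist.
    for _ in range(n_workers):
        queue.put(Foo(random.choice(["a", "b", "c", "d", "e"])))
    with multiprocessing.Pool(n_workers, initializer, (queue,)) as pool:
        results = [pool.apply_async(compute, (i,)) for i in range(1, 10)]
        print([res.get() for res in results])

The drawback is that which worker ends up with which object is nondeterministic, so it does not give the per-worker control that an add_worker method would.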
Any ideas?
EDIT: I solved my real-life problem by doing the import of Keras's VGGNet inside the process instead of at the top of the file. See this answer
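Roughly the pattern that fixed it (only a sketch of the idea; the exact import path and model setup depend on the Keras/TensorFlow version in use):

def initializer():
    # The heavy import happens here, inside each worker process,
    # instead of at module level in the main process.
    global model
    from keras.applications.vgg16 import VGG16  # import path may differ per Keras/TF version
    model = VGG16(weights="imagenet")


def compute(x):
    return model.predict(x)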
For the sake of curiosity, I remain interested in an answer.