I have a non-trivial input pipeline that from_generator is perfect for:
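import tensorflow as tf

# complex_img_label_generator yields (image, label) pairs
dataset = tf.data.Dataset.from_generator(complex_img_label_generator,
                                         (tf.int32, tf.string))
dataset = dataset.batch(64)
# named `iterator` to avoid shadowing the built-in iter()
iterator = dataset.make_one_shot_iterator()
imgs, labels = iterator.get_next()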
Here complex_img_label_generator dynamically generates images and yields a numpy array representing an (H, W, 3) image together with a simple string label. The processing is not something I can represent as reading from files and tf.image operations.
My question is about how to parallelize the generator: how do I have N of these generators running in their own threads?
One thought was to use dataset.map with num_parallel_calls to handle the threading, but map operates on tensors. Another thought was to create multiple generators, each with its own prefetch, and somehow join them, but I can't see how I'd join N generator streams.
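To make the "join N generator streams" idea concrete, here is a rough sketch in plain Python, independent of tf.data: each generator runs in its own thread and pushes items onto a shared bounded queue, which is then exposed as a single generator. The names merge_generators and fake_generator are hypothetical, and fake_generator only stands in for complex_img_label_generator. Note the assumptions: ordering across generators is not preserved, and because of the GIL this only helps if the real generator spends most of its time in numpy or I/O rather than in pure-Python code.

```python
import queue
import threading

_DONE = object()  # sentinel: one worker thread has exhausted its generator


def merge_generators(generator_fns, max_buffered=128):
    """Run each generator in its own thread; yield items as they arrive.

    generator_fns is a list of zero-argument callables, each returning
    a fresh generator. Ordering across generators is not preserved.
    """
    out = queue.Queue(maxsize=max_buffered)  # bounded, so producers block when full

    def worker(fn):
        try:
            for item in fn():
                out.put(item)
        finally:
            # Always signal completion, even if the generator raised.
            out.put(_DONE)

    threads = [
        threading.Thread(target=worker, args=(fn,), daemon=True)
        for fn in generator_fns
    ]
    for t in threads:
        t.start()

    remaining = len(threads)
    while remaining:
        item = out.get()
        if item is _DONE:
            remaining -= 1
        else:
            yield item


def fake_generator(worker_id, n=3):
    """Hypothetical stand-in for complex_img_label_generator."""
    def gen():
        for i in range(n):
            yield (worker_id, i)
    return gen


# Four generators, three items each, merged into one stream.
merged = list(merge_generators([fake_generator(w) for w in range(4)]))
```

Since from_generator takes a callable that returns an iterator, something like lambda: merge_generators([...]) could presumably be passed in place of the single generator, though I have not verified how that behaves inside a real pipeline.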
Any canonical examples I could follow?