I am a beginner with TensorFlow. I'm currently working on a system with 2 GPUs, each with 12 GB of memory, and I want to implement model parallelism across the two GPUs to train large models. I have searched all over the internet — SO, the TensorFlow documentation, etc. — and I was able to find explanations of model parallelism and its results, but nowhere did I find a small tutorial or small code snippets on how to implement it in TensorFlow. I mean, we have to exchange activations after every layer, right? So how do we do that? Is there a specific or cleaner way of implementing model parallelism in TensorFlow? It would be very helpful if you could point me to a place where I can learn to implement it, or a simple example like MNIST training on multiple GPUs using model parallelism.
Note: I have done data parallelism as in the CIFAR-10 multi-GPU tutorial, but I haven't found any implementation of model parallelism.
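For context, here is a minimal sketch of what I understand the idea to be: pin different layers to different devices with `tf.device`, and let TensorFlow copy the intermediate activations between GPUs. The layer sizes and variable names below are just placeholders I made up, and soft device placement is enabled so the snippet also runs on a machine without two GPUs — please correct me if this is not the intended pattern.

```python
import tensorflow as tf

# Fall back to CPU if the requested GPUs are absent, so the sketch runs anywhere.
tf.config.set_soft_device_placement(True)

x = tf.random.normal([32, 784])  # dummy MNIST-sized batch (batch of 32)

with tf.device('/GPU:0'):
    # First half of the (hypothetical) model lives on GPU 0.
    w1 = tf.Variable(tf.random.normal([784, 256]))
    h = tf.nn.relu(tf.matmul(x, w1))  # activation produced on GPU 0

with tf.device('/GPU:1'):
    # Second half lives on GPU 1; TensorFlow transfers `h` between
    # devices automatically, so no explicit exchange code is needed.
    w2 = tf.Variable(tf.random.normal([256, 10]))
    logits = tf.matmul(h, w2)

print(logits.shape)  # (32, 10)
```

My question is whether this per-layer `tf.device` placement is really all there is to it, or whether the activation exchange needs to be handled more explicitly for training (gradients flowing back across devices, pipelining, etc.).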