I'm wondering if there's an easy/standard way to "resize" a trained TensorFlow model.
Note that I'm not asking about resizing inputs, or resizing a tensor as a step within the TensorFlow computational graph; I'm asking about changing the TensorFlow graph itself while reusing the saved parameters from another pretrained model.
That is, if I have (for example) an existing, trained neural network with 3 hidden layers, is there a way to re-use the weights to initialize a neural network with an extra layer inserted (so 4 hidden layers instead)? Or to resize an intermediate step, so that a hidden layer that outputs 32 intermediate features is expanded to output 64?
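For concreteness, here's a rough sketch of what I imagine the extra-layer case could look like with Keras Dense layers (the width, shapes, and the identity-initialization idea are my own assumptions, not anything I've seen documented): the inserted layer starts out as an identity mapping, so the pretrained network's behavior is initially unchanged.

```python
import numpy as np
import tensorflow as tf

# Hypothetical inserted layer: same width in and out (32 here), initialized
# as the identity so the overall function is unchanged at first.
inserted = tf.keras.layers.Dense(32, activation="relu")
inserted.build((None, 32))                  # create the layer's variables
inserted.set_weights([
    np.eye(32, dtype=np.float32),           # kernel = identity matrix
    np.zeros(32, dtype=np.float32),         # bias = 0
])
# Caveat: with ReLU this is only an exact identity on nonnegative inputs,
# which holds if the preceding layer is also ReLU-activated.
```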
I realize that the simplest way would be to simply throw out the existing weights and retrain the new architecture from scratch. But if I've already put in significant compute to train on a 32-wide model, it seems wasteful to throw all that out to examine a 64-wide one.
I also realize that such a process could not be completely automated (there would be choices to make about how to do the transition), and that certain transformations would be difficult or impossible. But as far as I can tell, other transformations (e.g. increasing the output size of a hidden layer) should be relatively straightforward to do by "zeroing out" the weights associated with the new degrees of freedom.
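To illustrate what I mean by "zeroing out", here is a minimal sketch of widening a 32-unit hidden layer to 64 units (the layer names, shapes, and the small random initialization for the new incoming weights are all just assumptions for illustration): the new units' outgoing weights are zeroed, so the widened network initially computes the same function as the trained one.

```python
import numpy as np
import tensorflow as tf

# Hypothetical 32-wide trained model (stand-in for the real pretrained one).
old = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(16,), name="hidden"),
    tf.keras.layers.Dense(10, name="out"),
])
# ... assume old has been trained and its weights are what we want to reuse.

# New 64-wide architecture.
new = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(16,), name="hidden"),
    tf.keras.layers.Dense(10, name="out"),
])

w, b = old.get_layer("hidden").get_weights()          # (16, 32), (32,)
w_next, b_next = old.get_layer("out").get_weights()   # (32, 10), (10,)

# Incoming weights for the 32 new units: small random values rather than
# exact zeros, to break symmetry so the new units can actually learn.
extra_in = np.random.normal(scale=0.01, size=(w.shape[0], 32)).astype(w.dtype)
w_wide = np.concatenate([w, extra_in], axis=1)                    # (16, 64)
b_wide = np.concatenate([b, np.zeros(32, dtype=b.dtype)])         # (64,)

# Zero the *outgoing* rows for the new units, so the network's output
# is unchanged at initialization.
w_next_wide = np.concatenate(
    [w_next, np.zeros((32, w_next.shape[1]), dtype=w_next.dtype)],
    axis=0)                                                       # (64, 10)

new.get_layer("hidden").set_weights([w_wide, b_wide])
new.get_layer("out").set_weights([w_next_wide, b_next])
```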
Is there a generally accepted way of doing such architecture rearrangements?
question from:
https://stackoverflow.com/questions/65944601/loading-parameters-for-a-trained-tensorflow-model-to-a-model-with-slightly-diffe