Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
343 views
in Technique[技术] by (71.8m points)

python - Tensorflow 2.0 doesn't compute the gradient

I want to visualize the patterns that a given feature map in a CNN has learned (in this example I'm using vgg16). To do so I create a random image, feed through the network up to the desired convolutional layer, choose the feature map and find the gradients with the respect to the input. The idea is to change the input in such a way that will maximize the activation of the desired feature map. Using tensorflow 2.0 I have a GradientTape that follows the function and then computes the gradient, however the gradient returns None, why is it unable to compute the gradient?

import tensorflow as tf
import matplotlib.pyplot as plt
import time
import numpy as np
from tensorflow.keras.applications import vgg16

class maxFeatureMap():

    def __init__(self, model):

        self.model = model
        self.optimizer = tf.keras.optimizers.Adam()

    def getNumLayers(self, layer_name):

        for layer in self.model.layers:
            if layer.name == layer_name:
                weights = layer.get_weights()
                num = weights[1].shape[0]
        return ("There are {} feature maps in {}".format(num, layer_name))

    def getGradient(self, layer, feature_map):

        pic = vgg16.preprocess_input(np.random.uniform(size=(1,96,96,3))) ## Creates values between 0 and 1
        pic = tf.convert_to_tensor(pic)

        model = tf.keras.Model(inputs=self.model.inputs, 
                               outputs=self.model.layers[layer].output)
        with tf.GradientTape() as tape:
            ## predicts the output of the model and only chooses the feature_map indicated
            predictions = model.predict(pic, steps=1)[0][:,:,feature_map]
            loss = tf.reduce_mean(predictions)
        print(loss)
        gradients = tape.gradient(loss, pic[0])
        print(gradients)
        self.optimizer.apply_gradients(zip(gradients, pic))

model = vgg16.VGG16(weights='imagenet', include_top=False)


x = maxFeatureMap(model)
x.getGradient(1, 24)
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

This is a common pitfall with GradientTape; the tape only traces tensors that are set to be "watched" and by default tapes will watch only trainable variables (meaning tf.Variable objects created with trainable=True). To watch the pic tensor, you should add tape.watch(pic) as the very first line inside the tape context.

Also, I'm not sure if the indexing (pic[0]) will work, so you might want to remove that -- since pic has just one entry in the first dimension it shouldn't matter anyway.

Furthermore, you cannot use model.predict because this returns a numpy array, which basically "destroys" the computation graph chain so gradients won't be backpropagated. You should simply use the model as a callable, i.e. predictions = model(pic).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...