Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Training and testing accuracy not increasing for a CNN followed by an RNN for signature verification

I'm currently working on online signature verification. Each sample in the dataset has a variable shape of (x, 7), where x is the number of points the person used to sign their signature. I have the following model:

    from keras.models import Sequential
    from keras.layers import Conv1D, MaxPooling1D, Masking, LSTM, Dense
    from keras.optimizers import Adam

    model = Sequential()
    # CNN: input is a variable-length sequence with 7 features per point
    model.add(Conv1D(filters=64, kernel_size=3, activation='sigmoid', input_shape=(None, 7)))
    model.add(MaxPooling1D(pool_size=3))
    model.add(Conv1D(filters=64, kernel_size=2, activation='sigmoid'))

    # RNN
    model.add(Masking(mask_value=0.0))
    model.add(LSTM(8))
    model.add(Dense(2, activation='softmax'))

    opt = Adam(lr=0.0001)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    model.summary()

    print(model.fit(x_train, y_train, epochs=100, verbose=2, batch_size=50))

    score, accuracy = model.evaluate(x_test, y_test, verbose=2)
    print(score, accuracy)

I know it may not be the best model, but this is the first time I'm building a neural network. I have to use a CNN and an RNN, as it is required for my honours project. At the moment, the highest training accuracy I achieve is 0.5142, with a testing accuracy of 0.54. I have tried increasing the number of epochs, changing the activation function, adding more layers, moving the layers around, changing the learning rate and changing the optimizer.

Please share some advice on changing my model or dataset. Any help is much appreciated.

1 Answer

For CNN-RNN, some promising things to try:

  • Conv1D layers: activation='relu', kernel_initializer='he_normal'
  • LSTM layer: activation='tanh', and recurrent_dropout of 0.1, 0.2, or 0.3
  • Optimizer: Nadam, lr=2e-4 (Nadam can significantly outperform other optimizers for RNNs)
  • batch_size: lower it. Unless you have 200+ batches in total, set batch_size=32; a lower batch size better exploits the stochastic mechanism of the optimizer and can improve generalization
  • Dropout: right after the second Conv1D, with a rate of 0.1 to 0.2; or after the first Conv1D, with a rate of 0.25 to 0.3, but only if you use SqueezeExcite (see below), otherwise MaxPooling won't work as well
  • SqueezeExcite: reported to improve CNN performance across a large variety of tasks; a Keras implementation you can use is given below
  • BatchNormalization: while your model isn't large, it's still deep, and may benefit from one BN layer right after the second Conv1D
  • L2 weight decay: on the first Conv1D, to prevent it from memorizing the input; try 1e-5 or 1e-4, e.g. kernel_regularizer=l2(1e-4)  # from keras.regularizers import l2
  • Preprocessing: make sure all data is normalized (or standardized if time-series), and that batches are shuffled each epoch
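
The preprocessing item can be sketched roughly as follows. This is a minimal NumPy illustration, not the asker's pipeline: the helper names `fit_stats`, `standardize` and `pad_batch` and the random toy data are assumptions, each signature is taken to be an array of shape (x, 7) as in the question, and the statistics are computed on the training set only so that they can be reused unchanged on the test set.

```python
import numpy as np

def fit_stats(train_seqs):
    """Per-feature mean and std over every point of every training sequence."""
    stacked = np.concatenate(train_seqs, axis=0)   # shape (total_points, 7)
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0)
    std[std == 0] = 1.0                            # avoid division by zero
    return mean, std

def standardize(seqs, mean, std):
    """Shift and scale each feature using the training statistics."""
    return [(s - mean) / std for s in seqs]

def pad_batch(seqs, pad_value=0.0):
    """Right-pad sequences with pad_value to the length of the longest one."""
    max_len = max(len(s) for s in seqs)
    out = np.full((len(seqs), max_len, seqs[0].shape[1]), pad_value, dtype=np.float32)
    for i, s in enumerate(seqs):
        out[i, :len(s)] = s
    return out

# Toy data (assumption): three signatures with 40, 55 and 63 points
rng = np.random.default_rng(0)
train = [rng.normal(5.0, 2.0, size=(n, 7)) for n in (40, 55, 63)]

mean, std = fit_stats(train)
x_train = pad_batch(standardize(train, mean, std))
print(x_train.shape)

# Reshuffle the sample order at the start of each epoch
order = rng.permutation(len(x_train))
x_train = x_train[order]
```

With Keras, `model.fit` already shuffles the training samples each epoch by default (`shuffle=True`), so the explicit permutation is only needed in a hand-written training loop.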
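
Keras implementation of SqueezeExcite:

    from keras.layers import Conv1D, Dense, GlobalAveragePooling1D, Reshape, multiply

    def SqueezeExcite(_input):
        # Number of channels; read from .shape because the private
        # _keras_shape attribute is not available in tf.keras
        filters = int(_input.shape[-1])

        # Squeeze: average over time, giving one value per channel
        se = GlobalAveragePooling1D()(_input)
        se = Reshape((1, filters))(se)
        # Excite: small bottleneck producing a per-channel gate in (0, 1)
        se = Dense(filters // 16, activation='relu',
                   kernel_initializer='he_normal', use_bias=False)(se)
        se = Dense(filters, activation='sigmoid',
                   kernel_initializer='he_normal', use_bias=False)(se)

        # Rescale each channel of the input by its gate
        return multiply([_input, se])

    # Example usage
    x = Conv1D(filters=64, kernel_size=4, activation='relu', kernel_initializer='he_normal')(x)
    x = SqueezeExcite(x)  # place after EACH Conv1D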
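
Putting several of the suggestions together, the model might look like the sketch below. It is an illustration under assumptions rather than a tuned or tested solution for this dataset: it uses the functional API from `tensorflow.keras`, picks one value from each suggested range, places BatchNormalization before Dropout after the second Conv1D (the order is a choice made here), leaves out the Masking layer for brevity, and re-declares the block as a hypothetical `squeeze_excite` helper so that the example runs on its own.

```python
import numpy as np
from tensorflow.keras import layers, models, optimizers, regularizers

def squeeze_excite(x, ratio=16):
    """Channel-wise gating, mirroring the SqueezeExcite function above."""
    filters = int(x.shape[-1])
    se = layers.GlobalAveragePooling1D()(x)
    se = layers.Reshape((1, filters))(se)
    se = layers.Dense(filters // ratio, activation='relu',
                      kernel_initializer='he_normal', use_bias=False)(se)
    se = layers.Dense(filters, activation='sigmoid',
                      kernel_initializer='he_normal', use_bias=False)(se)
    return layers.multiply([x, se])

def build_model(n_features=7):
    inp = layers.Input(shape=(None, n_features))   # variable-length sequences
    # First Conv1D: ReLU, He initialization, L2 weight decay
    x = layers.Conv1D(64, 3, activation='relu', kernel_initializer='he_normal',
                      kernel_regularizer=regularizers.l2(1e-4))(inp)
    x = squeeze_excite(x)
    x = layers.MaxPooling1D(pool_size=3)(x)
    # Second Conv1D, followed by SqueezeExcite, BatchNormalization and Dropout
    x = layers.Conv1D(64, 2, activation='relu', kernel_initializer='he_normal')(x)
    x = squeeze_excite(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.2)(x)
    # Recurrent part and two-class softmax output, as in the question
    x = layers.LSTM(8, activation='tanh', recurrent_dropout=0.2)(x)
    out = layers.Dense(2, activation='softmax')(x)

    model = models.Model(inp, out)
    model.compile(optimizer=optimizers.Nadam(learning_rate=2e-4),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    return model

model = build_model()
# Random stand-in batch (assumption): 4 signatures of 50 points each
probs = model.predict(np.random.rand(4, 50, 7).astype('float32'), verbose=0)
print(probs.shape)
```

Training would then follow the question's call with the smaller batch size, e.g. `model.fit(x_train, y_train, epochs=100, batch_size=32, verbose=2)`, with `y_train` one-hot encoded to match `categorical_crossentropy`.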
