Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Keras - Input a 3 channel image into LSTM

I have read a sequence of images into a NumPy array with shape (7338, 225, 1024, 3), where 7338 is the number of samples, 225 is the number of time steps, and 1024 (32x32) is the number of flattened pixels per image, each with 3 channels (RGB).

I have a sequential model with an LSTM layer:

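from keras.models import Sequential
from keras.layers import LSTM
model = Sequential()
model.add(LSTM(128, input_shape=(225, 1024, 3)))  # each time step is 2D here, which triggers the error below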

But this results in the error:

Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4

The documentation says that the input to an LSTM layer should be a 3D tensor with shape (batch_size, timesteps, input_dim), but in my case each time step is itself 2D (pixels x channels).

What is the suggested way to input a 3 channel image into an LSTM layer in Keras?



1 Answer


If you want to treat the images as a sequence (like the frames of a movie), you need to merge pixels AND channels into a single feature axis, giving 1024 x 3 = 3072 features per time step:

input_shape = (225, 3072)  # a 3D input once the batch axis (7338) is counted; Keras leaves the batch size out of input_shape
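As a minimal sketch of that reshape, assuming the data is a NumPy array laid out as (samples, timesteps, pixels, channels): the helper names `flatten_channels` and `build_lstm_model` are made up for illustration, and the model part assumes the Keras that ships with TensorFlow 2.x.

```python
import numpy as np


def flatten_channels(frames):
    """Merge the pixel and channel axes: (N, T, P, C) -> (N, T, P * C)."""
    n, t, p, c = frames.shape
    return frames.reshape(n, t, p * c)


def build_lstm_model(timesteps=225, features=3072):
    """Hypothetical helper: an LSTM over flattened frames.

    Keras is imported lazily so the reshape can be tried without TensorFlow.
    """
    from tensorflow import keras

    return keras.Sequential([
        keras.Input(shape=(timesteps, features)),  # batch size is left out
        keras.layers.LSTM(128),
    ])


# Small stand-in for the real (7338, 225, 1024, 3) array.
demo = np.zeros((2, 225, 1024, 3), dtype=np.float32)
flat = flatten_channels(demo)
print(flat.shape)  # (2, 225, 3072)
```

The reshape only regroups the last two axes, so the three channel values of each pixel stay next to each other in the 3072-long feature vector.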

If you want more processing before feeding 3072 raw features into an LSTM, you can combine or interleave 2D convolutions and LSTMs for a more refined model (not necessarily a better one, though; each application behaves differently).

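You can also try ConvLSTM2D, which takes a five-dimensional input (batch, time, rows, columns, channels), so each frame has to be reshaped back to 32x32:

input_shape = (225, 32, 32, 3)  # a 5D input once the batch axis (7338) is counted; Keras leaves the batch size out of input_shape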
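A rough sketch of the ConvLSTM2D route, assuming the 1024 pixels were flattened in row-major order from 32x32 images. `unflatten_frames` and `build_convlstm_model` are hypothetical names, the filter count and kernel size are arbitrary picks, and the model part assumes the Keras bundled with TensorFlow 2.x.

```python
import numpy as np


def unflatten_frames(frames, height=32, width=32):
    """(N, T, H*W, C) -> (N, T, H, W, C), assuming row-major pixel order."""
    n, t, p, c = frames.shape
    if p != height * width:
        raise ValueError(f"expected {height * width} pixels per frame, got {p}")
    return frames.reshape(n, t, height, width, c)


def build_convlstm_model(timesteps=225, height=32, width=32, channels=3):
    """Hypothetical helper; Keras is imported lazily on purpose."""
    from tensorflow import keras

    return keras.Sequential([
        keras.Input(shape=(timesteps, height, width, channels)),
        # Arbitrary illustrative sizes: 16 filters, 3x3 kernels.
        keras.layers.ConvLSTM2D(filters=16, kernel_size=(3, 3)),
        # The last ConvLSTM2D state is a 2D feature map; pool it to a vector.
        keras.layers.GlobalAveragePooling2D(),
    ])


# Small stand-in for the real (7338, 225, 1024, 3) array.
demo = np.zeros((2, 225, 1024, 3), dtype=np.float32)
frames = unflatten_frames(demo)
print(frames.shape)  # (2, 225, 32, 32, 3)
```

If the pixels were flattened in a different order, the reshape would still succeed but the frames would be scrambled, so it is worth plotting one frame before training.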

I'd probably build a convolutional net from several TimeDistributed(Conv2D(...)) and TimeDistributed(MaxPooling2D(...)) layers, then add a TimeDistributed(Flatten()) and finally the LSTM(). The convolutions extract spatial features from each frame before the LSTM sees them, which will very probably improve both the model's image understanding and the performance of the LSTM.
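One possible shape for such a stack, as a sketch only: it expects the frames reshaped to (225, 32, 32, 3), the layer sizes are arbitrary, both helper names are invented for illustration, and the Keras bundled with TensorFlow 2.x is assumed. The small pure-Python helper estimates how many features per time step reach the LSTM when every Conv2D uses "valid" padding and every pooling layer uses a stride equal to its pool size.

```python
def features_after_cnn(size=32, kernel=3, pool=2, blocks=2, filters=32):
    """Per-frame feature count after `blocks` Conv2D + MaxPooling2D pairs."""
    for _ in range(blocks):
        size = size - kernel + 1  # Conv2D with padding="valid"
        size = size // pool       # MaxPooling2D with stride == pool size
    return size * size * filters


def build_cnn_lstm_model(timesteps=225, height=32, width=32, channels=3):
    """Hypothetical helper; Keras is imported lazily on purpose."""
    from tensorflow import keras

    layers = keras.layers
    return keras.Sequential([
        keras.Input(shape=(timesteps, height, width, channels)),
        # TimeDistributed applies the wrapped layer to every frame separately.
        layers.TimeDistributed(layers.Conv2D(16, 3, activation="relu")),
        layers.TimeDistributed(layers.MaxPooling2D(2)),
        layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu")),
        layers.TimeDistributed(layers.MaxPooling2D(2)),
        layers.TimeDistributed(layers.Flatten()),
        layers.LSTM(128),
    ])


# 32 -> 30 -> 15 -> 13 -> 6 pixels per side, times 32 filters.
print(features_after_cnn())  # 1152
```

With these illustrative sizes the LSTM receives 1152 features per time step instead of 3072 raw values.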

