I have a simple model for a project which takes in some weather data and predicts a temperature. The structure is the following:
- 24 inputs with 10 features (one input per hour so 24 hours of input data)
- 1 output being a temperature
The model works and I'm happy with it; however, I need to present how the model works, and I'm unsure how some of the values I see about the model actually describe it. I'm struggling to visually represent the inner structure (i.e. the neurons or nodes).
Here is the model (The static input_shape is not set statically in my code, it is purely to help answer the question):
forward_layer = tf.keras.layers.LSTM(units=32, return_sequences=True)
backward_layer = tf.keras.layers.LSTM(units=32, return_sequences=True, go_backwards=True)
bilstm_model = tf.keras.models.Sequential([
    tf.keras.layers.Bidirectional(forward_layer, backward_layer=backward_layer, input_shape=(24, 10)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.LSTM(32, return_sequences=False),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(units=1)
])
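For the presentation, the per-layer weight counts (what `model.summary()` reports) can also be derived by hand. Here's a minimal sketch, assuming the standard Keras LSTM parameterisation of `4 * units * (input_dim + units + 1)` weights per direction; the layer labels are just mine, not part of the model:

```python
# Sketch: compute each layer's trainable-parameter count by hand, assuming
# the standard Keras LSTM formula: 4 * units * (input_dim + units + 1).
def lstm_params(input_dim, units):
    # 4 gates (input, forget, cell, output), each with an input kernel,
    # a recurrent kernel and a bias: 4 * (input_dim*units + units*units + units)
    return 4 * units * (input_dim + units + 1)

def dense_params(input_dim, units):
    return input_dim * units + units  # weights + biases

layers = [
    ("Bidirectional(LSTM 32)", 2 * lstm_params(10, 32)),  # forward + backward
    ("LSTM 32",                lstm_params(64, 32)),      # input is 2*32 = 64
    ("LSTM 32",                lstm_params(32, 32)),
    ("LSTM 32",                lstm_params(32, 32)),
    ("Dense 1",                dense_params(32, 1)),
]

for name, n in layers:
    print(f"{name:24s} {n:6d} params")
print("total:", sum(n for _, n in layers))  # 40097
```

Note that the Dropout layers contribute no parameters, which already hints that they aren't "neuron layers" in their own right.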
Note that the reason I separated the layers for the Bidirectional wrapper is that I am running this with TFLite and had issues if the two weren't pre-defined, but the inner workings are the same as far as I understand.
Now, as some experts have probably already figured out just by looking at this, the input shape is (32, 24, 10) and the output shape is (32, 1), where the input follows the (batch_size, sequence_length, features) format and the output follows the (batch_size, units) format.
Even though I feel I understand what the numbers represent, I can't quite wrap my head around how the model would "look". I mainly struggle to differentiate the structure during training from the structure during predictions.
Here are the ideas I have (note that I will be describing the model based on neurons):
- The input is a 24x10 DataFrame.
- The input is therefore 240 values, which can be seen as 24 sets of 10 neurons? (Not sure I'm right here)
- `return_sequences` means that the 240 values propagate throughout the model? (Still not sure here)
- The 'neuron' structure would resemble this:
Input(240 neurons) -> BiLSTM(240 neurons) ->
Dropout(240 neurons, 20% drop rate) -> LSTM(240 neurons) ->
Dropout(240 neurons, 20% drop rate) -> LSTM(240 neurons) ->
Dropout(240 neurons, 20% drop rate) -> LSTM(240 neurons) ->
Dropout(? neurons, 20% drop rate) -> Dense(1 neuron) = Output
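To sanity-check my picture above, I wrote a small shape-tracing sketch (pure Python, no TensorFlow needed). It only encodes the Keras shape rules as I understand them: an LSTM with `return_sequences=True` outputs (timesteps, units), with `False` just (units,), and Bidirectional concatenates the two directions; the helper name is mine:

```python
# Trace each layer's per-sample output shape, assuming the usual Keras rules:
# - LSTM(units, return_sequences=True):  (timesteps, features) -> (timesteps, units)
# - LSTM(units, return_sequences=False): (timesteps, features) -> (units,)
# - Bidirectional concatenates forward and backward, doubling the features
# - Dropout never changes the shape
def lstm_shape(in_shape, units, return_sequences):
    timesteps, _ = in_shape
    return (timesteps, units) if return_sequences else (units,)

shape = (24, 10)                                  # one sample: 24 hours x 10 features
shape = lstm_shape(shape, 2 * 32, True)           # BiLSTM -> (24, 64)
print("BiLSTM:", shape)
shape = lstm_shape(shape, 32, True)               # LSTM   -> (24, 32)
print("LSTM  :", shape)
shape = lstm_shape(shape, 32, True)               # LSTM   -> (24, 32)
print("LSTM  :", shape)
shape = lstm_shape(shape, 32, False)              # LSTM   -> (32,)
print("LSTM  :", shape)
print("Dense :", (1,))                            # Dense  -> (1,)
```

If this reading is right, each LSTM layer really has 32 units (cells) that are reused across the 24 time steps, rather than 240 independent neurons, and `return_sequences=True` just means the layer emits its 32 values at every one of the 24 steps instead of only at the last one.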
If I'm not mistaken, the `Dropout` layer isn't a layer strictly speaking; in this case it stops 20% of the input neurons (or outputs, I'm not sure) from activating 20% of the output neurons.
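To illustrate what I mean by Dropout behaving differently in training vs. prediction, here is a toy, hand-rolled version of inverted dropout (my own sketch of the mechanism, not Keras code):

```python
import random

def dropout(values, rate, training):
    """Toy inverted dropout: during training each value is zeroed with
    probability `rate`, and the survivors are scaled by 1/(1 - rate) so the
    expected sum stays the same; at prediction time it does nothing."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [0.0 if random.random() < rate else v / keep for v in values]

x = [1.0] * 10
print(dropout(x, 0.2, training=False))  # unchanged at prediction time
print(dropout(x, 0.2, training=True))   # roughly 20% zeros, rest scaled to 1.25
```

So (if my understanding is right) Dropout adds no neurons of its own; it only masks a random subset of the previous layer's outputs during training.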
I'd really appreciate help visualising the structure, so thanks in advance to any brave soul ready to help me out!
EDIT
Here is an example of what I am trying to get out of the numbers. Note that this image ONLY represents the first BiLSTM layer, not the whole model.
The `x ?` labels in the image represent what I'm trying to understand, i.e. how many layers there are and how many neurons each layer contains.