[FIXED] ValueError with Shapes using Bidirectional LSTM

Issue

I am trying to implement a Bidirectional LSTM for a sequence-to-sequence model. My sequences are already one-hot encoded with 12 features in total. The input has 11 timesteps and the output has 23. I first wrote the following unidirectional implementation, which works, with the first LSTM as the encoder and the second as the decoder:

model = Sequential()
model.add(LSTM(75, input_shape=(11, 12)))
model.add(RepeatVector(23))
model.add(LSTM(50, return_sequences=True))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
X, y = generate_data(1, taskset, trainset)
model.fit(X, y, epochs=1, batch_size=32, verbose=1)

I then tried to turn this into a bidirectional LSTM as follows:

model = Sequential()
model.add(Bidirectional(LSTM(75, return_sequences=True), input_shape=(11,12), merge_mode='concat'))
model.add(Bidirectional(LSTM(50, return_sequences=True)))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
X, y = generate_data(1, taskset, trainset)
model.fit(X, y, epochs=1, batch_size=32, verbose=1)

The goal is to use the first bidirectional LSTM as the encoder and the second bidirectional LSTM as the decoder. I removed the RepeatVector in the bidirectional implementation because it gave me a dimension error (needed dim=2, received dim=3). With the current bidirectional LSTM I am getting this error:

ValueError: Shapes (None, 23, 12) and (None, 11, 12) are incompatible

Any help with fixing the bidirectional LSTM implementation?

Solution

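Set return_sequences=False in your first bidirectional LSTM (it is the default, so you can just omit the argument) and add RepeatVector(23) back, as in your original model. With return_sequences=True the encoder emits all 11 timesteps as a 3D tensor. RepeatVector only accepts 2D input, which explains the dimension error, and without it the 11 steps carry through to a (None, 11, 12) output that cannot match the (None, 23, 12) targets. With return_sequences=False the encoder returns a single vector per sample, shape (None, 150) after concatenating both directions, which RepeatVector can repeat 23 times: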

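# Imports assume TensorFlow's bundled Keras; adjust if you use standalone keras
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Bidirectional, Dense, RepeatVector, TimeDistributed

# Random dummy data, only to check that the shapes line up
n_sample = 10
X = np.random.uniform(0, 1, (n_sample, 11, 12))
y = np.random.randint(0, 2, (n_sample, 23, 12))

model = Sequential()
# Encoder: one vector per sample, shape (None, 150)
model.add(Bidirectional(LSTM(75), input_shape=(11, 12), merge_mode='concat'))
# Repeat the encoding once per output step, shape (None, 23, 150)
model.add(RepeatVector(23))
# Decoder: one output per step, shape (None, 23, 100)
model.add(Bidirectional(LSTM(50, return_sequences=True)))
model.add(TimeDistributed(Dense(12, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X, y, epochs=3, batch_size=32, verbose=1)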
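
If it helps to see where the 11 and the 23 come from, here is a rough sketch that traces the per-sample shapes through both models by hand. It uses plain Python only. The helper functions (lstm, repeat_vector, time_distributed_dense) are hypothetical stand-ins I made up for illustration, not Keras APIs, and they track only shapes without the batch dimension.

```python
# Hand-rolled shape tracer. These helpers are hypothetical, not part of Keras.
# Shapes are per sample, so the leading batch dimension (None) is left out.

def lstm(shape, units, return_sequences=False, bidirectional=False):
    """Shape produced by an LSTM applied to input of shape (timesteps, features)."""
    timesteps, _ = shape
    # merge_mode='concat' joins both directions, doubling the width
    width = units * 2 if bidirectional else units
    return (timesteps, width) if return_sequences else (width,)

def repeat_vector(shape, n):
    """RepeatVector needs one vector per sample, i.e. a 1-tuple here."""
    if len(shape) != 1:
        raise ValueError(f"expected one vector per sample, got shape {shape}")
    return (n, shape[0])

def time_distributed_dense(shape, units):
    """Dense applied at every timestep keeps the timestep count."""
    timesteps, _ = shape
    return (timesteps, units)

# Model from the question: both LSTMs return sequences, so the 11 input
# steps pass straight through to the output.
broken = time_distributed_dense(
    lstm(lstm((11, 12), 75, True, True), 50, True, True), 12)

# Fixed model: the encoder collapses the sequence to one vector,
# RepeatVector expands it to 23 steps, and the decoder keeps those 23 steps.
encoded = lstm((11, 12), 75, return_sequences=False, bidirectional=True)
fixed = time_distributed_dense(
    lstm(repeat_vector(encoded, 23), 50, True, True), 12)

print(broken)   # (11, 12)
print(encoded)  # (150,)
print(fixed)    # (23, 12)
```

Calling repeat_vector((11, 150), 23) in this sketch raises a ValueError, which mirrors the dimension error you hit when RepeatVector followed an encoder with return_sequences=True.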

Answered By – Marco Cerliani

Answer Checked By – Willingham (Easybugfix Volunteer)
