Hi,
I was exploring a simple character RNN and decided to construct it like this. I didn't quite understand the difference between LSTM and LSTMCell; there is a comment in the code stating that CuDNN is not used in the latter case.
But my primary problem is with the shapes. The input is just a sample sequence of characters, and I manage to draw the sample as recommended, e.g. "Hell" as input and "ello" as target.
I am not using tf Dataset as I want to keep this very simple.
def draw_random_sample(text):
    # random_sample and map_fn are helpers defined elsewhere.
    sample = random_sample(text)
    split_sample = tf.strings.bytes_split(sample)
    # Map each character to its integer id (map_fn returns int ids).
    ids = tf.map_fn(map_fn, split_sample, fn_output_signature=tf.int32)
    # Input is all characters but the last; target is shifted by one.
    return ids[:-1], ids[1:]
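To make the "Hell"/"ello" pairing concrete, here is a minimal plain-Python sketch of the input/target split (no TensorFlow, just to show the shapes I expect):

```python
sample = "Hello"
chars = list(sample)  # ['H', 'e', 'l', 'l', 'o']

# Input is every character except the last; the target is the same
# sequence shifted one step to the right.
x = chars[:-1]  # ['H', 'e', 'l', 'l']
y = chars[1:]   # ['e', 'l', 'l', 'o']
print(x, y)
```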
I checked the output and it is indeed correct.
The error is
ValueError: Exception encountered when calling layer “sequential” (type Sequential).
Input 0 of layer "rnn" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 1)
I also have a question about batching: once I fix the error and get this running, I would like to train with batches.
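To show what I mean by batching, here is a rough sketch of stacking a few encoded samples into one batch (the `char_to_id` table and `encode` helper here are just for illustration, not my real code):

```python
import string

# Hypothetical character-to-id table built from string.printable.
char_to_id = {c: i for i, c in enumerate(string.printable)}

def encode(sample):
    ids = [char_to_id[c] for c in sample]
    return ids[:-1], ids[1:]

samples = ["Hello", "world"]
pairs = [encode(s) for s in samples]
xs = [x for x, _ in pairs]  # batch of inputs,  shape (2, 4)
ys = [y for _, y in pairs]  # batch of targets, shape (2, 4)
print(len(xs), len(xs[0]))
```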
Could you take a look?
EMBEDDING_DIM = 100
HIDDEN_DIM = 128
INPUT_DIM = len(string.printable)
OUTPUT_DIM = len(string.printable)
EPOCHS = 1
# Build the RNN model
def build_model():
    # Wrapping a LSTMCell in a RNN layer will not use CuDNN.
    keras.layers.Embedding(input_dim=INPUT_DIM, output_dim=EMBEDDING_DIM),
    lstm_layer = keras.layers.RNN(
        keras.layers.LSTMCell(HIDDEN_DIM),
        return_sequences=True
    )
    model = keras.models.Sequential(
        [
            lstm_layer,
            keras.layers.Dense(OUTPUT_DIM),
        ]
    )
    return model
loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)
model = build_model()
model.build((1,1,INPUT_DIM))
model.compile(optimizer='adam', loss=loss)
print(model.summary())
history = model.fit(draw_random_sample(input), epochs=EPOCHS, verbose=2)
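In case it helps to see what I mean about the shapes: as I understand it, a single sample needs a leading batch dimension before going into fit. A NumPy sketch (the id values are made up):

```python
import numpy as np

x = np.array([43, 14, 21, 21])  # hypothetical ids for "Hell"
y = np.array([14, 21, 21, 24])  # hypothetical ids for "ello"

# Keras expects (batch, timesteps), so a single sample of length 4
# becomes a batch of shape (1, 4).
x_batch = x[np.newaxis, :]
y_batch = y[np.newaxis, :]
print(x_batch.shape, y_batch.shape)  # (1, 4) (1, 4)
```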
Thanks