Simple RNN for token prediction

Hi,
I was exploring a simple character RNN and decided to construct it like this. I don’t quite understand the difference between LSTM and LSTMCell; there is a comment in the code stating that CuDNN is not used in the latter case.
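As far as I understand, the two forms build the same recurrence; the practical difference is that keras.layers.LSTM can dispatch to the fused cuDNN kernel on GPU (with its default arguments), while wrapping a keras.layers.LSTMCell in keras.layers.RNN always runs the generic implementation. A minimal sketch of the two:

from tensorflow import keras

HIDDEN_DIM = 128  # same value as in the model code below

# Fused layer: can use the cuDNN kernel on GPU when the default arguments are kept.
fast_lstm = keras.layers.LSTM(HIDDEN_DIM, return_sequences=True)

# Cell wrapped in a generic RNN layer: same math, but never the cuDNN kernel.
generic_lstm = keras.layers.RNN(keras.layers.LSTMCell(HIDDEN_DIM), return_sequences=True)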

But my primary problem is with the shapes. The data is just a sample sequence of characters.
I manage to draw the sample pair as recommended,
e.g. input “Hell” and target “ello”.

I am not using tf.data.Dataset because I want to keep this very simple.

def draw_random_sample(text):
    sample = random_sample(text)                   # draw a random snippet, e.g. "Hello"
    split_sample = tf.strings.bytes_split(sample)  # split it into single characters
    ids = tf.map_fn(map_fn, split_sample)          # map each character to its integer id
    # input is every character except the last, target is every character except the first
    return tf.stack(ids[:-1]), tf.stack(ids[1:])

I checked the output and it is indeed correct.
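For the “Hell”/“ello” case it gives two rank-1 tensors of length 4, roughly like this (assuming map_fn maps each byte to its index in string.printable; that helper is not shown here):

x, y = draw_random_sample(input)                 # `input` is the full text, as in fit() below
print(x.shape, y.shape)                          # e.g. (4,) (4,) for a 5-character sample
print([string.printable[i] for i in x.numpy()])  # e.g. ['H', 'e', 'l', 'l']
print([string.printable[i] for i in y.numpy()])  # e.g. ['e', 'l', 'l', 'o']

Note that both tensors are rank 1 (just the time axis): there is no batch dimension yet.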

The error is

ValueError: Exception encountered when calling layer “sequential” (type Sequential).

Input 0 of layer "rnn" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 1)

I also have a question about batching: once the error is fixed, I would like to train on batches of samples rather than on one pair at a time.
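What I have in mind is something like this (a rough sketch; it assumes every drawn sample has the same length, otherwise padding would be needed):

import tensorflow as tf

BATCH_SIZE = 8   # just an example value

# A batch of one: add a leading batch axis to a single (input, target) pair.
x, y = draw_random_sample(input)
batch_x = tf.expand_dims(x, axis=0)   # shape (1, seq_len)
batch_y = tf.expand_dims(y, axis=0)

# Or stack several equal-length draws along a new batch axis.
pairs = [draw_random_sample(input) for _ in range(BATCH_SIZE)]
batch_x = tf.stack([x for x, _ in pairs])   # shape (BATCH_SIZE, seq_len)
batch_y = tf.stack([y for _, y in pairs])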

Could you take a look?


import string
import tensorflow as tf
from tensorflow import keras

EMBEDDING_DIM = 100
HIDDEN_DIM = 128
INPUT_DIM = len(string.printable)
OUTPUT_DIM = len(string.printable)
EPOCHS = 1
# Build the RNN model
def build_model():
    # Wrapping an LSTMCell in an RNN layer will not use CuDNN.
    keras.layers.Embedding(input_dim=INPUT_DIM, output_dim=EMBEDDING_DIM),
    lstm_layer = keras.layers.RNN(
        keras.layers.LSTMCell(HIDDEN_DIM),
        return_sequences=True,
    )
    model = keras.models.Sequential(
        [
            lstm_layer,
            keras.layers.Dense(OUTPUT_DIM),
        ]
    )
    return model

loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)
model = build_model()
model.build((1,1,INPUT_DIM))
model.compile(optimizer='adam', loss=loss)
print(model.summary())
history = model.fit(draw_random_sample(input), epochs=EPOCHS, verbose=2)

Thanks

Update: this works. The difference from the code above is that the Embedding layer is actually added to the model here, so the LSTM sees a 3-D (batch, timesteps, features) input instead of the 2-D character ids.

def build_model():
    # define model
    model = keras.models.Sequential()
    model.add(keras.layers.Embedding(INPUT_DIM, EMBEDDING_DIM))
    model.add(keras.layers.LSTM(HIDDEN_DIM))
    model.add(keras.layers.Dense(OUTPUT_DIM))
    print(model.summary())
    return model
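
For completeness, a sketch of training this on batched samples. Note that I added return_sequences=True to the LSTM here (it is not in the version above) so that the output has one set of logits per input character, matching a shifted per-character target like “ello”:

import tensorflow as tf
from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Embedding(INPUT_DIM, EMBEDDING_DIM),
    keras.layers.LSTM(HIDDEN_DIM, return_sequences=True),  # one output per timestep
    keras.layers.Dense(OUTPUT_DIM),                         # logits over the character set
])
model.compile(optimizer="adam",
              loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True))

# batch_x, batch_y as in the batching sketch above: shape (BATCH_SIZE, seq_len)
history = model.fit(batch_x, batch_y, epochs=EPOCHS, verbose=2)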