Why not specify the output shape of the `TextVectorization` class in the Keras model?

kareemamr · September 14, 2021, 12:12pm

In the book “Deep Learning with Python, Second Edition”, the author uses the TextVectorization class to preprocess text into sequences as such:

from tensorflow.keras.layers import TextVectorization

max_length = 600
max_tokens = 20000
text_vectorization = TextVectorization(
    max_tokens=max_tokens,
    output_mode="int",
    output_sequence_length=max_length,
)

where each text is integer-encoded as a sequence like [5, 18, 1, 0, 53, ...], so each batch has shape [batch_size, 600]. However, when building the model, he doesn't fix the sequence length in the input shape: inputs = keras.Input(shape=(None,), dtype="int64"). Any idea why that is?
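For readers of this thread: in a Keras input shape, None means "any size along this dimension", so shape=(None,) accepts the 600-long sequences as well as sequences of any other length. The plain-Python sketch below mimics that matching rule only. The helper is_compatible is hypothetical and not part of Keras, and whether this flexibility is the book author's actual reason is an assumption.

```python
from typing import Optional, Sequence, Tuple

# Per-sample shape as passed to keras.Input: the batch dimension is left out.
Shape = Tuple[Optional[int], ...]


def is_compatible(declared: Shape, actual: Sequence[int]) -> bool:
    """Return True if a concrete shape satisfies a declared shape.

    Hypothetical helper (not a Keras function): a None entry in
    `declared` matches any size, an integer entry must match exactly.
    """
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))


flexible = (None,)  # the shape declared in the question's model
fixed = (600,)      # the length TextVectorization is configured to emit

print(is_compatible(flexible, (600,)))  # True
print(is_compatible(flexible, (150,)))  # True
print(is_compatible(fixed, (600,)))     # True
print(is_compatible(fixed, (150,)))     # False
```

Under this rule, declaring shape=(600,) would also work for the data in the question. The (None,) form simply does not tie the model to one sequence length.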

Ashwini_Gadag · November 23, 2021, 8:14am

After preprocessing text with TextVectorization, we define the input layer with shape (1,).

import tensorflow as tf

model = tf.keras.Sequential()
# Start by creating an explicit input layer. It needs to have a shape of
# (1,) (because we need to guarantee that there is exactly one string
# input per batch), and the dtype needs to be 'string'.
model.add(tf.keras.Input(shape=(1,), dtype=tf.string))

For more details, refer to the tf.keras.layers.TextVectorization documentation.
