Keras Transformer Implementation Error

Dear All,

I am new to implementing a Transformer with Keras, and I cannot figure out the cause of the following error.

Here are my input tokens:

(base) $ cat trainingSet_7mer_1.token.txt
18,0,11,10,7,14,2
18,0,1,6,7,18,2
13,0,11,15,7,13,2
18,0,1,12,7,5,2
10,15,11,9,7,10,2
10,15,11,9,7,5,2
13,0,1,16,7,3,2
13,0,11,16,7,2,2
13,15,1,10,2,10,2
0,15,6,13,12,16,13
13,15,1,0,7,3,2
18,15,11,0,7,5,2
13,15,1,0,7,13,2
13,15,1,18,7,15,2
18,0,1,5,2,16,2
12,19,15,16,2,7,19
0,11,15,16,2,7,9
19,7,15,0,8,2,17
18,15,11,5,2,16,2
10,0,1,15,7,3,2
9,10,15,16,2,7,9
19,13,15,16,2,7,9
10,10,15,16,2,7,1
18,0,1,5,7,5,2
18,15,1,15,7,9,2
10,15,11,19,7,3,2
18,15,11,0,2,16,2
7,6,15,3,1,11,17
10,15,11,9,7,19,2
10,0,11,15,7,5,2
18,0,11,15,7,5,2
18,15,1,19,2,6,2
18,15,11,2,2,8,2
10,18,15,16,2,7,9
18,0,1,5,7,12,2
10,15,11,19,7,15,2
6,12,7,7,7,15,17
13,0,11,15,7,8,2
18,0,11,15,2,12,2
18,0,11,0,2,12,2
3,10,15,8,0,11,17
7,6,7,9,19,1,17
18,15,11,7,2,3,2
18,0,1,19,7,5,2
15,19,15,16,2,7,1
18,15,11,8,2,19,2
18,0,1,0,2,19,2
10,8,11,10,16,19,15
10,0,11,5,7,13,2
13,19,15,16,2,7,9
13,15,1,9,2,3,2
13,0,1,0,7,12,2
7,15,7,9,5,2,17
18,19,15,16,2,7,19
10,15,11,0,7,13,2
18,15,11,7,2,12,2
18,0,11,16,7,5,2
16,1,2,7,2,9,0
5,19,15,16,2,7,19
13,8,15,16,2,7,9
16,1,9,7,2,2,0
13,19,1,15,3,9,11
13,15,11,16,7,14,2
10,15,11,10,7,14,2
(base) $ python ~/bin/train_transformer2.py 7 20 trainingSet_7mer_1.token.txt 64 trainingSet_7mer_1.value.txt testSet_7mer_obs_1.token.txt 64 testSet_7mer_obs_1.value.txt testSet_7mer_obs_1.value.predicted.txt 64 150 8 0.25 0.004 transformer.1
2025-01-12 19:13:26.420643: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2025-01-12 19:13:26.458060: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2025-01-12 19:13:26.458352: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-01-12 19:13:27.326611: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[[18 0 11 10 7 14 2]
[18 0 1 6 7 18 2]
[13 0 11 15 7 13 2]
[18 0 1 12 7 5 2]
[10 15 11 9 7 10 2]
[10 15 11 9 7 5 2]
[13 0 1 16 7 3 2]
[13 0 11 16 7 2 2]
[13 15 1 10 2 10 2]
[ 0 15 6 13 12 16 13]
[13 15 1 0 7 3 2]
[18 15 11 0 7 5 2]
[13 15 1 0 7 13 2]
[13 15 1 18 7 15 2]
[18 0 1 5 2 16 2]
[12 19 15 16 2 7 19]
[ 0 11 15 16 2 7 9]
[19 7 15 0 8 2 17]
[18 15 11 5 2 16 2]
[10 0 1 15 7 3 2]
[ 9 10 15 16 2 7 9]
[19 13 15 16 2 7 9]
[10 10 15 16 2 7 1]
[18 0 1 5 7 5 2]
[18 15 1 15 7 9 2]
[10 15 11 19 7 3 2]
[18 15 11 0 2 16 2]
[ 7 6 15 3 1 11 17]
[10 15 11 9 7 19 2]
[10 0 11 15 7 5 2]
[18 0 11 15 7 5 2]
[18 15 1 19 2 6 2]
[18 15 11 2 2 8 2]
[10 18 15 16 2 7 9]
[18 0 1 5 7 12 2]
[10 15 11 19 7 15 2]
[ 6 12 7 7 7 15 17]
[13 0 11 15 7 8 2]
[18 0 11 15 2 12 2]
[18 0 11 0 2 12 2]
[ 3 10 15 8 0 11 17]
[ 7 6 7 9 19 1 17]
[18 15 11 7 2 3 2]
[18 0 1 19 7 5 2]
[15 19 15 16 2 7 1]
[18 15 11 8 2 19 2]
[18 0 1 0 2 19 2]
[10 8 11 10 16 19 15]
[10 0 11 5 7 13 2]
[13 19 15 16 2 7 9]
[13 15 1 9 2 3 2]
[13 0 1 0 7 12 2]
[ 7 15 7 9 5 2 17]
[18 19 15 16 2 7 19]
[10 15 11 0 7 13 2]
[18 15 11 7 2 12 2]
[18 0 11 16 7 5 2]
[16 1 2 7 2 9 0]
[ 5 19 15 16 2 7 19]
[13 8 15 16 2 7 9]
[16 1 9 7 2 2 0]
[13 19 1 15 3 9 11]
[13 15 11 16 7 14 2]
[10 15 11 10 7 14 2]]
Traceback (most recent call last):
  File "/home/kggx609/bin/train_transformer2.py", line 331, in <module>
    main(args)
  File "/home/kggx609/bin/train_transformer2.py", line 292, in main
    model = create_transformer_model((time_steps), feature_channels, time_steps, number_of_heads, d_k, d_v, d_model, d_ff, n, dropout)
  File "/home/kggx609/bin/train_transformer2.py", line 222, in create_transformer_model
    x = Encoder(enc_vocab_size, input_seq_length, h, d_k, d_v, d_model, d_ff, n, dropout_rate)(inputs)
  File "/home/kggx609/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/scratch/tmp_user_data/kggx609/__autograph_generated_file359jy56n.py", line 26, in tf__call
    ag__.for_stmt(ag__.converted_call(ag__.ld(enumerate), (ag__.ld(self).encoder_layer,), None, fscope), None, loop_body, get_state, set_state, ('x',), {'iterate_names': '(i, layer)'})
  File "/scratch/tmp_user_data/kggx609/__autograph_generated_file359jy56n.py", line 23, in loop_body
    x = ag__.converted_call(ag__.ld(layer), (ag__.ld(x), None, True), None, fscope)
  File "/scratch/tmp_user_data/kggx609/__autograph_generated_fileb0k9akbt.py", line 10, in tf__call
    multihead_output = ag__.converted_call(ag__.ld(self).multihead_attention, (ag__.ld(x), ag__.ld(x), ag__.ld(x), ag__.ld(padding_mask)), None, fscope)
  File "/scratch/tmp_user_data/kggx609/__autograph_generated_filejshn3x2p.py", line 13, in tf__call
    o_reshaped = ag__.converted_call(ag__.ld(self).attention, (ag__.ld(q_reshaped), ag__.ld(k_reshaped), ag__.ld(v_reshaped), ag__.ld(self).d_k, ag__.ld(mask)), None, fscope)
  File "/scratch/tmp_user_data/kggx609/__autograph_generated_file9753gee4.py", line 10, in tf__call
    scores = (ag__.converted_call(ag__.ld(tf).matmul, (ag__.ld(queries), ag__.ld(keys)), dict(transpose_b=True), fscope) / ag__.converted_call(ag__.ld(math).sqrt, (ag__.converted_call(ag__.ld(tf).cast, (ag__.ld(d_k), ag__.ld(np).float16), None, fscope),), None, fscope))
TypeError: Exception encountered when calling layer 'encoder' (type Encoder).

in user code:

File "/home/kggx609/bin/train_transformer2.py", line 216, in call  *
    x = layer(x, None, True)
File "/home/kggx609/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler  **
    raise e.with_traceback(filtered_tb) from None
File "/scratch/tmp_user_data/kggx609/__autograph_generated_fileb0k9akbt.py", line 10, in tf__call
    multihead_output = ag__.converted_call(ag__.ld(self).multihead_attention, (ag__.ld(x), ag__.ld(x), ag__.ld(x), ag__.ld(padding_mask)), None, fscope)
File "/scratch/tmp_user_data/kggx609/__autograph_generated_filejshn3x2p.py", line 13, in tf__call
    o_reshaped = ag__.converted_call(ag__.ld(self).attention, (ag__.ld(q_reshaped), ag__.ld(k_reshaped), ag__.ld(v_reshaped), ag__.ld(self).d_k, ag__.ld(mask)), None, fscope)
File "/scratch/tmp_user_data/kggx609/__autograph_generated_file9753gee4.py", line 10, in tf__call
    scores = (ag__.converted_call(ag__.ld(tf).matmul, (ag__.ld(queries), ag__.ld(keys)), dict(transpose_b=True), fscope) / ag__.converted_call(ag__.ld(math).sqrt, (ag__.converted_call(ag__.ld(tf).cast, (ag__.ld(d_k), ag__.ld(np).float16), None, fscope),), None, fscope))

TypeError: Exception encountered when calling layer 'encoder_layer' (type EncoderLayer).

in user code:

    File "/home/kggx609/bin/train_transformer2.py", line 175, in call  *
        multihead_output = self.multihead_attention(x, x, x, padding_mask)
    File "/home/kggx609/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/scratch/tmp_user_data/kggx609/__autograph_generated_filejshn3x2p.py", line 13, in tf__call
        o_reshaped = ag__.converted_call(ag__.ld(self).attention, (ag__.ld(q_reshaped), ag__.ld(k_reshaped), ag__.ld(v_reshaped), ag__.ld(self).d_k, ag__.ld(mask)), None, fscope)
    File "/scratch/tmp_user_data/kggx609/__autograph_generated_file9753gee4.py", line 10, in tf__call
        scores = (ag__.converted_call(ag__.ld(tf).matmul, (ag__.ld(queries), ag__.ld(keys)), dict(transpose_b=True), fscope) / ag__.converted_call(ag__.ld(math).sqrt, (ag__.converted_call(ag__.ld(tf).cast, (ag__.ld(d_k), ag__.ld(np).float16), None, fscope),), None, fscope))

    TypeError: Exception encountered when calling layer 'multi_head_attention' (type MultiHeadAttention).

    in user code:

        File "/home/kggx609/bin/train_transformer2.py", line 124, in call  *
            o_reshaped = self.attention(q_reshaped, k_reshaped, v_reshaped, self.d_k, mask)
        File "/home/kggx609/.local/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler  **
            raise e.with_traceback(filtered_tb) from None
        File "/scratch/tmp_user_data/kggx609/__autograph_generated_file9753gee4.py", line 10, in tf__call
            scores = (ag__.converted_call(ag__.ld(tf).matmul, (ag__.ld(queries), ag__.ld(keys)), dict(transpose_b=True), fscope) / ag__.converted_call(ag__.ld(math).sqrt, (ag__.converted_call(ag__.ld(tf).cast, (ag__.ld(d_k), ag__.ld(np).float16), None, fscope),), None, fscope))

        TypeError: Exception encountered when calling layer 'dot_product_attention' (type DotProductAttention).

        in user code:

            File "/home/kggx609/bin/train_transformer2.py", line 68, in call  *
                scores = tf.matmul(queries, keys, transpose_b=True) / math.sqrt(tf.cast(d_k, np.float16))

            TypeError: must be real number, not Tensor


        Call arguments received by layer 'dot_product_attention' (type DotProductAttention):
          • queries=tf.Tensor(shape=(None, 8, 7, None), dtype=float32)
          • keys=tf.Tensor(shape=(None, 8, 7, None), dtype=float32)
          • values=tf.Tensor(shape=(None, 8, 7, None), dtype=float32)
          • d_k=64
          • mask=None


    Call arguments received by layer 'multi_head_attention' (type MultiHeadAttention):
      • queries=tf.Tensor(shape=(None, 7, 512), dtype=float32)
      • keys=tf.Tensor(shape=(None, 7, 512), dtype=float32)
      • values=tf.Tensor(shape=(None, 7, 512), dtype=float32)
      • mask=None


Call arguments received by layer 'encoder_layer' (type EncoderLayer):
  • x=tf.Tensor(shape=(None, 7, 512), dtype=float32)
  • padding_mask=None
  • training=True

Call arguments received by layer 'encoder' (type Encoder):
  • input_sentence=tf.Tensor(shape=(None, 7), dtype=float32)

The inputs are tokens of 20 dimensions (values 0 to 19); the input array has shape (64, 7, 20) and dtype np.int8. I appreciate your help. Thanks in advance!
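
From the traceback, the failure seems to come from passing a Tensor to Python's math.sqrt on line 68: math.sqrt only accepts plain numbers, so the tf.cast(d_k, np.float16) result cannot be converted while the call is being traced by AutoGraph. Below is a minimal sketch of the failing expression and two tensor-safe alternatives I am considering (the shapes are illustrative stand-ins taken from the traceback, not my real data):

```python
import math

import numpy as np  # only referenced by the original np.float16 cast shown below
import tensorflow as tf

# Illustrative stand-ins with the shapes from the traceback: (batch, heads, seq_len, d_k).
d_k = 64
queries = tf.random.normal((2, 8, 7, d_k))
keys = tf.random.normal((2, 8, 7, d_k))

# Failing line 68: under AutoGraph, math.sqrt() receives a symbolic Tensor and
# raises "TypeError: must be real number, not Tensor".
# scores = tf.matmul(queries, keys, transpose_b=True) / math.sqrt(tf.cast(d_k, np.float16))

# Alternative 1: tf.math.sqrt operates on tensors directly.
scores = tf.matmul(queries, keys, transpose_b=True) / tf.math.sqrt(tf.cast(d_k, tf.float32))

# Alternative 2: keep math.sqrt but pass it a plain Python number, since d_k is a known int.
scores = tf.matmul(queries, keys, transpose_b=True) / math.sqrt(float(d_k))
```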

Hi @QichengMa, if possible, could you please provide standalone code to reproduce the issue? Also, please let us know the TensorFlow and Keras versions you are using. Thank you.
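
For example, a quick way to check the versions:

```python
import tensorflow as tf
import keras

print(tf.__version__)
print(keras.__version__)
```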