BiLSTM and LSTM Graph execution error

Hi, I want to use a BiLSTM model for a text classification task. I use a data generator to load already batched and embedded files, split into 64 files for training and 4 files each for test and validation. Each sample is a numpy array of shape (512, 768), which I have used before in a CNN task, where it works fine.
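For reference, the generator yields pre-embedded batches roughly like this (a minimal sketch; the file naming and label loading here are placeholders, only the per-sample shape of (512, 768) matches my real data):

import numpy as np

def data_generator(file_paths):
    # loop forever so Keras can draw steps_per_epoch batches every epoch
    while True:
        for path in file_paths:
            batch_x = np.load(path)                      # embedded batch, shape (batch, 512, 768)
            batch_y = np.load(path.replace("_x", "_y"))  # matching labels, shape (batch,)
            yield batch_x, batch_y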
However, when I try to use it in the BiLSTM model below

import tensorflow as tf
from tensorflow.keras import layers

BiLSTM_model = tf.keras.models.Sequential(name="BiLSTM_model")
# each sample is 512 tokens of 768-dimensional embeddings
BiLSTM_model.add(layers.Bidirectional(layers.LSTM(300), input_shape=(512, 768)))
BiLSTM_model.add(layers.Dense(300, activation='relu'))
BiLSTM_model.add(layers.Dropout(0.5))
BiLSTM_model.add(layers.Dense(300, activation='relu'))
BiLSTM_model.add(layers.Dropout(0.5))
BiLSTM_model.add(layers.Dense(1, activation='sigmoid'))
adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)
BiLSTM_model.compile(optimizer=adam, loss='binary_crossentropy', metrics=["accuracy"])
BiLSTM_model.summary()

I get this error immediately when I run the fit call below; the first epoch never even starts:

history = BiLSTM_model.fit(train_data_gen, 
                        steps_per_epoch = 64, 
                        epochs = 20, 
                        validation_data = valid_data_gen,
                        validation_steps=4,
                        verbose = 1)
InternalError: Graph execution error:

Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 768, 300, 1, 512, 1840, 300] 
	 [[{{node CudnnRNN}}]]
	 [[BiLSTM_model/bidirectional/backward_lstm/PartitionedCall]] [Op:__inference_train_function_5912]

I thought the model might be too large, so I reduced it to a plain LSTM and tried again with a single step for training and validation. The reduced model:

# model based on the VulDeePecker model
BiLSTM_model = tf.keras.models.Sequential(name = "BiLSTM_model")
BiLSTM_model.add(layers.LSTM(128, input_shape = (512,768)))
BiLSTM_model.add(layers.Dense(1, activation='sigmoid'))
adam = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)
BiLSTM_model.compile(optimizer=adam, loss='binary_crossentropy', metrics=["accuracy"])
BiLSTM_model.summary()

It is able to run one epoch, but it still crashes with this error:

InternalError: Graph execution error:

Epoch 1/20
1/1 [==============================] - ETA: 0s - loss: 0.6879 - accuracy: 0.9179
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
Cell In[8], line 1
----> 1 history = BiLSTM_model.fit(train_data_gen, 
      2                         steps_per_epoch = 1, 
      3                         epochs = 20, 
      4                         validation_data = valid_data_gen,
      5                         validation_steps=1,
      6                         verbose = 1)
      7 CNN_model.save("./finish_model/w2v_BiLSTM_model.h5")
      8 with open("./history/w2v_BiLSTM_history.pkl", "wb") as file_pi:

File ~\AppData\Local\anaconda3\envs\python39\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~\AppData\Local\anaconda3\envs\python39\lib\site-packages\tensorflow\python\eager\execute.py:54, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     52 try:
     53   ctx.ensure_initialized()
---> 54   tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     55                                       inputs, attrs, num_outputs)
     56 except core._NotOkStatusException as e:
     57   if name is not None:

InternalError: Graph execution error:

Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 768, 128, 1, 512, 3666, 128] 
	 [[{{node CudnnRNN}}]]
	 [[BiLSTM_model/lstm/PartitionedCall]] [Op:__inference_test_function_3644]

I use Windows 10 with an Nvidia GeForce RTX 3090 and CUDA 12.2, TensorFlow v2.10.1 and Keras v2.10.0, on Python 3.9.17 in a Jupyter notebook.

Any help would be appreciated!
Thank you in advance.

Hi @nekon

Welcome to the TensorFlow Forum!

It seems you have installed a version of CUDA that is incompatible with TensorFlow 2.10, which might be causing the above error. Please try again after installing the compatible versions for TensorFlow 2.10, namely cuDNN 8.1 and CUDA 11.2, as listed in the tested build configurations.
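As a quick sanity check after reinstalling, you can print the CUDA/cuDNN versions your TensorFlow wheel was built against and confirm the GPU is visible at runtime (this uses the standard tf.sysconfig and tf.config APIs; the expected values in the comments assume TensorFlow 2.10):

import tensorflow as tf

# versions this TensorFlow build expects
build = tf.sysconfig.get_build_info()
print("Built for CUDA:", build["cuda_version"])    # should correspond to CUDA 11.2 for TF 2.10
print("Built for cuDNN:", build["cudnn_version"])  # should correspond to cuDNN 8.1 for TF 2.10

# GPUs TensorFlow can actually see at runtime
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))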

Let us know if the issue still persists. Thank you.