Can't run tensorflow anymore

Wanderson_Silva · March 7, 2024, 2:13am

Hello all. I am trying to train a model for testing purposes, but I am getting the following error:

python mnist_distributed.py
Traceback (most recent call last):
  File "C:\Users\pc\Desktop\New folder\mnist_distributed.py", line 3, in <module>
    import tensorflow as tf
  File "C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\__init__.py", line 45, in <module>
    from tensorflow.python import tf2 as _tf2
  File "C:\Users\pc\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\tf2.py", line 21, in <module>
    from tensorflow.python.platform import _pywrap_tf2
ImportError: DLL load failed while importing _pywrap_tf2: A dynamic link library (DLL) initialization routine failed.

I have tried everything I could find on the internet, including installing the Visual C++ Redistributable, but without success. I am using Python 3.9.11 and TensorFlow 2.15.0. The thing is that it was working fine before.
This is the training script:

import os
import datetime
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard, Callback
# Updated to use the new mixed-precision API
from tensorflow.keras.mixed_precision import set_global_policy

# Mixed-precision configuration
set_global_policy('mixed_float16')  # Updated to set the global policy

# Logging configuration to show the distribution process
tf.get_logger().setLevel('INFO')

# Define the distributed training strategy
strategy = tf.distribute.MultiWorkerMirroredStrategy()

print("Number of devices: {}".format(strategy.num_replicas_in_sync))

class CustomCallback(Callback):
    def on_epoch_begin(self, epoch, logs=None):
        print(f"Starting epoch number {epoch+1}")

    def on_epoch_end(self, epoch, logs=None):
        print(f"End of epoch {epoch+1}")

with strategy.scope():
    # Build the model with mixed precision
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, dtype='float32')  # Keep the last layer in float32
    ])

    # Compile the model with settings for distributed training
    # Updated to use the recommended loss function
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])

# Data preprocessing function
def preprocess(image, label):
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

# Load and prepare the MNIST dataset with tf.data
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

# Apply preprocessing, batching, and prefetching
BATCH_SIZE = 64 * strategy.num_replicas_in_sync  # Scale the batch size with the number of replicas
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels)).map(preprocess).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
test_dataset = tf.data.Dataset.from_tensor_slices((test_images, test_labels)).map(preprocess).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

# TensorBoard configuration for monitoring
log_dir = os.path.join("logs", "fit", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

# Train the model on the prepared dataset, with CustomCallback for monitoring
model.fit(train_dataset, epochs=10, validation_data=test_dataset, callbacks=[tensorboard_callback, CustomCallback()])

Appreciate any help!

Wanderson_Silva · March 7, 2024, 2:48pm

I was wondering whether there is a correct order for installing the required components. For example: Visual C++ → Python → TensorFlow.
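
Whatever order the components were installed in, one thing that can be checked directly is whether the Visual C++ runtime DLLs load from the same Python interpreter that fails to import TensorFlow. The following is only a minimal sketch, not an official diagnostic. The function name `check_vc_runtime` is hypothetical, and the DLL list is an assumption based on the files the Visual C++ Redistributable normally installs. On non-Windows systems it returns an empty result.

```python
import ctypes
import struct
import sys

# ASSUMPTION: these are the runtime DLLs usually shipped by the
# Visual C++ Redistributable; adjust the list if your setup differs.
DEFAULT_DLLS = (
    "vcruntime140.dll",
    "vcruntime140_1.dll",
    "msvcp140.dll",
    "msvcp140_1.dll",
)


def check_vc_runtime(dll_names=DEFAULT_DLLS):
    """Try to load each DLL by name and return {name: loaded_ok}.

    Returns an empty dict on non-Windows platforms, where the check
    does not apply.
    """
    results = {}
    if sys.platform != "win32":
        return results
    for name in dll_names:
        try:
            ctypes.WinDLL(name)  # raises OSError if the DLL cannot be loaded
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    bits = struct.calcsize("P") * 8  # pointer size tells 32-bit from 64-bit
    print(f"Python {sys.version.split()[0]} ({bits}-bit)")
    for dll, ok in check_vc_runtime().items():
        print(f"{dll}: {'loaded' if ok else 'NOT loaded'}")
```

If every DLL loads and the import still fails, the runtime is probably not the cause. The error above reports a failed initialization routine rather than a missing file, so a version mismatch elsewhere remains possible.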

Renu_Patel · March 12, 2024, 10:10am

Hi @Wanderson_Silva

It seems that compatible versions of TensorFlow, Python, CUDA, and cuDNN are not installed on your system to support the GPU. Please try again after upgrading Python to 3.10 or downgrading TensorFlow to an older version.
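
Because `import tensorflow` itself fails here, the installed versions have to be read without importing the package. A minimal sketch of one way to do that with the standard library follows. The helper names `installed_version` and `python_in_range` are hypothetical, and the version bounds in the example are placeholders that should be replaced with the range listed for your TensorFlow release in the tested build configurations.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent.

    Reads package metadata only, so it works even when importing the
    package would fail.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_in_range(minimum, maximum, current=None):
    """Check whether the interpreter's (major, minor) lies in [minimum, maximum]."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(minimum) <= tuple(current) <= tuple(maximum)


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    for pkg in ("tensorflow", "numpy"):
        print(f"{pkg}:", installed_version(pkg) or "not installed")
    # ASSUMPTION: placeholder bounds; look up the real range for your release.
    print("Python in assumed range:", python_in_range((3, 9), (3, 11)))
```

Running this before and after changing the Python or TensorFlow version makes it easy to confirm which combination is actually active in the environment.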

I am able to run the above code in Google Colab using Python 3.10 and TensorFlow 2.15 without any error. (Please find the replicated gist here).

You can refer to this official TensorFlow install guide to verify that you have followed all the steps required to install TensorFlow with GPU support on your OS.
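
Once the import works, the verification step from the install guide can be wrapped in a small script. This is a sketch only. The wrapper name `list_gpus` is hypothetical, while `tf.config.list_physical_devices("GPU")` is the call the guide itself uses to verify a GPU setup. Note that the guide states TensorFlow 2.10 was the last release with GPU support on native Windows, so an empty list is expected there for later releases.

```python
def list_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow fails to import."""
    try:
        import tensorflow as tf
    except ImportError as exc:
        # A failed DLL load surfaces as ImportError, as in the traceback above.
        print("TensorFlow could not be imported:", exc)
        return None
    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("Fix the import error first.")
    elif not gpus:
        print("TensorFlow imported, but no GPU is visible (CPU only).")
    else:
        print(f"TensorFlow sees {len(gpus)} GPU(s): {gpus}")
```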

Let us know if the issue still persists. Thank you.
