Can’t find libdevice directory ${CUDA_DIR}/nvvm/libdevice

I’ve followed the instructions at:

I see this error has been raised a number of times, and I have tried the solutions posted at:
http://discuss.ai.google.dev/t/libdevice-not-found-why-is-it-not-found-in-the-searched-path/3419/10
http://discuss.ai.google.dev/t/cant-find-libdevice-directory-cuda-dir-nvvm-libdevice/11896
http://discuss.ai.google.dev/t/installation-instruction-on-website-are-incomplete/16146

In the last discussion linked above, there was the comment: "Could you please confirm the solution provided in the step 6 in the pip step-by-step_instructions can resolve those errors" by @Kiran_Sai_Ramineni

The steps listed have changed and are no longer numbered.

I have already set all the environment variables and checked that the file lib/nvvm/libdevice/libdevice.10.bc is located under the env I activated.

Note: The precise error now is: libdevice is required by this HLO module but was not found at /Ext4FastData/bin/miniconda3/envs/tf-cert3/lib/nvvm/libdevice/libdevice.10.bc

even though that file exists and has been copied across from the original location, and I set XLA_FLAGS=--xla_gpu_cuda_data_dir=$CONDA_PREFIX/lib/
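
Both halves of that statement (the flag value and the file location) can be checked from the same shell that launches Python. The following is only a sketch, assuming a POSIX-style shell such as bash. The helper names xla_data_dir and check_libdevice are hypothetical, invented for this illustration. The nvvm/libdevice/libdevice.10.bc layout under the data directory is taken from the error message above.

```shell
# Hypothetical helper: print the directory passed via --xla_gpu_cuda_data_dir
# in XLA_FLAGS. Prints nothing if the flag is absent or malformed (for example,
# if it starts with an en dash instead of two ASCII hyphens).
xla_data_dir() {
    for flag in ${XLA_FLAGS:-}; do
        case "$flag" in
            --xla_gpu_cuda_data_dir=*)
                echo "${flag#--xla_gpu_cuda_data_dir=}"
                ;;
        esac
    done
}

# Hypothetical helper: check that libdevice.10.bc is readable under the
# given data directory, using the layout shown in the error message.
check_libdevice() {
    data_dir="${1%/}"   # drop a single trailing slash, if any
    target="$data_dir/nvvm/libdevice/libdevice.10.bc"
    if [ -r "$target" ]; then
        echo "found: $target"
    else
        echo "missing: $target"
        return 1
    fi
}
```

Possible usage: run check_libdevice "$(xla_data_dir)" in the activated env. If xla_data_dir prints nothing, the flag is not visible to that shell in the expected form, which would also be worth ruling out.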

Hi @brendonwp

Could you provide us with some more details to help us understand the underlying issue, such as which OS you are using, the installed Python and TensorFlow versions, and the specific steps you followed to install TensorFlow with GPU support on your system? Thank you.

Hi @Renu_Patel
Here are my system details. I am studying for the TensorFlow Certificate exam, but cannot take it until I get my environment set up. I have followed the instructions online as closely as possible. I previously installed the environment successfully in February and June 2023.

Details follow.

Hardware: Dell Inspiron 7472. i7 processor. 16GB RAM. About 5 years old

System OS: Ubuntu 22.04
Python: 3.9.2
TensorFlow: 2.13.0

Steps Followed:

conda create --name tf-cert python=3.9.2

conda activate tf-cert

conda install -c conda-forge cudatoolkit=11.8.0

python3 -m pip install nvidia-cudnn-cu11==8.6.0.163 tensorflow==2.13.0 tensorflow-datasets==4.9.2

Setting up the environment paths and testing TensorFlow installation

mkdir -p $CONDA_PREFIX/etc/conda/activate.d

echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

echo 'export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Tests successful

Python Packages

python3 -m pip install numpy==1.24.3 pandas==2.0.3 Pillow==10.0.0 scipy==1.10.1 matplotlib==3.4.2 seaborn==0.12.2

And my system driver etc, from nvidia-smi:

NVIDIA-SMI 470.223.02 Driver Version: 470.223.02 CUDA Version: 11.4
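
One thing that stands out when reading these numbers: the "CUDA Version" shown by nvidia-smi is the highest CUDA version the installed driver supports, not the version of the toolkit in the conda env, and here it (11.4) is lower than the cudatoolkit installed above (11.8). Whether that gap has anything to do with the libdevice error is not established in this thread. The sketch below only shows one way to compare the two numbers. The function name version_le is hypothetical, and it assumes GNU sort with the -V option, as shipped on Ubuntu.

```shell
# Hypothetical helper: succeed if version $1 is less than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_le() {
    [ "$1" = "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" ]
}

# Example with the numbers from this post:
# toolkit 11.8 (conda) versus driver-reported 11.4 (nvidia-smi).
if version_le 11.8 11.4; then
    echo "toolkit version is within the driver-reported CUDA version"
else
    echo "toolkit version is newer than the driver-reported CUDA version"
fi
```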

@brendonwp, could you please send an email to the TF certification exam team (tensorflow-certificate-support@google.com) for further assistance with this? Thank you.

Done. I’ll report back

For the interest of other forum members: I was pointed to the official documentation by the TF certification exam team on 30 Jan.

Two days ago I sent the mail below to the TF certification exam team because I could not make sense of the GPU part of the TensorFlow installation process. I have already copied this as a DM to @Renu_Patel so no response required from them.

Since this is a long-running issue I’m posting this here, and hope someone else finds this helpful, pending clarity from the exam team.

###############################################
Hi there

Thanks for your response. I’ve been working on a successful CPU installation. Now I’ve had a chance to come back to this and recheck my GPU installation. I did follow the instructions you linked carefully.

Since my problem arises after TensorFlow has passed the basic sanity checks given in step 4, Verify Installation, I suspect it is a cuDNN issue or an interaction with TensorFlow. But I cannot use the suggested line

pip install tensorflow[and-cuda]

since I’ve been told that this automatically installs TensorFlow 2.15, not 2.13. So I’ve been using conda instead for this step only.

Please advise me whether I can customise pip install tensorflow[and-cuda] to install TensorFlow 2.13 instead, and the code I should use.

Thanks in advance!

Kind regards

Brendon
