Hi
Sorry for the noob question.
I am trying to run tensorflow/tensorflow:latest-gpu-jupyter in a Docker container from VS Code.
If I run my container from the image in the Docker extension (right click / Run Interactive), I notice there is no --gpus all option, so understandably I don't see my GPU.
If I add --gpus all and run it in a terminal, I get this long error message:
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
Finally, if I run it from the VS Code terminal with 'sudo', i.e.
sudo docker run --rm -it --gpus all -p 8888:8888/tcp tensorflow/tensorflow:latest-gpu-jupyter
It appears to run correctly, but I don't see the container in the Containers section of VS Code (I guess because it is running the container as root rather than as my user?).
Any suggestions on how to do this in VS Code, or how to get rid of the error message?
Many thanks
Just in case it helps anyone else: the errors disappeared after simply deleting the entire user/.docker folder, which is then rebuilt automatically.
If you use VS Code, I can highly recommend using a devcontainer.
Have a look at the .devcontainer folder of the keras-cv team. Their container does not work out of the box, but with minor modifications you can get it up and running.
Modify the devcontainer.json to include "runArgs" with "--gpus=all" as a minimum (read up on "--ipc=host" before enabling it, as it has some security-related concerns):
{
    "name": "Keras-cv",
    "build": {
        "dockerfile": "Dockerfile",
        "args": {
            // Uncomment this if GPU support is required
            "VARIANT": "-gpu"
        }
    },
    "settings": {
        "python.pythonPath": "/usr/bin/python3",
        "python.linting.enabled": true,
        "python.linting.flake8Enabled": true,
        "python.testing.pytestEnabled": true,
        "python.editor.defaultFormatter": "ms-python.black-formatter",
        "python.editor.formatOnSave": true,
        "python.editor.codeActionsOnSave": {
            "source.organizeImports": true
        }
    },
    "extensions": [
        "ms-python.python",
        "ms-python.isort",
        "ms-python.black-formatter"
    ],
    "features": {
        "git": {
            "version": "os-provided"
        }
    },
    "runArgs": [
        // "--ipc=host",
        "--gpus=all"
    ],
    "onCreateCommand": "locale-gen \"en_US.UTF-8\""
    // Optional: install pre-commit hooks
    // "postCreateCommand": "git config core.hooksPath .github/.githooks"
}
Lastly, the Dockerfile has a small mistake. Change this line (the version needs to be "2.10.0"):
FROM tensorflow/tensorflow:2.10.0${VARIANT}
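Once the container is up, it is worth confirming that TensorFlow actually sees the GPU. Below is a minimal sketch of such a check, to be run inside the container. The helper name visible_gpus is my own and not part of TensorFlow or keras-cv. It assumes a TensorFlow 2.x install and returns None when TensorFlow cannot be imported.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing.

    visible_gpus is a hypothetical helper name used for illustration only.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return None
    # An empty list means TensorFlow is installed but no GPU is exposed to it.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif gpus:
        print("GPUs visible to TensorFlow:", gpus)
    else:
        print("No GPU visible; check that the container was started with --gpus=all")
```

If the list comes back empty, the first thing I would check is whether "--gpus=all" really ended up in "runArgs" and whether the container was rebuilt afterwards.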
Hope this helps.
The keras-cv repo has now been updated to reflect this, so feel free to copy from there.