Hello folks,
I have an AMD GPU in my system and I want to use it for model training. After several attempts I have not been able to do so.
Since CUDA is only supported on NVIDIA GPUs, it doesn’t work for me.
Any suggestions?
Thank you.
CUDA is NVIDIA’s proprietary technology. For AMD GPUs you have OpenCL.
@Remy_Wehrung
Is OpenGL supported with TensorFlow? If I sit for the TensorFlow certification exam, would it cause any issue?
You are confusing OpenGL, the 3D graphics API created by the late Silicon Graphics, with OpenCL, an open standard for general-purpose parallel computing. The two have nothing to do with each other; see the sites that cover OpenCL.
@Remy_Wehrung sorry, there was a typo with OpenGL.
Can OpenCL be used while taking the TensorFlow certification exam?
No. For this certification you need to know the TensorFlow API, Python basics and Linux administration basics, that’s all. OpenCL is only useful if you are familiar with C++, and it is irrelevant for the certification.
You need to use TensorFlow with AMD ROCm.
tensorflow-directml for Windows:
github / microsoft / tensorflow-directml (cannot include links)
It can be used with any DirectX 12 compatible video card.
In my tests on a 2080 Ti it is only ~68% slower than the CUDA version.
DirectX has nothing to do with ML, so it does not surprise me that the Microsoft fork (by the way?) is slower. We are talking about libraries written in Python, which is universal as a language.
DirectML is young tech. GPU vendors should optimize their drivers for DirectML.
But at least you can run TF graph models on an AMD GPU right now.
Also, inference on tensorflow-directml is only ~30% slower.
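For anyone who wants to reproduce numbers like these on their own card, here is a minimal timing sketch. It assumes "X% slower" means the average step takes (1 + X/100) times as long as the baseline; the thread does not say how the figures above were measured, so that reading is an assumption. `mean_seconds` and `percent_slower` are hypothetical helper names, not part of any library.

```python
import time


def mean_seconds(fn, repeats=5):
    """Average wall-clock seconds per call of fn(), after one warm-up call."""
    fn()  # warm-up, so one-time setup cost is not counted
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def percent_slower(candidate_s, baseline_s):
    """Extra time the candidate needs relative to the baseline, in percent."""
    return (candidate_s / baseline_s - 1.0) * 100.0


if __name__ == "__main__":
    # Placeholder workload; swap in one training or inference step per backend.
    t = mean_seconds(lambda: sum(range(100_000)))
    print(f"mean step time: {t:.6f} s")
    print(percent_slower(1.5, 1.0))  # a 1.5 s step vs a 1.0 s baseline
```

Time the same step on each backend with `mean_seconds` and pass both results to `percent_slower`.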
For DirectML, the RFC is still WIP; see the last messages in
/cc @penporn
Currently I still suggest using the mentioned ROCm wheel on AMD GPUs.
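Whichever build you end up installing, a quick sanity check is to ask TensorFlow which devices it can see. This is only a sketch: it assumes a TensorFlow build is already installed, and `visible_devices` is a hypothetical helper name. As far as I know, ROCm builds report the card as a `GPU` device and tensorflow-directml reports a `DML` device, but treat that as an assumption and check the output on your own machine.

```python
def visible_devices():
    """Return (name, device_type) pairs, or [] if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # TF 2.x exposes tf.config.list_physical_devices; TF 1.x builds
    # only have the experimental variant, so fall back to that.
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        lister = tf.config.experimental.list_physical_devices
    return [(d.name, d.device_type) for d in lister()]


if __name__ == "__main__":
    devices = visible_devices()
    if not devices:
        print("TensorFlow is not installed, or it reported no devices")
    for name, kind in devices:
        print(kind, name)
```

If only a CPU entry shows up, TensorFlow is not seeing the card and training will silently run on the CPU.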
I already use tf-directml in production.
DeepFaceLab has a DX12 build to train deepfakes on AMD GPUs. It all works like a charm.
TF 1.x is practically EOL, and tensorflow-directml is still waiting for:
LMAO
Who said that?
If you lack the skill to build a graph model, that is your problem, not a problem with TF 1.15.
TF 1.x is not actively supported, as it is EOL.
tensorflow-directml is still a fork of TF 1.x, so you can find third-party support there.
Or more simply Keras? Maybe I’m being nerdy, but where’s the “ease” of doing machine learning on Windows compared to Linux? TF, Keras, scikit-learn and PyTorch are primarily intended for Linux, and that OS has evolved a lot in ergonomics.
On Windows I still suggest using WSL2 once it is ready (hopefully quite soon):
https://github.com/RadeonOpenCompute/ROCm/issues/794#issuecomment-830767340
On native Windows, use DirectML once the RFC is finalized and we have TF 2.x support.
Today I don’t suggest starting a new project with TF 1.x.
ROCm is not supported on WSL2, and AMD has no plan to support the drivers on Windows.
Currently tf-dml is the best solution for training and inference on AMD cards on Windows.
@Remy_Wehrung couldn’t agree more! The power you get with “sudo” is like Thanos with infinity stones.
Same for TF 2.x.
For an experienced programmer, the ML framework does not matter.
In my experience, 95% of my time goes to datasets, research and the UI/app.
Models can be easily ported to any ML framework.