Use AMD GPU for model training

Rohan_Raj · June 16, 2021, 2:53am

Hello folks,
I have an AMD GPU in my system and I want to use it for model training. After several attempts I have not been able to do so.
Since CUDA is only supported on NVIDIA GPUs, it doesn’t work for me.
Any suggestions?
Thank you.

Remy_Wehrung · June 16, 2021, 4:23am

CUDA is Nvidia's proprietary technology; for AMD GPUs you have OpenCL.

Rohan_Raj · June 16, 2021, 4:49am

@Remy_Wehrung
Is OpenGL supported with TensorFlow? If I sit for the TensorFlow certification exam, would it cause any issue?

Remy_Wehrung · June 16, 2021, 5:05am

You are confusing OpenGL, the 3D display technology created by the late Silicon Graphics, with OpenCL, an open standard for general-purpose parallel computing. The two have nothing to do with each other; see the sites that cover OpenCL.

Rohan_Raj · June 16, 2021, 5:20am

@Remy_Wehrung sorry, there was a typo with OpenGL.
Can OpenCL be used while taking the TensorFlow certification exam?

Remy_Wehrung · June 16, 2021, 5:42am

No, for this certification you need to know the TensorFlow API, Python basics, and Linux administration basics; that’s all. OpenCL is only useful if you are familiar with C++, and it is irrelevant for the certification.

Bhack · June 16, 2021, 9:12am

You need to use TF AMD ROCm
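
With a ROCm build installed (for example the `tensorflow-rocm` wheel from PyPI, on a Linux system with a supported AMD GPU and the ROCm drivers), a quick way to confirm that TensorFlow sees the card is to list its physical GPU devices. This is a minimal sketch, assuming a TF 2.x build; `visible_gpus` is a hypothetical helper name, and it returns an empty list when TensorFlow is not installed or no GPU is detected.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see (empty list if none)."""
    try:
        import tensorflow as tf  # assumption: a TF 2.x build such as tensorflow-rocm
    except ImportError:
        return []  # TensorFlow is not installed in this environment
    # list_physical_devices reports devices without allocating memory on them
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

An empty result usually points to a driver or installation problem rather than to the model code.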

Deepfakescovery · June 18, 2021, 7:28am

tensorflow-directml for Windows:

github / microsoft / tensorflow-directml (cannot include links)

It can be used with any DirectX 12 compatible video card.

In my tests on a 2080 Ti it is only ~68% slower than the CUDA version.

Remy_Wehrung · June 18, 2021, 7:57am

DirectX has nothing to do with ML, so it does not surprise me that the Microsoft fork (it is a fork, by the way?) is slower. We are talking about libraries written in Python, which is universal as a language.

Deepfakescovery · June 18, 2021, 9:05am

DirectML is young tech. GPU vendors should optimize their drivers for DirectML.
But at least you can run TF graph models on an AMD GPU right now.
Also, inference on tensorflow-directml is only ~30% slower.
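
Figures such as "~68% slower" or "~30% slower" are easier to compare when the measurement is spelled out. The sketch below is one hedged way to do that, not the method used for the numbers above: `median_seconds` and `percent_slower` are hypothetical helpers, and "X% slower" is assumed to mean that the candidate takes (1 + X/100) times as long as the baseline.

```python
import statistics
import time


def median_seconds(fn, repeats=5, warmup=1):
    """Median wall-clock time of fn() over several runs, after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up: first calls often include one-off setup cost
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def percent_slower(baseline_seconds, candidate_seconds):
    """How much longer the candidate takes, as a percentage of the baseline."""
    return (candidate_seconds - baseline_seconds) / baseline_seconds * 100.0


if __name__ == "__main__":
    # Placeholder workloads; swap in one training or inference step per backend.
    baseline = median_seconds(lambda: sum(range(100_000)))
    candidate = median_seconds(lambda: sum(range(200_000)))
    print(f"candidate is {percent_slower(baseline, candidate):.0f}% slower")
```

When timing GPU work, the timed function should block until the result is actually available, otherwise only the time to enqueue the work is measured.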

Bhack · June 18, 2021, 3:30pm

For DirectML the RFC is still a work in progress; see the last messages in:

github.com/tensorflow/community

RFC: TensorFlow on DirectML

tensorflow:master ← wchao1115:master, opened 06:24AM - 12 May 20 UTC by wchao1115 (+116 -0)

This RFC will be open for comment until Monday, May 25th, 2020.

| Status | Proposed |
| :-------------- | :---------------------------------------------------- |
| **RFC #** | [243](https://github.com/tensorflow/community/pull/243) |
| **Author(s)** | Chai Chaoweeraprasit (wchao@microsoft.com), Justin Stoecker (justoeck@microsoft.com), Adrian Tsai (adtsai@microsoft.com), Patrice Vignola (pavignol@microsoft.com) |
| **Sponsor** | Penporn Koanantakool (penporn@google.com) |
| **Updated** | 2020-06-08 |

## Objective

Implement a new TensorFlow device type and a new set of kernels based on [DirectML](https://docs.microsoft.com/en-us/windows/win32/direct3d12/dml-intro), a hardware-accelerated machine learning library on the DirectX 12 Compute platform. This change broadens the reach of TensorFlow beyond its existing GPU footprint and enables high-performance training and inferencing on Windows devices with any DirectX12-capable GPU.

/Cc @penporn

Currently I still suggest using the ROCm wheel mentioned above on AMD GPUs.

Deepfakescovery · June 18, 2021, 5:44pm

I already use tf-directml in production.
DeepFaceLab has a DX12 build for training deepfakes on AMD GPUs. It all works like a charm.

Bhack · June 18, 2021, 5:48pm

TF 1.x is practically EOL and TF directml is still waiting for:

https://github.com/microsoft/tensorflow-directml/issues/107

Deepfakescovery · June 18, 2021, 6:48pm

LMAO
Who said that?
If you lack the skill to build a graph model, that is your problem, not a problem with TF 1.15.

Bhack · June 18, 2021, 9:47pm

TF 1.x is not actively supported in TF because it is EOL.
TensorFlow DirectML is still a fork of TF 1.x, so you can find third-party support there.
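
Because the DirectML fork discussed here is based on TF 1.x, it can help to check which major version an environment actually has before choosing an API style. This is a small sketch, assuming only that the version string follows the usual `major.minor.patch` pattern; `tf_major_version` and `installed_tf_version` are hypothetical helper names.

```python
def tf_major_version(version):
    """Extract the major version number from a string such as '1.15.5'."""
    return int(version.split(".")[0])


def installed_tf_version():
    """Return the installed TensorFlow version string, or None if absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


if __name__ == "__main__":
    version = installed_tf_version()
    if version is None:
        print("TensorFlow is not installed")
    elif tf_major_version(version) < 2:
        print(f"TF {version}: 1.x API (sessions and graphs)")
    else:
        print(f"TF {version}: 2.x API (eager execution by default)")
```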

Remy_Wehrung · June 19, 2021, 1:50am

Or more simply Keras? Maybe I’m being nerdy, but where is the “ease” of doing machine learning on Windows compared to Linux? TF, Keras, scikit-learn, and PyTorch are primarily intended for Linux, and that OS has come a long way in ergonomics.

Bhack · June 19, 2021, 1:58am

On Windows I still suggest using WSL2 once it is ready (hopefully quite soon):

https://github.com/RadeonOpenCompute/ROCm/issues/794#issuecomment-830767340

On native Windows, use DirectML once the RFC is finalized and we have TF 2.x support.

Today I don’t suggest starting a new project with TF 1.x.

Deepfakescovery · June 19, 2021, 3:08am

ROCm is not supported on WSL2, and AMD has no plan to support drivers on Windows.

Currently tf-dml is the best solution for training and inference on AMD cards on Windows.

Rohan_Raj · June 19, 2021, 3:09am

@Remy_Wehrung couldn’t agree more! The power you get with “sudo” is like Thanos with infinity stones.

Deepfakescovery · June 19, 2021, 3:13am

same for TF 2.x
For an experienced programmer, the ML framework does not matter.
In my experience, I spend 95% of my time on datasets, research, and the UI/app.
Models can be easily ported to any ML framework.
