I have been using tfjs in the browser with the WebGPU backend to train my model and have had some success. However, I am trying to increase the size of my model, which has understandably caused the training time to increase dramatically.
To address this I am attempting to move training over to node (specifically the tfjs-node-gpu backend), and my initial results have been confusing. It appears to be much slower than what I was doing before in the browser.
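For context, the node side is set up roughly like this (a simplified sketch; the model, data, and hyperparameters are placeholders for my real ones):

```js
// Load tfjs with the CUDA-enabled native backend.
const tf = require('@tensorflow/tfjs-node-gpu');

async function train(model, xs, ys) {
  await model.fit(xs, ys, {
    epochs: 10,
    batchSize: 64,
    callbacks: {
      onEpochEnd: (epoch, logs) => console.log(`epoch ${epoch}: loss=${logs.loss}`),
    },
  });
}
```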
I assumed the tfjs-node-gpu backend, which uses CUDA, would be faster than WebGPU in the browser. Is this a correct assumption?
It looks like WebGPU can be used in node as well. Should this be faster than the tfjs-node-gpu backend?
Is it possible that tfjs-node-gpu is not properly using my GPU and is instead running everything on my CPU? Watching Task Manager, the GPU does not appear to be stressed nearly as much as it was during browser training.
How do I verify that the tfjs-node-gpu backend is using my GPU?
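The only check I have tried so far is printing the active backend and timing a large matmul, something like the sketch below. My assumption is that a big speed difference on the matmul would indicate whether the GPU is actually doing the work, but I am not sure this is the right way to verify it:

```js
const tf = require('@tensorflow/tfjs-node-gpu');

async function check() {
  // Note: with both tfjs-node and tfjs-node-gpu the backend name is
  // 'tensorflow', so this alone does not prove the GPU is in use.
  console.log('backend:', tf.getBackend());

  // Crude throughput test: a large matmul should be much faster on GPU.
  const a = tf.randomNormal([4096, 4096]);
  const b = tf.randomNormal([4096, 4096]);
  const t0 = Date.now();
  const c = tf.matMul(a, b);
  await c.data(); // force execution to complete before stopping the clock
  console.log('matmul took', Date.now() - t0, 'ms');
  tf.dispose([a, b, c]);
}

check();
```

I believe the native TensorFlow library normally logs the GPUs it finds at startup (something like "Created TensorFlow device ... GPU:0"), but I am not sure what output to expect from tfjs-node-gpu specifically.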
Are there any settings I need to set to ensure GPU usage is prioritized in node?
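The only knobs I have come across so far are environment variables for the underlying native TensorFlow runtime, set before the library loads. I am not sure which of these tfjs-node-gpu actually honors:

```js
// Assumed to be read by the native TensorFlow library when it is loaded,
// so they must be set before require().
process.env.CUDA_VISIBLE_DEVICES = '0';         // pin to the first GPU
process.env.TF_FORCE_GPU_ALLOW_GROWTH = 'true'; // allocate GPU memory on demand
process.env.TF_CPP_MIN_LOG_LEVEL = '0';         // keep device-placement logs visible

const tf = require('@tensorflow/tfjs-node-gpu');
```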
If I have newer versions of the CUDA toolkit or cuDNN SDK installed than are recommended, could that be causing performance issues without generating an error?