Hello, I recently switched to a local machine with an RTX 3090 so that I can run some experiments locally. Initially, I followed this video to install TensorFlow with GPU support on my machine: https://www.youtube.com/watch?v=GFTlcKzhmoE.
As far as model training is concerned, TensorFlow 2.10 works just fine and I was able to train my model on my GPU. I intend to deploy the model to a Raspberry Pi with a Coral USB accelerator, which requires quantizing the model to int8. So, I tried installing the latest version of the optimization toolkit with
pip install tensorflow-model-optimization
The installation completed without any apparent errors. However, whenever I try to run any kind of optimization, e.g. QAT or PQAT, I get the following errors:
(0) UNIMPLEMENTED: Determinism is not yet supported in GPU implementation of FakeQuantWithMinMaxVarsGradient.
[[{{node gradient_tape/model_1/quant_output/MovingAvgQuantize/FakeQuantWithMinMaxVarsGradient}}]]
(1) CANCELLED: Function was cancelled before it was started
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_36548]
I then followed a WSL2 installation tutorial for TensorFlow 2.13, which also worked with my GPU, but the error remained the same when running tensorflow-model-optimization on the GPU. I also tried installing older versions of tfmot and still got the same errors. This is strange, because a year ago I was able to quantize the model on Google Colab, but I cannot do it on my local GPU.
This is a list of layers that my model has:
block1_conv1 - Convolutional layer
block1_conv2 - Convolutional layer
block1_pool - Pooling layer (no weights)
block2_conv1 - Convolutional layer
block2_conv2 - Convolutional layer
block2_pool - Pooling layer (no weights)
block3_conv1 - Convolutional layer
block3_conv2 - Convolutional layer
block3_conv3 - Convolutional layer
block3_pool - Pooling layer (no weights)
block4_conv1 - Convolutional layer
block4_conv2 - Convolutional layer
block4_conv3 - Convolutional layer
block4_pool - Pooling layer (no weights)
block5_conv1 - Convolutional layer
block5_conv2 - Convolutional layer
block5_conv3 - Convolutional layer
dense - Fully connected layer
flatten_2 - Flattening layer (no weights)
input_3 - Input layer (no weights)
output - Output layer
None of these layers appears to be incompatible, from the looks of it.
My package versions are:
TensorFlow version: 2.10.0 (with GPU support)
TensorFlow Model Optimization: 0.7.5
CUDA version: 11.2 (from cudart64_112)
cuDNN version: 8 (from cudnn64_8)
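The CUDA/cuDNN numbers above came from the DLL names; a cleaner way to read them, in case I misread something, is TensorFlow's own build info (run in the same environment):

```python
import tensorflow as tf

# TensorFlow version and the CUDA/cuDNN versions this build was compiled against
print(tf.__version__)
build = tf.sysconfig.get_build_info()
print(build.get("cuda_version"), build.get("cudnn_version"))

# Confirm the RTX 3090 is actually visible to TensorFlow
print(tf.config.list_physical_devices("GPU"))
```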