At SIG JVM, we have just decided to stop supporting and building native TensorFlow MKL-enabled artifacts for the following reasons:
With pretty much every new release of TensorFlow, the MKL build breaks on one platform or another, and it takes some gymnastics on our side to get it working again (when we manage to at all).
We have not investigated the reasons in depth, but performance with MKL was often many times worse than without it.
That being said, if anyone here has insights to share about the actual status of MKL in TensorFlow, and/or ideas on how we could keep supporting it without this trouble, that would be greatly appreciated.
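For context, the MKL-enabled artifacts in question are built roughly like this (a sketch; the exact flags and targets vary by TensorFlow version and platform):

```shell
# Historical flag for an Intel MKL/oneDNN-enabled build on Linux x86_64;
# newer TensorFlow versions route this through oneDNN.
bazel build --config=opt --config=mkl //tensorflow/tools/lib_package:libtensorflow

# In stock TensorFlow 2.5+ (no special build), oneDNN optimizations can
# instead be toggled at runtime via an environment variable:
export TF_ENABLE_ONEDNN_OPTS=1
```

Note that the runtime toggle in stock builds may make a separate MKL build unnecessary on recent TensorFlow versions, which is part of what makes the maintenance cost hard to justify.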
In my benchmark (training an NER model), an Intel Cascade Lake CPU with MKL was close to, and sometimes faster than, a GPU (since it uses system memory, it could handle a larger batch size).
That being said, I have never tested inference. But training was much faster than with a plain (non-MKL) CPU build on newer CPU architectures.