Eigen bfloat16 GEMM in TF

I am interested in getting Eigen's bfloat16 GEMM to work in TF, similar to float32/float64. On most architectures there are issues with the accumulators in bfloat16 GEMM, which cause rounding and over/underflow problems. But for PowerPC (Altivec/VSX), I have specialized code in Eigen that accumulates in float32, and on Power10 it uses hardware-accelerated instructions (MMA) for increased performance. Unfortunately, when I run the unit tests, the Eigen bfloat16 GEMM is never called. Could someone help me figure out how to tie this new bfloat16 GEMM into TF for PowerPC only?
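For context, a standalone Eigen bfloat16 matrix product looks roughly like this (a minimal sketch; it assumes a recent Eigen that ships the `Eigen::bfloat16` scalar type). This is the GEMM path I would like the TF kernels to reach on PowerPC:

```cpp
#include <iostream>
#include <Eigen/Core>
#include <Eigen/Dense>

int main() {
  using BF16Matrix =
      Eigen::Matrix<Eigen::bfloat16, Eigen::Dynamic, Eigen::Dynamic>;

  // Build two small bfloat16 matrices from random float data.
  BF16Matrix a = Eigen::MatrixXf::Random(64, 128).cast<Eigen::bfloat16>();
  BF16Matrix b = Eigen::MatrixXf::Random(128, 32).cast<Eigen::bfloat16>();

  // On PowerPC with the specialized Eigen kernels described above, this
  // product accumulates in float32 (and uses MMA on Power10); on other
  // platforms it falls back to the generic bfloat16 GEMM path.
  BF16Matrix c = a * b;

  std::cout << "c(0,0) = " << static_cast<float>(c(0, 0)) << "\n";
  return 0;
}
```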

Hi @Chip_Kerchner, at present TensorFlow binaries are built with AVX instructions. Since you are using MMA instructions, this is a feature-request type of issue. Please file a feature request on GitHub. Thank you.

The code shouldn't be specific to PowerPC (or MMA), though it would only be included for this platform in TensorFlow.

I thought someone might know how Eigen's matrix multiply routines are tied into TensorFlow, and whether that happens in a single place or several. With that knowledge, I can hopefully make the changes myself.
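My (possibly wrong) understanding is that TF's CPU MatMul kernels ultimately go through Eigen's tensor contraction machinery, so a call along these lines is what I want to end up in the bfloat16 GEMM. This is just an illustrative sketch with Eigen's public Tensor API, not actual TF kernel code:

```cpp
#include <iostream>
#include <unsupported/Eigen/CXX11/Tensor>

int main() {
  // 2-D tensors standing in for the MatMul inputs a CPU kernel would hand
  // to Eigen's contraction machinery.
  Eigen::Tensor<Eigen::bfloat16, 2> lhs(64, 128);
  Eigen::Tensor<Eigen::bfloat16, 2> rhs(128, 32);
  lhs.setConstant(Eigen::bfloat16(1.0f));
  rhs.setConstant(Eigen::bfloat16(0.5f));

  // Contract lhs dimension 1 with rhs dimension 0, i.e. a plain matmul.
  Eigen::array<Eigen::IndexPair<int>, 1> dims = {Eigen::IndexPair<int>(1, 0)};
  Eigen::Tensor<Eigen::bfloat16, 2> out = lhs.contract(rhs, dims);

  std::cout << "out(0,0) = " << static_cast<float>(out(0, 0)) << "\n";
  return 0;
}
```

If someone can confirm whether the bfloat16 MatMul kernel registration for CPU actually dispatches into this contraction path (or is routed elsewhere, e.g. cast to float32 first), that would tell me where to hook in the PowerPC code.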