Hi all,
I hope this is the right place to give feedback on something that is amiss in the TensorFlow documentation.
The current documentation of matmul explicitly says that
“the inner 2 dimensions specify valid matrix multiplication dimensions, and any further outer dimensions specify matching batch size”
This is no longer true: matmul has supported broadcasting since 2019. See some discussion of the feature and of when the capability was added, that is, in release 1.12.1.
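To illustrate what I mean, here is a minimal sketch, assuming a recent TensorFlow 2.x install. The shapes are made up for the example: a batch of two 3x4 matrices is multiplied by a single 4x5 matrix that has no batch dimension at all, and the result is compared against the explicit tiling one would write if the documentation were taken literally.

```python
import tensorflow as tf

# A batch of 2 matrices, each 3x4.
a = tf.random.normal([2, 3, 4])
# A single 4x5 matrix with no batch dimension.
b = tf.random.normal([4, 5])

# The batch dimensions do not match, but b is broadcast across the batch of a.
c = tf.matmul(a, b)
print(c.shape)  # (2, 3, 5)

# The boilerplate that broadcasting makes unnecessary:
# add a batch axis to b and tile it to match a before multiplying.
b_tiled = tf.tile(b[tf.newaxis, ...], [2, 1, 1])
c_manual = tf.matmul(a, b_tiled)

# Both approaches should agree up to floating-point error.
max_diff = float(tf.reduce_max(tf.abs(c - c_manual)))
print(max_diff)
```

If the documentation described this behavior, the tiling step in the second half would not be something people feel they have to write.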
I believe many people have lost a lot of time “writing a bunch of boilerplate reshaping code” because this is not widely known. I hope someone will update the documentation and save a lot of people time from now on. Thanks!