Hi, we have a Scala application where we use the TensorFlow core platform library: we load TensorFlow graphs with SavedModelBundle and run some operations on them.
I know that any Tensor or TType is a resource that has to be closed, and we handle those with the try-with-resources paradigm in Scala, via a functional library and scala.util.Using. The part I am confused about is NdArrays: if I use an API like StdArrays.ndCopyOf to copy, say, from a TFloat32 to a FloatNdArray, is that FloatNdArray allocated in JVM memory and hence garbage collected? I know tensors normally have to be closed to release the underlying C-based tensor memory. Please advise.
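For context, here is roughly how we manage tensor lifetimes today (a minimal sketch; the model path and op names are placeholders, and Result handling may differ slightly across TF Java versions):

```scala
import org.tensorflow.SavedModelBundle
import org.tensorflow.ndarray.Shape
import org.tensorflow.types.TFloat32
import scala.util.Using

// Every native resource goes through Using so it is closed deterministically.
Using.Manager { use =>
  val bundle = use(SavedModelBundle.load("/path/to/saved_model", "serve"))
  val input  = use(TFloat32.tensorOf(Shape.of(1L, 4L)))
  val output = use(
    bundle.session().runner()
      .feed("serving_default_input", input)
      .fetch("StatefulPartitionedCall")
      .run()
      .get(0)
  )
  println(output.shape())
}
```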
thanks
Preet
Hi Preet,
Yes, NdArrays other than TF tensors are allocated in JVM memory or are backed by direct buffers that are tracked and released by the garbage collector. Hence, you are not required to protect them with try-with-resources blocks.
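For instance (a minimal sketch), something like this needs no explicit cleanup:

```scala
import org.tensorflow.ndarray.{NdArrays, Shape}

// A plain FloatNdArray lives on the JVM heap (or in a GC-tracked buffer):
val nd = NdArrays.ofFloats(Shape.of(2L, 2L))
nd.setFloat(1.0f, 0L, 0L)
// no close() needed; it is reclaimed like any other JVM object
```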
Karl
On the other hand, it would help if you could provide an example of how you are copying your arrays, because StdArrays.ndCopyOf is normally used to create an NdArray copy of a standard Java array, not of a tensor like TFloat32. NdArray.copyTo should be used for that. Is this what you meant initially?
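In other words, to get a JVM-side copy of a tensor's data, something like this (a minimal sketch with a made-up shape):

```scala
import org.tensorflow.ndarray.{FloatNdArray, NdArrays}
import org.tensorflow.ndarray.Shape
import org.tensorflow.types.TFloat32
import scala.util.Using

// Copy a tensor's content into a GC-managed NdArray that can safely
// outlive the closed tensor:
val jvmCopy: FloatNdArray = Using.resource(TFloat32.tensorOf(Shape.of(2L, 3L))) { tensor =>
  val dst = NdArrays.ofFloats(tensor.shape()) // heap-backed, no close() needed
  tensor.copyTo(dst)                          // a TFloat32 is itself a FloatNdArray
  dst
}
```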
Hi Karl, yes, I am using StdArrays.ndCopyOf to copy a Java array to an NdArray; I am not using the Tensor/TNumber API for it. However, in some cases I have to take, say, a Scala Array[Array[Float]] and convert it to a TFloat32 tensor, and I always end up doing it in two steps, like this (sketched below):
a) convert the primitive array to an NdArray using StdArrays.ndCopyOf;
b) then use the TFloat32.tensorOf overload that copies an NdArray into a Tensor. Is there a direct way to create a tensor from Java primitive arrays?
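Concretely, the two steps look like this (a minimal sketch):

```scala
import org.tensorflow.ndarray.StdArrays
import org.tensorflow.types.TFloat32
import scala.util.Using

val data: Array[Array[Float]] = Array(Array(1f, 2f), Array(3f, 4f))

// a) JVM-side copy: Array[Array[Float]] -> FloatNdArray (garbage collected)
val nd = StdArrays.ndCopyOf(data)

// b) native copy: FloatNdArray -> TFloat32 (must be closed)
Using.resource(TFloat32.tensorOf(nd)) { tensor =>
  // ... feed `tensor` to a session runner ...
}
```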
FYI, a Scala Array is the same thing as a Java array, so there is no extra hop there.
I am also curious how much of a performance hit an application takes copying 2D or 3D arrays back and forth to NdArray: is it always just a deep copy of the data, or do NdArrays use techniques to avoid that cost?
"is there a direct way to create a tensor from Java primitive arrays?"
I have to admit that it was a deliberate choice not to add such an endpoint. The reason is that manipulating [2,n]-dimensional Java arrays is inherently slow, as it involves a lot of dereferencing, so if performance matters they should be avoided. The idea is to store the data directly in an NdArray instead (the java-ndarray library does not depend on TensorFlow and can be used in any part of your project).
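For example (a minimal sketch), the producing code can fill a contiguous NdArray directly, with no nested Java arrays involved:

```scala
import org.tensorflow.ndarray.{NdArrays, Shape}

// Write values straight into a flat, contiguous NdArray:
val nd = NdArrays.ofFloats(Shape.of(2L, 3L))
for (i <- 0L until 2L; j <- 0L until 3L)
  nd.setFloat((i * 3 + j).toFloat, i, j)
```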
Now the good news: if you want to keep using Scala multidimensional arrays and avoid this extra copy, you can write this logic yourself by replicating the code of StdArrays.ndCopyOf and just replacing the NdArrays.ofFloats(shape) call with TFloat32.tensorOf(shape). All the methods involved are publicly accessible.
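Something along these lines (a minimal sketch; the helper name is made up, and it assumes the copyTo(float[][], FloatNdArray) overload of StdArrays):

```scala
import org.tensorflow.ndarray.{Shape, StdArrays}
import org.tensorflow.types.TFloat32

// Hypothetical helper: allocate the tensor first, then copy the Scala
// (i.e. Java) array straight into its native memory, skipping the
// intermediate FloatNdArray entirely.
def tensorOf2d(data: Array[Array[Float]]): TFloat32 = {
  val tensor = TFloat32.tensorOf(Shape.of(data.length.toLong, data.head.length.toLong))
  StdArrays.copyTo(data, tensor) // works because a TFloat32 is itself a FloatNdArray
  tensor                         // caller is responsible for closing it
}
```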
And to add to that: while your current process (first creating an NdArray from a standard array, then copying it to a tensor) may have a space complexity of O(2n), the extra time cost is minimal, as the content of the NdArray is basically copied to the tensor memory with a single memcpy (unless you use some fancy data type that needs conversion, but with floats you are fine).
Thanks Karl, this helps a lot. I will also see if we can avoid Java multidimensional arrays and just use NdArray directly wherever possible.