I have defined a map function that unpacks 6 32-bit integers into 192 (1/0) integers:
import numpy as np
import tensorflow as tf

def unpack(x, y):
    unpacked_data = tf.TensorArray(tf.uint32, size=0, dynamic_size=True)
    for b in x:
        for _ in range(32):
            unpacked_data = unpacked_data.write(unpacked_data.size(), b & 1)
            b = tf.bitwise.right_shift(b, 1)
    return x, unpacked_data.stack(), y
    # for training I would return unpacked_data.stack(), y
The function works when I call it directly on a single example:
a = np.array([0, 0, 0, 16, 45, 57], dtype=np.uint32)
b = unpack(a, a)
print(b)
This prints:
(array([ 0, 0, 0, 16, 45, 57], dtype=uint32), <tf.Tensor: shape=(192,), dtype=uint32, numpy=
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint32)>, array([ 0, 0, 0, 16, 45, 57], dtype=uint32))
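For cross-checking, here is a minimal NumPy-only sketch of the layout I expect: the 32 bits of each integer, least significant bit first, concatenated into one flat vector. The name unpack_reference is hypothetical and only used for this illustration; it is not part of the pipeline.

```python
import numpy as np

def unpack_reference(values):
    """Return the bits of each 32-bit integer, least significant bit first."""
    values = np.asarray(values, dtype=np.uint32)
    shifts = np.arange(32, dtype=np.uint32)
    # Broadcasting (6, 1) against (32,) gives one row of 32 bits per integer.
    bits = (values[:, None] >> shifts) & np.uint32(1)
    # Flatten row by row so the bits of each integer stay contiguous.
    return bits.reshape(-1)

a = np.array([0, 0, 0, 16, 45, 57], dtype=np.uint32)
bits = unpack_reference(a)
print(bits.shape)            # (192,)
print(np.flatnonzero(bits))  # [100 128 130 131 133 160 163 164 165]
```

The positions of the set bits agree with the tensor printed above (for example, 16 sets bit 4 of the fourth integer, i.e. index 3 * 32 + 4 = 100).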
Now I want to apply this map function to the features of a batched dataset:
with tf.device("CPU"):
    train = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(BATCH_SIZE)
    # for training I would use 4 * BATCH_SIZE
    validate = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(BATCH_SIZE)
x_train and y_train are NumPy arrays. However, when I apply the map function to the 'train' dataset:
train = train.map(unpack)
list(train.as_numpy_iterator())
The bit array comes out with the wrong shape. It appears transposed, 32x6 instead of 192x1:
[(array([[ 0, 0, 0, 16, 45, 57]], dtype=uint32),
array([[0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 1, 1],
[0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 1, 1],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]], dtype=uint32),
array([1.])),
Why?
Regards,
GW
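A possible explanation, sketched in plain NumPy rather than TensorFlow: after .batch(), the x passed to unpack has shape (batch, 6), so "for b in x" walks the batch axis and each b is a whole row of 6 integers. Every one of the 32 writes then stores a length-6 vector, and stacking them gives 32x6. The sketch assumes BATCH_SIZE = 1, which is what the (1, 6) feature array in the output suggests; the variable names are hypothetical.

```python
import numpy as np

# One batched element as the map function would see it, assuming BATCH_SIZE = 1.
batch = np.array([[0, 0, 0, 16, 45, 57]], dtype=np.uint32)  # shape (1, 6)

# Mimic the loop in unpack: the outer loop iterates over the batch axis.
rows = []
for b in batch:                        # b has shape (6,), not a scalar
    for _ in range(32):
        rows.append(b & np.uint32(1))  # each write is a length-6 vector
        b = b >> np.uint32(1)
looped = np.stack(rows)                # shape (32, 6), the "transposed" result

# A batch-aware alternative: shift every integer by 0..31 via broadcasting,
# then flatten the last two axes so each example yields 192 bits.
shifts = np.arange(32, dtype=np.uint32)
bits = (batch[..., None] >> shifts) & np.uint32(1)  # shape (1, 6, 32)
flat = bits.reshape(batch.shape[0], -1)             # shape (1, 192)

print(looped.shape, flat.shape)  # (32, 6) (1, 192)
```

If this reading is right, either mapping before batching or making the function batch-aware should give 192 bits per example. The same broadcasting idea should carry over to tf.bitwise.right_shift and tf.reshape, but that part is an assumption I have not verified here.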