The following bit_pack_arg function can be applied as a map function on a tf.data.Dataset:
import tensorflow as tf

def bit_pack_arg(*data):
    packed_data = []
    current_byte = 0
    bit_count = 0
    for bit in data[:-1]:
        #current_byte |= (bit << bit_count)
        current_byte |= tf.bitwise.left_shift(bit, bit_count)
        bit_count += 1
        if bit_count == 8:
            packed_data.append(current_byte)
            current_byte = 0
            bit_count = 0
    if bit_count > 0:
        packed_data.append(current_byte)
    #return np.array(packed_data, dtype=np.uint8)
    return(packed_data)
I first got the error:
TypeError: unsupported operand type(s) for <<: 'Tensor' and 'int'
on the '<<' operator. After replacing the '<<' operator with tf.bitwise.left_shift I got the error:
NotImplementedError: Cannot convert a symbolic tf.Tensor (or_7/BitwiseOr:0) to a numpy array. This error may indicate that you’re trying to pass a Tensor to a NumPy call, which is not supported.
I commented out the numpy.array call, and when bit_pack_arg is mapped over the dataset I now get the output:
[(0, 0, 48, 216, 47, 16, 0, 0, 0, 0, 0, 0, 52, 74, 164, 0, 0, 0, 0, 0, 0, 0, 0, 0),
Close, but it is not a nested array of numpy arrays yet. My guess is that the dataset passes symbolic tensors to the map function, and that the '+' operator works on symbolic tensors while '<<' and numpy.array do not.
So, are these bugs, or are there rules for what can be used inside a tf.data.Dataset map function? The np.array call was there to convert the packed bytes to np.uint8. How can I convert packed_data to uint8 without the np.array call?
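One possible way to do the conversion without NumPy is to stay inside TensorFlow ops: tf.stack joins the list of scalar tensors into a single tensor and tf.cast sets the dtype. Dataset.map traces the function into a graph, so the values it sees are symbolic and cannot be handed to np.array. The version below is only a sketch, assuming each dataset element is a tuple of scalar integer tensors holding 0 or 1, with the last entry skipped as in the original function. The int32 intermediate type and the example element are illustrative choices, not part of the original code.

```python
import tensorflow as tf


def bit_pack_arg(*data):
    """Pack 0/1 scalar tensors into bytes, least significant bit first."""
    packed_data = []
    current_byte = tf.constant(0, dtype=tf.int32)
    bit_count = 0  # plain Python int; the loop is unrolled while tracing
    for bit in data[:-1]:  # the last element is skipped, as in the original
        bit = tf.cast(bit, tf.int32)  # assumption: values are 0 or 1
        shift = tf.constant(bit_count, dtype=tf.int32)
        shifted = tf.bitwise.left_shift(bit, shift)
        current_byte = tf.bitwise.bitwise_or(current_byte, shifted)
        bit_count += 1
        if bit_count == 8:
            packed_data.append(current_byte)
            current_byte = tf.constant(0, dtype=tf.int32)
            bit_count = 0
    if bit_count > 0:
        packed_data.append(current_byte)
    # tf.stack turns the list of scalars into one 1-D tensor; tf.cast takes
    # the place of the np.array(..., dtype=np.uint8) conversion.
    return tf.cast(tf.stack(packed_data), tf.uint8)


# Hypothetical example element: nine bits followed by one trailing value.
example = (1, 0, 1, 1, 0, 0, 0, 0, 1, 7)
dataset = tf.data.Dataset.from_tensors(example).map(bit_pack_arg)
packed = next(iter(dataset))
print(packed.numpy())
```

With this example element the first eight bits pack to 13 and the ninth bit to 1, and each mapped element comes back as a single uint8 tensor rather than a tuple of separate values.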
Thanks,
GW