Hello TensorFlow community,
While using the TFDS module, I got confused about how it manages memory. I have the following small code block:
import tensorflow_datasets as tfds

num_epochs = 10  # placeholder value
batch_size = 32  # placeholder value

train_ds = tfds.load("cifar10", split="train")
test_ds = tfds.load("cifar10", split="test")
train_ds = train_ds.repeat(num_epochs).shuffle(1024)
train_ds = train_ds.batch(batch_size, drop_remainder=True).prefetch(1)
# Iterate over the batches as NumPy arrays
for sample in tfds.as_numpy(train_ds):
    image, label = sample['image'], sample['label']
    print(image.shape)
As far as I know, when we call tfds.load(), it creates a builder, downloads the data, prepares it, and returns it as a tf.data.Dataset. What I am wondering is whether the samples (images and labels) are also loaded into RAM at the point where tfds.load() is called. If not, when are they loaded into RAM: during batching and prefetching, or during iteration?
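To make clear which steps I mean inside tfds.load(), here is a minimal sketch of how I understand them, written with the builder API (tfds.builder, download_and_prepare, as_dataset). The helper name load_via_builder is my own and purely illustrative, and I am assuming these three calls match what tfds.load() does for the default arguments:

```python
def load_via_builder(name="cifar10", split="train"):
    """Hypothetical helper spelling out the steps I believe tfds.load() performs."""
    # Imported inside the function so the sketch stays self-contained.
    import tensorflow_datasets as tfds

    # 1. Create the DatasetBuilder for the named dataset.
    builder = tfds.builder(name)
    # 2. Download the raw data and prepare it (skipped if already prepared).
    builder.download_and_prepare()
    # 3. Return the requested split as a tf.data.Dataset.
    return builder.as_dataset(split=split)


if __name__ == "__main__":
    train_ds = load_via_builder()
    # Show the structure of one element without iterating over the data.
    print(train_ds.element_spec)
```

My question is at which of these steps, or at which later step (batch, prefetch, iteration), the images actually end up in RAM.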