Why should I set 'steps_per_epoch' manually?

Hi there.
I'm new here.
I'm using TF2 to train an LSTM model to predict stock prices.
It sounds crazy, but I just want to give it a try.

I have a basic question about the 'steps_per_epoch' parameter of model.fit().
As we know, if we repeat the dataset using the repeat() function,
then we must set 'steps_per_epoch' to some value.
So here is the question.
I give the training dataset to TF,
I give the batch_size to TF,
and yet I still have to set 'steps_per_epoch' manually myself.
That seems unreasonable.
Since TF2 knows len(dataset_train) and batch_size,
why can't TF2 calculate 'steps_per_epoch' as len(dataset_train) / batch_size
and set it automatically by itself?
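For reference, here is the calculation I would expect TF2 to do internally. This is just a rough sketch with made-up numbers (Dataset.range(1000) and batch_size = 64 stand in for my real data):

import math
import tensorflow as tf

dataset_train = tf.data.Dataset.range(1000)   # stand-in for my real dataset
batch_size = 64                               # example value, not my real config

num_samples = len(dataset_train)              # works when cardinality is known and finite

# With drop_remainder=True the last partial batch is discarded,
# so the step count is the floor of samples / batch_size:
steps_per_epoch = num_samples // batch_size   # 15
# Without drop_remainder, Keras would round up instead:
# steps_per_epoch = math.ceil(num_samples / batch_size)   # 16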

My actual code is below.


# tf.data transformations return new datasets, so the result must be assigned back
self.dataset_train = self.dataset_train.cache().shuffle(buffer_size).batch(self.batch_size, drop_remainder=True).repeat()

self.steps_per_epoch = 200
self.hist = self.model.fit(self.dataset_train,
                           epochs=self.epoch,
                           steps_per_epoch=self.steps_per_epoch,
                           validation_data=self.dataset_test,
                           validation_steps=self.validation_steps,
                           verbose=self.verbose,
                           callbacks=tensor_board)

Hi @blackdove0430. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined.

If x is a tf.data dataset, and steps_per_epoch is None, the epoch will run until the input dataset is exhausted.

When passing an infinitely repeating dataset, you must specify the steps_per_epoch argument.
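A minimal sketch of both cases (the 100-sample dataset and batch size are made up for illustration; the fit() calls are commented out because no model is built here):

import tensorflow as tf

# A finite dataset of 100 samples, batched by 10:
finite_ds = tf.data.Dataset.range(100).batch(10)
# steps_per_epoch may stay None here; Keras runs 10 steps and
# stops when the dataset is exhausted.

# The same dataset with repeat() never ends, so Keras cannot
# know where one epoch should stop:
infinite_ds = finite_ds.repeat()
# model.fit(infinite_ds, epochs=5)                      # error: infinite dataset, steps unknown
# model.fit(infinite_ds, epochs=5, steps_per_epoch=10)  # OK: 10 batches per epoch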

Thank You.

Hi @Kiran_Sai_Ramineni

Thank you very much.
I think I understand now.

When using the repeat() function, the dataset repeats forever (it will never be exhausted),
so we must set 'steps_per_epoch' to some value to end each epoch.
On the other hand, if we do not use the repeat() function,
TF2 will calculate 'steps_per_epoch' as len(dataset_train) / batch_size by itself.
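If I understand correctly, TF2 can tell the two cases apart from the dataset's cardinality. A quick check (a small sketch using the cardinality() API; the toy dataset is made up):

import tensorflow as tf

ds = tf.data.Dataset.range(100).batch(10)
print(ds.cardinality().numpy())   # 10 -> steps_per_epoch can be inferred

ds_repeated = ds.repeat()
print(ds_repeated.cardinality().numpy() == tf.data.INFINITE_CARDINALITY)   # True -> must set it manually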

Here comes the next question.
In order to get a better training result, what kind of appropriate number should be set to ‘steps_per_epoch’?
For example, with num_1 = len(dataset_train) / batch_size:
a = num_1
b = num_1 * 2
c = num_1 * 3
d = num_1 * 10

Which of a, b, c, d will be the best value?
I will run some tests to see what happens.
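Something like the following is what I have in mind for the test. This is only a hypothetical sketch: build_model(), dataset_train_repeated, and the other variables stand for my own setup and are not defined here:

num_1 = len(dataset_train) // batch_size

for multiplier in (1, 2, 3, 10):
    steps = num_1 * multiplier
    model = build_model()   # hypothetical helper returning a freshly initialized model
    hist = model.fit(dataset_train_repeated,
                     epochs=epochs,
                     steps_per_epoch=steps,
                     validation_data=dataset_test,
                     validation_steps=validation_steps)
    print(f"steps_per_epoch={steps}: final val_loss={hist.history['val_loss'][-1]:.4f}")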

len(dataset_train) / batch_size should be good for steps_per_epoch. Thank You.

Thank you so much for your answer.

Yes, you are correct.

So you know what the repeat() operation does.

Now let's say you have fixed-length data with no repeat(). You define a batch size of 10, and your entire dataset has 100 entries. Each epoch will then consume the whole dataset of 100 entries, and since the batch size is 10, it will do so in 10 steps; that is what steps_per_epoch means.

Now let's say your batch size is 32 and you use self.steps_per_epoch = 200. That means a total of 32 * 200 = 6400 rows/entries from the repeated dataset will be used to train the model in every epoch.
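The same arithmetic in code, just to make both cases concrete (a tiny sketch reusing the numbers above):

# Case 1: finite dataset, no repeat()
num_entries = 100
batch_size = 10
steps_inferred = num_entries // batch_size              # 10 steps per epoch, inferred by Keras

# Case 2: infinitely repeated dataset, steps chosen manually
batch_size_repeated = 32
steps_per_epoch = 200
rows_per_epoch = batch_size_repeated * steps_per_epoch  # 6400 entries consumed per epoch
print(steps_inferred, rows_per_epoch)                   # 10 6400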