Why does the TensorFlow documentation use the np.random.rand()
function to create a train/test split of the dataset? For example:
import numpy as np

def split_dataset(dataset, test_ratio=0.30):
  """Splits a pandas DataFrame in two."""
  test_indices = np.random.rand(len(dataset)) < test_ratio
  return dataset[~test_indices], dataset[test_indices]
The above code snippet is copied from Automated hyper-parameter tuning. We can't guarantee that this code will split exactly 30% of the dataset into test and the remaining 70% into training, since the process depends on random number generation without any initial seed. So why does the official documentation use NumPy random number generation to split the dataset?
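For contrast, here is a minimal sketch of a split that is both reproducible (seeded) and exact in size. The function name `split_dataset_exact` and the seed value are my own choices for illustration, not something from the TensorFlow docs:

```python
import numpy as np
import pandas as pd

def split_dataset_exact(dataset, test_ratio=0.30, seed=42):
    """Reproducible split with an exact test-set size.

    Unlike the docs' version, this seeds the generator and takes
    exactly int(len(dataset) * test_ratio) rows for the test set.
    """
    rng = np.random.default_rng(seed)        # seeded generator -> same split every run
    indices = rng.permutation(len(dataset))  # shuffled row positions
    n_test = int(len(dataset) * test_ratio)  # exact number of test rows
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return dataset.iloc[train_idx], dataset.iloc[test_idx]

# Usage: 10 rows with test_ratio=0.30 always gives 7 train / 3 test rows
df = pd.DataFrame({"x": range(10)})
train, test = split_dataset_exact(df)
print(len(train), len(test))  # 7 3
```

With the docs' version, each row lands in the test set independently with probability `test_ratio`, so the actual test fraction only *approximates* 30% and varies between runs; the sketch above trades that simplicity for determinism.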