I noticed that user_rating in the MovieLens dataset is not balanced, as attached below. I understand that it is just sample code to illustrate how it works.
So, in a real-world project, should I balance the dataset myself, or does the TensorFlow recommender deal with it behind the scenes?
Hi @Takashi_Futada, you have to make it a balanced dataset yourself by doing the necessary preprocessing. If you train on the imbalanced dataset, the model may be biased towards the class containing more samples.
If you want to train the model on an imbalanced dataset, you can use the class weights method, where you assign different weights to different classes during training. This causes the model to pay more attention to examples from an under-represented class.
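As a rough illustration of the class weights idea, here is a minimal sketch that computes inverse-frequency weights in plain Python. The `ratings` list is made-up data standing in for the rating column, and `balanced_class_weights` is a hypothetical helper name, not part of TensorFlow Recommenders. The formula used, n_samples / (n_classes * count), is one common choice and not the only one.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight}, with rarer classes getting larger weights.

    Uses weight = n_samples / (n_classes * class_count), so that each
    class contributes the same total weight to the loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up, imbalanced ratings (assumption: ratings are treated as classes).
ratings = [5, 5, 5, 5, 4, 4, 4, 3, 3, 1]

class_weights = balanced_class_weights(ratings)

# One weight per training example, looked up from its class.
sample_weights = [class_weights[r] for r in ratings]
```

In Keras, `Model.fit` accepts a `class_weight` dictionary for classification labels and a `sample_weight` argument for per-example weights. Whether either one fits your recommender depends on how the model and its loss are defined, so treat this as a starting point.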
Thank You.