Hi,
I have an issue with user/item pair training data when training a TensorFlow retrieval model with TensorFlow Recommenders (TFRS).
I have the following two datasets:
- Pairs of user/item IDs over a specific period. Each pair represents an item that the user viewed at a point in time.
- A complete list of items for that same period.
The problem is that during this period some items were added and others were removed at various times. This means that, depending on the time, users could only view a subset of the items rather than all of them.
For example, let's say you have the following user and item activity over a period of time. For this example, assume that item_A, item_B and item_C are similar types of product.
Jan 1 - item_A - listed
Jan 10 - item_B - listed
Jan 10 - user_1 views item_A and item_B
Jan 11 - item_C - listed
Jan 13 - item_A - removed
Jan 14 - user_2 views item_B and item_C
In the example above, item_A was not available when user_2 viewed item_B and item_C, yet training still counts item_A as a negative sample for user_2, which is not correct.
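To make the problem concrete, here is a minimal plain-Python sketch of the example (no TFRS calls). The names `availability`, `available_items` and `negatives_per_user` are made up for illustration, the year is an arbitrary assumption, and I am assuming an item stops being viewable on the day it is removed. It compares the negatives you get when every item in the catalogue is a candidate with the negatives you get when only items listed at view time are candidates.

```python
from datetime import date

# Listing windows from the example: (listed_on, removed_on or None if still listed).
# The year 2023 is an arbitrary assumption; the example only gives month and day.
availability = {
    "item_A": (date(2023, 1, 1), date(2023, 1, 13)),
    "item_B": (date(2023, 1, 10), None),
    "item_C": (date(2023, 1, 11), None),
}

# (user, item, view_date) pairs from the example.
views = [
    ("user_1", "item_A", date(2023, 1, 10)),
    ("user_1", "item_B", date(2023, 1, 10)),
    ("user_2", "item_B", date(2023, 1, 14)),
    ("user_2", "item_C", date(2023, 1, 14)),
]


def available_items(on_day):
    """Return the set of items that were listed on the given day."""
    return {
        item
        for item, (listed, removed) in availability.items()
        if listed <= on_day and (removed is None or on_day < removed)
    }


def negatives_per_user(views, time_aware):
    """Collect candidate negatives per user.

    time_aware=False: every catalogue item the user did not view is a negative.
    time_aware=True: only items listed at view time can be negatives.
    """
    all_items = set(availability)
    viewed = {}
    for user, item, _ in views:
        viewed.setdefault(user, set()).add(item)

    negatives = {}
    for user, _, day in views:
        candidates = available_items(day) if time_aware else all_items
        negatives.setdefault(user, set()).update(candidates - viewed[user])
    return negatives


naive = negatives_per_user(views, time_aware=False)
filtered = negatives_per_user(views, time_aware=True)
print("naive:", naive)
print("time-aware:", filtered)
```

With the whole catalogue as candidates, user_2 gets item_A as a negative and user_1 gets item_C, even though neither item was listed when those users were browsing. With the time-aware filter, both sets are empty. What I don't know is how to apply this kind of filter when training the retrieval model.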
Is there any particular way to handle these types of issues when creating training data?