Is there any method to pad the dataset in a TFX pipeline during data ingestion, or at any other stage in the pipeline? I have a feature in the dataset with varying shapes, so it is treated as a VarLenFeature (sparse tensors). When I feed this feature into a custom model, converting it from sparse to dense consumes a lot of memory during training.
This sounds like feature engineering, which should be done in the Transform component so that the same padding is applied at both training and serving time. Is there any difference between the raw data in your training dataset and the raw data that you expect to receive when serving the model?
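To illustrate the idea independent of TFX: padding variable-length sequences to a fixed length is the transformation you would express inside a Transform `preprocessing_fn` (tensorflow_transform provides helpers such as `tft.sparse_tensor_to_dense_with_shape` for densifying a sparse tensor to a known shape). The sketch below is a plain-Python version of that padding logic, just to show the semantics; the function name, `max_len`, and `pad_value` parameters are illustrative, not part of any TFX API.

```python
def pad_sequences(sequences, max_len=None, pad_value=0):
    """Pad variable-length sequences to a fixed length.

    Sequences longer than max_len are truncated; shorter ones are
    padded on the right with pad_value. If max_len is None, the
    length of the longest sequence is used.
    """
    if max_len is None:
        max_len = max(len(s) for s in sequences)
    return [
        list(s[:max_len]) + [pad_value] * (max_len - len(s[:max_len]))
        for s in sequences
    ]

# Example: ragged input becomes a rectangular batch.
padded = pad_sequences([[1, 2], [3], [4, 5, 6, 7]], max_len=3)
# padded == [[1, 2, 0], [3, 0, 0], [4, 5, 6]]
```

Doing this in Transform (rather than in the model's input pipeline) keeps training and serving consistent, and lets the trainer read fixed-shape dense tensors instead of densifying sparse tensors on the fly.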