Are the methods available in tf.data.experimental.CsvDataset enough to perform the data wrangling and transformation usually done with libraries like SciPy, NumPy, and pandas? For example, I have been trying to write an equivalent of the pandas code below using dataset methods, but couldn’t find a suitable one. Looking forward to your reply.
sum_of_features = {}
for col in features:
    # sum_ofs is assumed to be a DataFrame containing the four key columns
    sum_of_features[col] = sum_ofs.groupby(["year", "month", "day", "second"])[col].sum()
Do you mean this page?
It refers to a “richer” API here:
And has a tutorial here:
Yes, I meant the tf.data.experimental.CsvDataset page (TensorFlow v2.16.1). Can those APIs be used to achieve the data wrangling code I posted above? I haven’t found the right API for it.
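For context, here is a minimal sketch of one way a grouped sum could be expressed with TensorFlow ops once the rows are in a dataset. It is not a built-in groupby on CsvDataset. The in-memory dataset stands in for parsed CSV rows, the column names and values are made up, and the approach assumes the data fits in a single batch.

```python
import tensorflow as tf

# Stand-in for rows parsed from a CSV file (hypothetical columns and values).
rows = tf.data.Dataset.from_tensor_slices({
    "year": [2024, 2024, 2024, 2025],
    "month": [1, 1, 2, 1],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Pull all rows into one batch; this only works if the data fits in memory.
batch = rows.batch(4).get_single_element()

# Fold the grouping columns into a single integer key (year * 100 + month).
key = batch["year"] * 100 + batch["month"]

# tf.unique returns the distinct keys and, for each row, the index of its key.
unique_keys, segment_ids = tf.unique(key)

# Sum the values of all rows that share the same key.
sums = tf.math.unsorted_segment_sum(
    batch["value"], segment_ids, num_segments=tf.size(unique_keys)
)

for k, s in zip(unique_keys.numpy(), sums.numpy()):
    print(k, s)
```

For data that does not fit in memory, this single-batch approach would not apply as written.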
Apparently TF has its own “pandas” called “TensorFlow Transform”. I haven’t used it.
You can do feature engineering (some folks call it wrangling, others say that wrangling is different) in both TensorFlow Transform and Keras Preprocessing Layers (KPL). Transform is the more scalable solution and is well integrated into TFX, so if your dataset is large or you need to do a lot of processing, it’s the better choice. KPL is, of course, more tightly integrated into Keras.
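As a small illustration of the Keras Preprocessing Layers route, the sketch below standardizes two numeric feature columns with a Normalization layer. The feature values are made up, and this shows only one kind of preprocessing, not the grouped sum from the original question.

```python
import numpy as np
import tensorflow as tf

# Hypothetical feature matrix: three rows, two numeric columns.
features = np.array(
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]], dtype="float32"
)

# The layer learns per-column mean and variance from the data via adapt().
norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(features)

# Applying the layer rescales each column to roughly zero mean, unit variance.
scaled = norm(features)
print(scaled.numpy())
```

Because the layer can be placed inside a Keras model, the same statistics are applied at training and inference time.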