Hi, all.
I have a question about TFDV (TensorFlow Data Validation) and TFX.
Let me explain my case first. I want to pass custom generators and calculate custom statistics without re-writing StatisticsGen (tfx.v1.components.StatisticsGen | TFX | TensorFlow).
I checked that StatisticsGen accepts a stats_options parameter of type Optional[tfdv.StatsOptions], and that tfdv.StatsOptions accepts custom generators via its generators parameter (tfdv.StatsOptions | TFX | TensorFlow).
However, as the links below show, the generators parameter never reaches StatisticsGen’s executor, because StatisticsGen cannot serialize the custom generators held in tfdv.StatsOptions.
- StatisticsGen serializes StatsOptions via the StatsOptions.to_json method: tfx/tfx/components/statistics_gen/component.py at 17991a0429de5126400ee87ca6dd633b3da5a68c · tensorflow/tfx · GitHub
- to_json method of the StatsOptions class: data-validation/tensorflow_data_validation/statistics/stats_options.py at 597e2fda54d8bd30e76dc60dc58487d1b8780b2f · tensorflow/data-validation · GitHub (custom generators are skipped)
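To make the problem concrete, here is a minimal sketch of what I believe happens during the JSON round trip. ToyStatsOptions and MyCustomGenerator are hypothetical stand-ins I wrote for illustration, not the real tfdv classes, and from_json is included only to show the round trip. The only behaviour modelled is that to_json drops the generators before encoding.

```python
import copy
import json


class ToyStatsOptions:
    """Hypothetical stand-in for tfdv.StatsOptions; not the real class."""

    def __init__(self, generators=None, num_top_values=20):
        self._generators = generators
        self.num_top_values = num_top_values

    def to_json(self):
        # Assumption: generator objects are not JSON-serializable, so they
        # are cleared on a shallow copy of the attributes before encoding.
        options = copy.copy(self.__dict__)
        options["_generators"] = None
        return json.dumps(options)

    @classmethod
    def from_json(cls, payload):
        # Rebuild the options from the JSON payload alone.
        options = cls()
        options.__dict__ = json.loads(payload)
        return options


class MyCustomGenerator:
    """Placeholder for a user-defined statistics generator."""


original = ToyStatsOptions(generators=[MyCustomGenerator()], num_top_values=5)
restored = ToyStatsOptions.from_json(original.to_json())

print(len(original._generators))  # the component-side object still has it
print(restored._generators)       # the executor-side object has lost it
print(restored.num_top_values)    # plain values survive the round trip
```

So the object I build when constructing the component still holds the generator, but whatever is rebuilt from the JSON on the executor side does not.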
In this situation, how can I pass custom generators to StatisticsGen and run with them?
Thank you