I am trying to set up a Gradient Boosted (GB) classifier on TFX for the first time. Everything runs smoothly with the interactive context; however, when I try to run it on Kubeflow on GCP, I am having an issue with the Transform component. Here are a few questions which could hopefully solve the problems I am encountering:
What is the recommended environment and/or VM to set up TFDF with TFX?
When creating the pipeline on Kubeflow (with the CLI: tfx pipeline create […]), wheels are created for the Transform and Trainer components in a /tmp folder (named something like tfx_user_code_Transform-[…]). What purpose do those wheels serve, and how do I indicate where to store/retrieve them? I don't know exactly why, and my setup might be wrong, but the Transform component, when run on Kubeflow, looks for those wheels in the wrong location (gs://[…]/[…]/_wheels/[…]); see the runner sketch after these questions. More context is given here.
Is there any code snippet or reference architecture we could follow to set up a TFDF model on TFX, or an estimator.GradientBoostedClassifier, which I can't seem to set up properly? A rough sketch of what I have been trying is included below.
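Regarding the wheel location in question 2: my understanding is that the _wheels path is derived from the pipeline root, so pointing pipeline_root at a GCS path the cluster can read may be what matters. A minimal sketch of what my runner setup looks like, assuming the pre-uplift kubeflow_dag_runner API; the bucket name and component list are placeholders:

```python
import os

from tfx.orchestration import pipeline
from tfx.orchestration.kubeflow import kubeflow_dag_runner

# Placeholder bucket: the packaged user code (tfx_user_code_Transform-*.whl)
# ends up under <pipeline_root>/_wheels/ as far as I can tell.
PIPELINE_NAME = 'tfdf-demo'
PIPELINE_ROOT = os.path.join('gs://my-bucket', 'pipelines', PIPELINE_NAME)


def create_pipeline(components):
    return pipeline.Pipeline(
        pipeline_name=PIPELINE_NAME,
        pipeline_root=PIPELINE_ROOT,  # artifacts and wheels live under this root
        components=components,
        enable_cache=False,
    )


# The runner config controls how the pipeline is compiled for Kubeflow.
runner_config = kubeflow_dag_runner.KubeflowDagRunnerConfig(
    kubeflow_metadata_config=(
        kubeflow_dag_runner.get_default_kubeflow_metadata_config()),
)

# kubeflow_dag_runner.KubeflowDagRunner(config=runner_config).run(
#     create_pipeline(components=[...]))
```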
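And for question 3, this is roughly the direction I have been trying: wrapping a TFDF Gradient Boosted Trees model in a generic TFX Trainer run_fn. This is only a sketch, assuming the tensorflow_decision_forests Keras API and a preceding Transform component; the label key and batch size are made up, and I am not sure this is the recommended pattern:

```python
import tensorflow_decision_forests as tfdf
import tensorflow_transform as tft
from tfx import v1 as tfx
from tfx_bsl.public import tfxio

_LABEL_KEY = 'label'  # placeholder: replace with your label column
_BATCH_SIZE = 256


def _input_fn(file_pattern, data_accessor, schema):
    """Builds a finite tf.data.Dataset of (features, label) batches.

    TFDF reads the whole dataset once per fit, so the dataset is not repeated.
    """
    return data_accessor.tf_dataset_factory(
        file_pattern,
        tfxio.TensorFlowDatasetOptions(
            batch_size=_BATCH_SIZE, label_key=_LABEL_KEY, num_epochs=1),
        schema)


def run_fn(fn_args: tfx.components.FnArgs):
    """TFX Trainer entry point."""
    # Use the schema of the transformed examples produced by Transform.
    tf_transform_output = tft.TFTransformOutput(fn_args.transform_output)
    schema = tf_transform_output.transformed_metadata.schema

    train_ds = _input_fn(fn_args.train_files, fn_args.data_accessor, schema)
    eval_ds = _input_fn(fn_args.eval_files, fn_args.data_accessor, schema)

    # TFDF trains in a single pass; no epochs/steps to configure here.
    model = tfdf.keras.GradientBoostedTreesModel(
        task=tfdf.keras.Task.CLASSIFICATION)
    model.fit(train_ds)

    # TFDF allows compiling after fit to attach evaluation metrics.
    model.compile(metrics=['accuracy'])
    print(model.evaluate(eval_ds, return_dict=True))

    model.save(fn_args.serving_model_dir, save_format='tf')
```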
Hi @Robert_Crowe, thanks for posting this. Lines 34-36 are particularly interesting:
flags.DEFINE_enum('model_framework', 'keras',
                  ['keras', 'flax_experimental', 'tfdf_experimental'],
                  'The modeling framework.')
I’m not familiar with how flags influence TFX. Does specifying tfdf_experimental signal something special to either Google AI Platform or Vertex AI when the pipeline is executed there?
Nope, they’re just command-line flags. If you search through the code you can see how they influence the name of the pipeline and which module file is selected; the selection logic amounts to something like the sketch below.
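For anyone following along, here is a minimal sketch of that pattern using ordinary absl flags; the pipeline name prefix and module paths are illustrative, not the exact ones in the linked example:

```python
from absl import app, flags

flags.DEFINE_enum('model_framework', 'keras',
                  ['keras', 'flax_experimental', 'tfdf_experimental'],
                  'The modeling framework.')

FLAGS = flags.FLAGS

# Illustrative mapping from the framework flag to a trainer module file.
_MODULE_FILES = {
    'keras': 'models/keras_model.py',
    'flax_experimental': 'models/flax_model.py',
    'tfdf_experimental': 'models/tfdf_model.py',
}


def main(argv):
    del argv  # unused
    # The flag only shapes the pipeline name and the module file choice.
    pipeline_name = f'penguin-{FLAGS.model_framework}'
    module_file = _MODULE_FILES[FLAGS.model_framework]
    # ... build and run the pipeline with pipeline_name and module_file ...
    print(pipeline_name, module_file)


if __name__ == '__main__':
    app.run(main)
```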