I am trying to bring together the following tutorials:
- Creating decision tree by hand
- Custom layers via subclassing
- Composing Decision Forest and Neural Network models
The goal is to (1) create a custom tree, (2) embed it in a custom layer, and (3) combine it with other layers in a model.
The problem is that in step 1, using RandomForestBuilder, the model is serialized and deserialized, which results in an object of type keras.saving.saved_model.load.CoreModel.
However, the tutorial for step 3 embeds the tree layer via tfdf.keras.RandomForestModel.
Ideally, the custom layer would create the custom tree by calling RandomForestBuilder in its constructor. However, this is not straightforward, because the builder exports the model to disk and it then has to be loaded again.
Step 1:
import tensorflow as tf
import tensorflow_decision_forests as tfdf

# The builder writes the model to disk as a SavedModel.
builder = tfdf.builder.RandomForestBuilder(
    path="/tmp/manual_model",
    objective=tfdf.py_tree.objective.RegressionObjective(label='tree_result')
)

# Short aliases for the tree-building classes.
Tree = tfdf.py_tree.tree.Tree
SimpleColumnSpec = tfdf.py_tree.dataspec.SimpleColumnSpec
ColumnType = tfdf.py_tree.dataspec.ColumnType
RegressionValue = tfdf.py_tree.value.RegressionValue
NonLeafNode = tfdf.py_tree.node.NonLeafNode
LeafNode = tfdf.py_tree.node.LeafNode
NumericalHigherThanCondition = tfdf.py_tree.condition.NumericalHigherThanCondition
CategoricalIsInCondition = tfdf.py_tree.condition.CategoricalIsInCondition

# A single split on 'country': 'US' goes to the positive child.
tree = Tree(
    NonLeafNode(
        condition=CategoricalIsInCondition(
            feature=SimpleColumnSpec(name='country', type=ColumnType.CATEGORICAL),
            mask=['US'],
            missing_evaluation=False
        ),
        pos_child=LeafNode(value=RegressionValue(value=0.5)),
        neg_child=LeafNode(value=RegressionValue(value=0.6))
    )
)
builder.add_tree(tree)
builder.close()

# Loading the exported model gives the deserialized (CoreModel) object.
custom_tree = tf.keras.models.load_model("/tmp/manual_model")
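For reference, this is the behaviour I expect from the tree above, written as a plain-Python sketch. It does not use TF-DF at all, and `expected_tree_output` and `MASK` are hypothetical names introduced only for illustration. A value in the mask takes the positive child. Anything else takes the negative child, and I am assuming that a missing value also goes to the negative child because missing_evaluation=False.

```python
from typing import Optional

# Mask of the CategoricalIsInCondition in the tree above.
MASK = frozenset({"US"})


def expected_tree_output(country: Optional[str]) -> float:
    """Hypothetical reference for the single-split tree (not TF-DF code)."""
    if country is None:
        # Assumption: missing_evaluation=False means the condition is
        # treated as false, so the negative child is used.
        return 0.6
    # Positive child (0.5) if the value is in the mask, else negative (0.6).
    return 0.5 if country in MASK else 0.6


print([expected_tree_output(c) for c in ["US", "UK", None]])
# prints [0.5, 0.6, 0.6]
```

So for the two-row input used in step 2, the combined model should ideally return 0.5 for 'US' and 0.6 for 'UK'.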
Step 2:
class CustomTree(tf.keras.layers.Layer):
    def __init__(self, custom_tree):
        super().__init__()
        self.custom_tree = custom_tree

    def call(self, inputs):
        return self.custom_tree(inputs)

input_layer = tf.keras.layers.Input(shape=(None,), name='country', dtype=tf.string)
output_layer = CustomTree(custom_tree)(input_layer)
model = tf.keras.models.Model(input_layer, output_layer, name='SomeModel')
model.predict(
    tf.data.Dataset.from_tensor_slices({'country': ['US', 'UK']}).batch(1)
)
The above raises an error about the structure of the input layer, and if the input layer is omitted, the error is raised at model.predict() instead.