Hi there,
I am curious about how TFG is designed to represent tensors. From the code, I found that a tensor is represented as an attribute of a tfg op. The code is shown below:
// tensorflow/core/ir/importexport/convert_tensor.h
// Converts an TensorFlow tensor proto into an MLIR elements attribute.
tensorflow::StatusOr<ElementsAttr> ConvertTensorProto(
    const tensorflow::TensorProto& input_tensor, Builder builder,
    TFGraphDialect* tfgDialect);
From the above code, a TensorFlow tensor is converted to an mlir::ElementsAttr, which becomes an attribute of the op node.
So, from my understanding, tfg does not allow modifying a tensor’s data, because the tensor attribute is a constant value.
In that case, is the following understanding correct?
tfg by design does not allow optimizations like constant_folding, which would modify a tensor’s data (or such optimizations have to create a new node and replace the old one).
Why not create a type instead of an attribute in TFG MLIR? A type would allow the mutable part listed in the link.
A similar design appears in ShapeAttr: it also does not have a setShape method, meaning that by design it cannot be modified.
tfg is designed to replace grappler, and in grappler it is easy to change these things in a NodeDef. How could tfg do the same? Should tfg always create a new node and replace the old one?
The attribute is immutable, but the node has a mutable dictionary of attributes (this isn’t specific to TFG, it is standard MLIR), so you can just update the attribute on a node.
That said, when you do constant folding, in general you replace some nodes with a constant node so you have to recreate a new node and replace the old one regardless.
Types aren’t mutable; the exception you’re pointing at is only intended to make it possible to implement recursive types (you need to create the Type and then mutate it to add a reference to itself).
Regardless, I’m not sure what problem you’re trying to solve with this.
As mentioned above, you can’t change the content of an attribute in place, but you can swap individual attributes on an operation. Similarly, you can swap the type of individual results of an operation in place. In doing so you don’t modify the type itself: you create a new type and swap it in for the old one.
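To make the "immutable attribute, mutable dictionary" idea concrete, here is a minimal standalone sketch in plain C++. It is only a model and does not use the MLIR API: the names Attr, Node, setAttr, getAttr and swapAttrDemo are made up for illustration (in MLIR itself the corresponding call is setAttr on an Operation). It shows that swapping the entry stored under a name leaves the old attribute object untouched.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Standalone model, NOT the MLIR API: an attribute is an immutable value.
struct Attr {
  const std::string value;  // never modified after construction
  explicit Attr(std::string v) : value(std::move(v)) {}
};
using AttrRef = std::shared_ptr<const Attr>;

// A node owns a mutable dictionary from attribute name to immutable attribute.
struct Node {
  std::string name;
  std::map<std::string, AttrRef> attrs;

  // Replace the entry stored under `key`; the previous Attr object is not modified.
  void setAttr(const std::string& key, AttrRef attr) {
    attrs[key] = std::move(attr);
  }
  const std::string& getAttr(const std::string& key) const {
    return attrs.at(key)->value;
  }
};

// Hypothetical walk-through: update "split_strategy" on one node in place.
inline std::string swapAttrDemo() {
  Node matmul{"MatMul", {}};
  AttrRef rowSplit = std::make_shared<Attr>("row");
  matmul.setAttr("split_strategy", rowSplit);
  // Swapping means pointing the dictionary entry at a different attribute.
  matmul.setAttr("split_strategy", std::make_shared<Attr>("column"));
  assert(rowSplit->value == "row");  // the old attribute is unchanged
  return matmul.getAttr("split_strategy");
}
```

The node itself is never recreated here; only one dictionary entry changes, which is why updating an attribute on a node is cheap.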
In general MLIR is much more efficient than the proto representation used by Grappler.
If you haven’t seen it yet, I invite you to look into the MLIR tutorial (slides - recording - online step-by-step) as well as this doc that explains the in-memory representation of the IR and in particular the def-use chains.
The problem I am trying to solve is to implement a graph optimization pass based on TFG, similar to what GSPMD does for XLA. This optimization pass would add additional information to TFG’s nodes, and this information would change during the optimization process. This info could be something like “estimated split strategy for current op”. Let’s call this split_strategy for simplicity. I would like this split_strategy to flow through the whole graph, so every op node would have a split_strategy.
To do that, I plan to define a dialect called toy. This dialect is based on tfg, and it wraps mlir::TensorType in a toy::TensorType. This toy::TensorType would have an additional field, split_strategy. This split_strategy would change frequently while searching for an optimal one, so recreating the node each time would be too expensive.
As you mentioned, types are not mutable in general. Could you tell me a little more about how to bind mutable information to the tfg tensor as an attribute? I mean, could you give a few more hints on how to swap individual attributes? By defining the attribute as a pointer?
I thought converting an mlir::ElementsAttr (the tfg-converted tensor) to a toy::TensorType (in the toy dialect converted from tfg) and doing whatever I need on that type would be a good choice. If a mutable type is not intended, then what should I use if I wish to frequently update information bound to a tensor attribute?
If you’d like to carry information on the nodes, you can just add attributes to the nodes themselves freely: that’s quite cheap to do.
If you would rather model this on the types themselves, then you’d just re-create a new type every time you want to modify the “split_strategy” and set it on the actual “edge” in the graph (a Value in MLIR terminology).
The actual mechanism depends on how this “split_strategy” is represented. Is it an enum or a more complex data structure?
It should be a complex data structure containing something like a 1D array representing shapes, a 2D array representing devices, etc. So it is a struct.
In that case, it seems that carrying the information on Types is expensive, since I need to recreate the type every time it changes. So I have to attach a pointer-like attribute and mutate the underlying information. Is this correct?
You saved my day! Thank you very much for your help!
You really can’t mutate a Type or an Attribute safely: they are stored in a map and hashed. Every operation (node…) using a given type will reuse the same instance. When you “recreate” a Type, what actually happens is a hash and a lookup to see if it already exists, in which case the existing instance gets returned.
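The uniquing behaviour described above can be illustrated with a small standalone model in plain C++. This is a sketch under stated assumptions, not MLIR code: ToyContext, ToyTensorType, ToyValue and uniquingDemo are hypothetical names, and an ordered std::map stands in for the hashed storage only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Standalone model, NOT the MLIR API. A "type" is identified by its contents.
struct ToyTensorType {
  std::vector<int> shape;
  std::string splitStrategy;
};

// The context owns every type instance and returns one pointer per distinct key.
class ToyContext {
 public:
  const ToyTensorType* getTensorType(std::vector<int> shape, std::string split) {
    auto key = std::make_pair(std::move(shape), std::move(split));
    auto it = pool_.find(key);
    if (it != pool_.end()) return it->second.get();  // already exists: reuse it
    auto storage =
        std::make_unique<ToyTensorType>(ToyTensorType{key.first, key.second});
    const ToyTensorType* result = storage.get();
    pool_.emplace(std::move(key), std::move(storage));
    return result;
  }
  std::size_t size() const { return pool_.size(); }

 private:
  // An ordered map stands in for the hash table used by a real uniquer.
  std::map<std::pair<std::vector<int>, std::string>,
           std::unique_ptr<const ToyTensorType>>
      pool_;
};

// An "edge" in the graph: it only points at a uniqued, read-only type.
struct ToyValue {
  const ToyTensorType* type;
};

inline bool uniquingDemo() {
  ToyContext ctx;
  ToyValue a{ctx.getTensorType({4, 8}, "row")};
  ToyValue b{ctx.getTensorType({4, 8}, "row")};
  bool shared = (a.type == b.type);  // same contents, same instance
  // "Changing" the type of `a` means looking up another instance and swapping.
  a.type = ctx.getTensorType({4, 8}, "column");
  bool bUnchanged = (b.type->splitStrategy == "row");
  return shared && bUnchanged && a.type != b.type && ctx.size() == 2;
}
```

Because `a` and `b` initially share one instance, writing through that pointer would silently change both edges, which is the reason the model only hands out pointers to const and swaps the pointer instead.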
In general, if you just need to compute transient information in a transformation, you don’t have to store it directly by mutating the IR: you can keep a map on the side from operations to “split_strategy” and use this as your temporary storage. This does not prevent you from materializing the “split_strategy” as a type/attribute annotation when you’re done, but that shouldn’t be too expensive at that point.
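A side table of this kind could look roughly like the sketch below. It is plain C++ rather than MLIR code, and everything in it is an assumption made for illustration: the SplitStrategy fields follow the "1D array for shapes, 2D array for devices" description earlier in the thread, and Node, StrategyTable, encode, materialize and sideTableDemo are hypothetical names, as is the string encoding of the final annotation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical strategy struct: freely mutable because it lives outside the IR.
struct SplitStrategy {
  std::vector<int> shardShape;            // 1D array describing shapes
  std::vector<std::vector<int>> devices;  // 2D array describing devices
};

// Stand-in for an operation; attributes are plain strings in this model.
struct Node {
  std::string name;
  std::unordered_map<std::string, std::string> attrs;
};

// Side table: transient per-operation state kept next to the graph, not in it.
using StrategyTable = std::unordered_map<const Node*, SplitStrategy>;

inline std::string joinInts(const std::vector<int>& v) {
  std::string out = "[";
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (i) out += ",";
    out += std::to_string(v[i]);
  }
  return out + "]";
}

// Made-up textual encoding used only when writing the final annotation.
inline std::string encode(const SplitStrategy& s) {
  std::string out = "shape=" + joinInts(s.shardShape) + " devices=[";
  for (std::size_t i = 0; i < s.devices.size(); ++i) {
    if (i) out += ",";
    out += joinInts(s.devices[i]);
  }
  return out + "]";
}

// Write each strategy back onto its node once, after the search has finished.
inline void materialize(std::vector<Node>& graph, const StrategyTable& table) {
  for (Node& n : graph) {
    auto it = table.find(&n);
    if (it != table.end()) n.attrs["split_strategy"] = encode(it->second);
  }
}

inline std::string sideTableDemo() {
  std::vector<Node> graph;
  graph.push_back(Node{"MatMul", {}});
  graph.push_back(Node{"Add", {}});  // no more growth: node addresses stay stable
  StrategyTable table;

  SplitStrategy candidate;
  candidate.shardShape = {2, 8};
  candidate.devices.push_back({0, 1});
  table[&graph[0]] = candidate;  // first guess during the search

  candidate.shardShape = {4, 4};
  candidate.devices.push_back({2, 3});
  table[&graph[0]] = candidate;  // overwritten freely, the graph is untouched
  table[&graph[1]] = candidate;

  materialize(graph, table);
  return graph[0].attrs.at("split_strategy");
}
```

The search loop only touches the table, so updating a strategy costs one map write; the graph is annotated once at the end.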