I was looking for a model to process French sentences, but I couldn’t find any for TF.js. So I tried to convert the universal-sentence-encoder-multilingual model (Google | universal-sentence-encoder | Kaggle) with tensorflowjs_converter, but it isn’t working.
I get an error “Op type not registered ‘SentencepieceOp’ in binary running”
Is there an existing multilingual model available for TF.js or a way to make it work?
This is also an interesting topic for tfhub: how should ecosystem dependencies of tfhub models be handled, as in this case, where we need to use the model with the converter?
I haven’t found documentation on how to use tfjs.converters directly, but I was able to get past the tensorflow_text error with the following code (based on the logic of the CLI converter):
import tensorflow as tf
import tensorflowjs as tfjs
import tensorflow_hub as hub
import tensorflow_text  # registers the SentencePiece ops so the SavedModel can load in Python
from tensorflowjs.converters import tf_saved_model_conversion_v2

# Convert the TF Hub module directly, bypassing the CLI wrapper
tf_saved_model_conversion_v2.convert_tf_hub_module(
    "https://tfhub.dev/google/universal-sentence-encoder-multilingual/3",
    "web_model",
    signature="serving_default",
)
However, I now get the following error:
ValueError: Unsupported Ops in the model before optimization
SentencepieceOp, SegmentSum, RaggedTensorToSparse, ParallelDynamicStitch, SentencepieceTokenizeOp, DynamicPartition
It seems that this multilingual model uses different operators than the universal sentence encoder provided on the TF.js models page.
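To see which operators a model actually needs before attempting a conversion, you can walk the graph of a SavedModel’s serving signature. This is a minimal sketch using a tiny toy model as a stand-in for the downloaded TF Hub module; the same inspection on the USE-multilingual SavedModel would surface SentencepieceOp and friends:

```python
import tempfile
import tensorflow as tf

# Tiny stand-in model; for the real case, point tf.saved_model.load at
# the downloaded TF Hub module directory instead.
class Toy(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32)])
    def __call__(self, x):
        return tf.reduce_sum(tf.square(x), axis=1)

toy = Toy()
export_dir = tempfile.mkdtemp()
tf.saved_model.save(
    toy, export_dir,
    signatures={"serving_default": toy.__call__.get_concrete_function()},
)

loaded = tf.saved_model.load(export_dir)
fn = loaded.signatures["serving_default"]
gd = fn.graph.as_graph_def()

# Collect op names from the graph itself and from its function library,
# since the loaded signature may wrap the real computation in a call op.
ops = {node.op for node in gd.node}
for func in gd.library.function:
    ops.update(node.op for node in func.node_def)
print(sorted(ops))
```

Any op in that set that the converter doesn’t support will show up in the “Unsupported Ops” error above.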
Xel, another issue you might run into later, if you do manage to convert it, is that this model is a bit large for a webpage (> 200 MB). You might have to take that into account too.
I don’t know if we could add machine-readable metadata about dependencies somewhere; that would help other ecosystem tools and any automation. This could also be consumed to create a dedicated dependencies section on the TFHub model webpage.
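As a rough illustration of the idea (entirely hypothetical; tfhub defines no such schema today), per-model dependency metadata might look something like:

```json
{
  "model": "universal-sentence-encoder-multilingual/3",
  "dependencies": {
    "python_packages": ["tensorflow_text>=2.0"],
    "custom_ops": ["SentencepieceOp", "SentencepieceTokenizeOp"]
  }
}
```

Tools like the tfjs converter could then warn up front about missing op registrations instead of failing mid-conversion.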
XeL, TensorFlow.js unfortunately doesn’t support those ops yet, but you might be able to convert the model to a TFLite model and then use tfjs-tflite, which runs TFLite models in the browser using WebAssembly.
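A minimal sketch of the TFLite route, using a tiny Keras model as a stand-in; for the real model you would point `tf.lite.TFLiteConverter.from_saved_model` at the downloaded Hub module, and the SELECT_TF_OPS / allow_custom_ops flags are what let ops outside the TFLite builtin set (such as the SentencePiece ops) pass through:

```python
import tensorflow as tf

# Tiny stand-in; in practice use
# tf.lite.TFLiteConverter.from_saved_model(path_to_downloaded_hub_model).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(8, activation="relu"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Fall back to TensorFlow ops for anything not in the TFLite builtin set,
# and allow custom ops (needed for SentencePiece):
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
converter.allow_custom_ops = True

tflite_bytes = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)
```

The resulting .tflite file is what tfjs-tflite loads in the browser; note that the runtime there must also provide kernels for any custom ops the model uses.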
Good news! I was able to convert it with TensorFlow Lite. I’ll test it out, but as @Igusm pointed out, it weighs 278 MB, so I guess I’ll have trouble using it on the web.
It’s really hard to find pre-trained models for languages other than English.
Off the top of my head: have you tried any of the quantization techniques for model size reduction mentioned in the TensorFlow Lite docs? I hope some of the following helps:
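For instance, post-training dynamic-range quantization (one of the techniques in the TFLite docs) stores weights as int8, typically shrinking a model to roughly a quarter of its float32 size. A minimal sketch on a tiny stand-in model; substitute the real SavedModel in practice:

```python
import tensorflow as tf

# Tiny stand-in model with enough weights for the size difference to show.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(128,)),
    tf.keras.layers.Dense(128),
])

# Baseline float32 conversion
float_converter = tf.lite.TFLiteConverter.from_keras_model(model)
float_bytes = float_converter.convert()

# Post-training dynamic-range quantization: weights stored as int8
quant_converter = tf.lite.TFLiteConverter.from_keras_model(model)
quant_converter.optimizations = [tf.lite.Optimize.DEFAULT]
quant_bytes = quant_converter.convert()

print(len(float_bytes), len(quant_bytes))
```

Even a 4x reduction would leave this model around 70 MB, so it may still be heavy for a webpage, but it’s a big improvement over 278 MB.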
Also, in case you haven’t checked this out - there are TensorFlow Lite Model Maker guides and tutorials specifically for NLP (QA and classification) (cc @billy):
Very good tip @8bitmp3, these techniques might be able to help with the model’s size!
You might lose a little bit of accuracy, but it’s well worth a try!
Hi. I’m trying to use the “universal-sentence-encoder-multilingual” model (located at “Google | universal-sentence-encoder | Kaggle”), which supports Spanish, in Node.js. But when I load it with the “tfjs-node v4.2.0” library (on both Windows and Linux, using the TF2.0 SavedModel v3 format), I get the following error related to “SentencepieceOp”:
“E tensorflow/core/grappler/optimizers/meta_optimizer.cc:903] tfg_optimizer{} failed: NOT_FOUND: Op type not registered ‘SentencepieceOp’ in binary running on LAPTOP. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) tf.contrib.resampler should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
While importing FunctionDef: __inference_pruned_162942
when importing GraphDef to MLIR module in GrapplerHook”.
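The root cause is the same as in Python earlier in this thread: SentencepieceOp ships in the separate tensorflow_text package and is not compiled into the stock TensorFlow binary that tfjs-node links against, so the op lookup fails at graph import time. You can check what is registered from Python with the snippet below (op_def_registry is an internal TensorFlow module, so this is an assumption that may change between versions):

```python
from tensorflow.python.framework import op_def_registry  # internal API

# Built-in ops are compiled into the binary and registered at import time;
# SentencePiece ops only get registered when tensorflow_text is imported.
print(op_def_registry.get("MatMul") is not None)
print(op_def_registry.get("SentencepieceOp") is not None)
```

In Python the fix is simply `import tensorflow_text` before loading the model; tfjs-node has no equivalent package, which is why the SavedModel can’t be loaded there.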