I was wondering what the difference between these two classes is. Can I use TextVectorization directly in place of Tokenizer?
The two have different purposes and use cases. If you're building an end-to-end deep learning model for a specific NLP task, TextVectorization is usually more convenient: it handles tokenization and vectorization in one step and can be integrated into your model as a layer. Tokenizer, on the other hand, may be more appropriate if you need more control over the tokenization process.
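To make the "one step, as a layer" point concrete, here is a minimal sketch, assuming TensorFlow 2.x with tf.keras available. The sample sentences, vocabulary size, sequence length, and embedding width are made up for illustration and are not from the question.

```python
import tensorflow as tf

# Toy corpus (illustrative only).
texts = ["the cat sat on the mat", "the dog ate my homework"]

# By default the layer lowercases, strips punctuation, splits on whitespace,
# and maps each token to an integer id.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=1000,            # assumed vocabulary cap
    output_mode="int",
    output_sequence_length=8,   # pad or truncate every sentence to 8 ids
)
vectorizer.adapt(texts)         # learn the vocabulary from the corpus

# Raw strings in, padded integer ids out.
ids = vectorizer(tf.constant(texts))
vocab = vectorizer.get_vocabulary()

# The ids can be passed straight to downstream layers such as Embedding.
embedded = tf.keras.layers.Embedding(input_dim=1000, output_dim=16)(ids)

print(ids.shape)        # (batch, sequence_length)
print(vocab[:3])        # padding token, OOV token, most frequent word
print(embedded.shape)   # (batch, sequence_length, embedding_dim)
```

Because the vocabulary lives inside the layer, the same preprocessing travels with the model when it is saved or deployed. With Tokenizer you would typically fit it separately, convert texts to sequences, and pad them yourself before feeding the model.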
Thank you!