Hello community!
I’m new here and looking for tutorials that match a specific use case.
The use case is a new LANGUAGE made up of unique new tokens.
Basically: 1,000 new tokens to train on.
What is the best way to do that?
ENVIRONMENT: WEB and Node.js.
I’m working with a SPEECH to TEXT (STT) app that does not yet recognize the 1,000 new tokens.
GOAL: get the STT app to recognize the NEW tokens.
Imagine something like medical terminology: long, complex words built from concatenated sub-tokens.
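To make the sub-token idea concrete, here is a minimal TypeScript sketch (runnable in Node.js once compiled) of how one such word could be split back into its pieces. The sub-token list and the `segment` helper are made up for illustration and are not the real vocabulary or any library's API. It uses a simple greedy longest match, so it may fail on words that would need backtracking.

```typescript
// Hypothetical sub-token inventory, for illustration only.
// The real language's pieces would go here.
const SUB_TOKENS: string[] = ["cardio", "myo", "pathy", "neuro", "itis"];

// Split a word into known sub-tokens by greedy longest match.
// Returns null when some part of the word is not covered by the inventory.
function segment(word: string, subTokens: string[]): string[] | null {
  // Try longer sub-tokens first; drop empty strings to avoid an endless loop.
  const sorted = subTokens
    .filter((t) => t.length > 0)
    .sort((a, b) => b.length - a.length);

  const parts: string[] = [];
  let pos = 0;
  while (pos < word.length) {
    const match = sorted.find((t) => word.startsWith(t, pos));
    if (match === undefined) {
      return null; // unknown piece at this position
    }
    parts.push(match);
    pos += match.length;
  }
  return parts;
}

console.log(segment("cardiomyopathy", SUB_TOKENS));
```

A word list generated this way (every valid combination of sub-tokens) is the kind of vocabulary the STT app would need to learn or be biased toward.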
Q: Is this similar to any existing projects?
Q: What tutorials would be most relevant?
TYSM.