Inference Code
Hey, great work! I'd like to know whether there is inference code available, like onnx-community/chatterbox-ONNX. Thank you!
Can we use GPU for inference?
@ozguntosun Hi, there is a link in the README to the conversion script. You could export with the CUDA device instead of CPU and use that for inference; it should be easy enough.
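For GPU inference with ONNX Runtime, a common pattern is to request the CUDA execution provider with a CPU fallback so the same code runs on machines without a GPU. A minimal sketch (assumes `onnxruntime-gpu` is installed; the model filename is hypothetical):

```python
def select_providers(prefer_gpu: bool = True) -> list:
    # Ask ONNX Runtime for CUDA first, but always keep CPU as a
    # fallback so the same code still runs without a GPU.
    providers = ["CUDAExecutionProvider"] if prefer_gpu else []
    providers.append("CPUExecutionProvider")
    return providers

# Usage (requires the onnxruntime-gpu package; model path is an example):
# import onnxruntime as ort
# session = ort.InferenceSession("t3_model.onnx", providers=select_providers())
```

ONNX Runtime walks the provider list in order and silently falls back to the next entry if one is unavailable, so keeping `CPUExecutionProvider` last is a cheap safety net.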
Thank you, I converted it for GPU and it's working. Another update is needed for the newly released multilingual model, which brings a big improvement for inference in non-English languages. We need to load the new t3 model and the new tokenizer (https://github.com/resemble-ai/chatterbox/blob/bf169fe5f518760cb0b6c6a6eba3f885e10fa86f/src/chatterbox/mtl_tts.py#L184). Can you help update the conversion and inference code for this new pipeline?
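One pitfall when swapping in a new tokenizer: the t3 model's text-embedding table must match the new vocabulary size, otherwise token IDs will index out of range or map to untrained embeddings. A small sanity check (assuming the tokenizer ships as a Hugging Face-style `tokenizer.json`, where the vocab lives under `model.vocab`; the sample data below is made up):

```python
import json

def vocab_size_from_tokenizer_json(tokenizer_json_text: str) -> int:
    # Hugging Face-style tokenizer.json files keep the vocabulary
    # under the "model" -> "vocab" mapping (token string -> id).
    data = json.loads(tokenizer_json_text)
    return len(data["model"]["vocab"])

# Tiny in-memory example; a real tokenizer.json is far larger:
sample = json.dumps({"model": {"vocab": {"<pad>": 0, "a": 1, "b": 2}}})
print(vocab_size_from_tokenizer_json(sample))  # -> 3
```

Comparing this number against the exported model's embedding input dimension before re-running the conversion script can save a confusing debugging session.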
It probably just needs the tokenizer config replaced; I'll take a look a bit later.
Just changing the tokenizer is not enough to improve audio quality — I tried this already. For the LLaMA backbone and inference, the EXAGGERATION_TOKEN is no longer required as a parameter; it now has a different processing logic. When I used the updated backbone from the latest model, I was able to achieve the quality I wanted.
There is also an updated backbone in the repo, at least as of the recent version 1.4.0. What exactly are you trying to achieve, and where is the difference? What do you mean the exaggeration token is not required? We use it to control the intensity of the speech, and it is used in the original code to prepare the conditionals.