Comments:
I loved this type of content, Thorsten. You made it so easy for me to test some TTS models I wanted to use in some home automation projects. You are the best, Thorsten, thank you so much 🎉 🎉🎉
Is PiperTTS still the best to do training?
I'd like to hear more about 'integration' or TTS... for reading text...not just for amusing myself, by cloning my voice.
I remember back in 1995, using the MAC TTS for the first time at the age of 12. That sense of wonder and awe... you took me back there... Thank you Thorsten!
Developers who do open source... they don't know they might change someone's life for the better... I have a blind friend, and she has never been as happy as when listening to humanlike speech... she said maybe someday she could get emotional speech driven by the context of the paragraph being read; she said imagine if she were reading (listening to) a novel with automatically switching voices and emotion accurately matched to the story...
The whole video is pretty pointless: you can't find out which one is better, cloning your foreign accent doesn't help much either, and the programming language/OS isn't useful (it would be better to know whether it uses CPU/CUDA/Metal and how fast its inference is)... Try cloning the voice of the Professor in Futurama.
Your T-shirt sums it up.
Thank you so much for your effort in this, it really helped me ❤
Hello Thorsten! Greetings from the north of Germany and many thanks for this video! Regarding your question about what I'd like to see covered in your planned videos on the individual TTS systems: for me (and probably many others) it would be interesting to see how the respective models can be integrated into local desktop applications (such as Open-WebUI, Text-Generation-WebUI (Oobabooga), LM-Studio, Koboldcpp, etc.), or whether that is possible at all. Since your videos often deal with locally running applications, this is probably an obvious topic anyway...
Best open source library for fine-tuning custom voices? I'm currently using alltalktts and the models come out decent, just wondering if there is anything better.
I stupidly thought Parler would speak French, but it doesn't seem to...
My man!!!!!! Fank yoe very moch
Is it possible to use two speakers? Can you help us find models that support two speakers?
Thank you for going over these models! I really enjoyed it!
I have a question about Parler TTS. I want to train it on languages like Arabic that don't use English letters. Do you think that could be possible? I tried using Common Voice as an example but failed.
Nice German accent. 😂
Could you do a video about how to train TTS for our native languages? There are videos, but they are old now and there have been some updates. We would really appreciate it if you covered both Linux and Windows.
Hi. Thank you for your videos. I'm kind of new to this, so I don't know much about it. Is there any "good" TTS for people who have AMD GPUs and are using Windows? If there is, can you connect it to something like KoboldAI, and how?
How can I learn voice cloning and voice accents?
I am a beginner. How can I learn AI voice cloning and accents?
I wonder which one is better for training a Spanish model. I want to convert books to audio with a better voice than Android's. Any guidance?
So I want to ask about a tool that can extract a voice from a person. For example, say I want a person, with their specific language, to be able to use their own voice. The tool would let them record the voice first and automatically extract it. Once that happens, that voice could be turned into an AI-generated voice with the same voice and accent from just a few words.
From this, we could test it by typing a few words for text to speech. The custom AI-generated voice that was extracted would convert the text to speech in that exact voice and accent. Is there a specific tool for that?
Thank you, man, for this wonderful information
Thanks Thorsten, greetings from Buenos Aires, Argentina
Fantastic Thorsten, very useful and informative.
Is there any way to train a voice model on my own voice, then save the parameters of my voice to a file, and the next time I need text to speech use only those parameters to generate the voice, with Coqui-TTS and this model? Help me please. I searched all over the internet and did not find any solution.
What is the best option for Mac offline?
Hello Thorsten, I sent you an email and would be glad if you could take a look 😅. It's about your great program, and I have a problem with it. Don't worry, I am the problem, not your program. 😇 Thank you.
Which of these are multilingual? In particular, which ones speak Italian?
Which one would be able to create a cartoon character voice? I tried a couple of Hugging Face models but had no luck getting a sample voice in to work on building a new voice.
Thanks Thorsten! I'm interested in Parler. Is there a way to extend the number of characters it can process? My use case is converting short stories to audiobooks. I only know basic Python.
Hey Thorsten,
Is there any way to use local AI/neural TTS in Windows with the SAPI5 interface?
I don't want to get an audio output file. I want the TTS to read the text to me using AI/Neural voices.
I would like to use this to read ebooks/text. I already use some Ivona and Harpo voices in Balabolka reader. Very recently I found out I can use Microsoft's online natural voices with NaturalVoiceSAPIAdapter, a very neat piece of software that you're welcome to share with everyone else on your channel. But there is a small problem with that. The reader constantly pauses after every sentence because of how the Adapter is sending the data to Microsoft. So, I am still in desperate need of local neural TTS that works through SAPI5.
Thanks for your great work.
Dear Thorsten. As someone with ADD, I find it very hard to read long texts. I can process information much better when I hear it. In my experience, the best speech synthesis when things need to be fast is unfortunately still Edge on Windows. But I regularly look for a better computer voice. XTTS was a clear improvement in terms of intonation. Unfortunately, words were sometimes swallowed. I follow your videos closely and am eagerly awaiting your test of Meta Voice, etc. I think your work is important and I am very grateful for your effort. Keep it up.
Hello Thorsten, unfortunately the program you present here can't do German at all. But that's not your fault. I hope better models will be available soon. In version 2.02, xtts unfortunately sometimes swallows words during generation or makes up extra ones. So far I haven't found an approach that is stable. But I will keep watching this.
Hey there Thorsten, I just came across your channel and it's so amazing. I got the stuff I was looking for, these TTS models. But I have a question: is there one where an Nvidia graphics card is not necessary, that sounds very humanlike, with an easy setup and preferably a UI? Thank you.
In my opinion, if you're only looking for the best TTS, then ChatTTS is the best!
Hey Thorsten, what is the best overall voice cloning AI tool, both locally and remotely? RVC, tortoise tts fast, coqui, so-vits, xtts?
Nice explanation ❤. Is there TTS voice cloning that runs on a low-end PC?????
Link for Piper ONNX?
Hello Thorsten, is it possible for you to show how to install and use the Bark multilingual TTS model?
Hi Thorsten, I use TTS with a different intention: my English pronunciation is not good, so I record audio of myself speaking in English and use it as the input for inference, generating audio of the same sentence.
I currently use CoquiTTS, out of 100 audios that I generate from the same sentence, 7 have a similar intonation and emotion to the original audio 🤣.
Would you have any recommendations for another TTS that can do the same better?
Hi Sir, your video is fantastic!!! .. well done!!! The most valuable feature of TTS for me is the ability to highlight words or generate visemes (or even phonemes) in real time as the text is spoken. This functionality is incredibly important to my work, and I am wondering if any voices or systems provide this capability. Specifically, I am looking for a method to capture spoken words, phrases, or syllables as they are being generated and displayed in real time.
While I have had success with SAPI 5 on Windows for this purpose, I have been unable to find similar solutions for Linux, particularly on my Raspberry Pi setup. My goal is to run my TTS locally with a childlike voice and to extract key elements such as word highlighting or real-time phoneme generation. Any guidance or support on achieving these tasks would be greatly appreciated. Thank you!
Hi Thorsten,
How many hours/steps did you spend training on your DE dataset before it became a usable model in Coqui TTS?
I'm trying to do some model training with my dataset (35 minutes of audio) and I start hearing some voice at 10k steps, but it is far away from what I would like to get....
Hi Thorsten!
I want to make a portfolio website where people can talk to me. I'd have a text-to-text model that knows everything about me, and its output would go to a TTS of my own voice to tell it what to say each time. My problem is hosting. I don't understand how the APIs of these TTS models work or how I'd be able to host one, as most GPU hosting websites offer per-hour rates, which seem very expensive. What do I do? Maybe I've got the wrong approach.
What's the best TTS for use in an Apple and Android app locally (i.e. no server connection)?
It is a nice and useful video. Thank you. I am looking at various options right now.
Which ones can we use with Swift Core ML? Is it possible to make them run locally in Swift?
Sadly, none of the reviewed models and frameworks work locally. I have an Nvidia 2080 and tried the frameworks on Ubuntu. All of the frameworks have very poor documentation. Toucan has an issue with the code that is meant to execute fine-tuning and kept coming up with division by zero errors (I think it has a lower limit on the number of samples, but that is not mentioned anywhere in the docs). Mars5 needs more than 16 GB of VRAM (but that is not mentioned anywhere either). ChatTTS does NOT support fine-tuning and needs training from scratch. And MetaVoice has a stated 12 GB VRAM requirement, which meant I did not even try.
Can you do a step-by-step tutorial on how to install ChatTTS locally, please?
Love this T-shirt!
Which of them support an API option?