A couple seconds? Which project does this?

defamation · on Aug 11, 2024

every single voice cloning project

I'm sure you could have used Google

https://elevenlabs.io/app/voice-lab

https://app.resemble.ai/users/sign_in

https://github.com/neonbjb/tortoise-tts

https://coqui.ai/blog/tts/open_xtts

it was even possible 5 years ago

https://github.com/CorentinJ/Real-Time-Voice-Cloning

xyst · on Aug 11, 2024

Mission Impossible 3 was only the proof of concept

jazzyjackson · on Aug 11, 2024

how many do you think "a couple" is?

baobabKoodaa · on Aug 11, 2024

Nope. If you actually tried those, you would quickly find out they don't work. It's actually really hard to clone a voice from a sample only a few seconds long.

knowaveragejoe · on Aug 12, 2024

A few seconds, yeah. I've seen fairly convincing reproductions from 30 seconds of reading text though.

grugagag · on Aug 11, 2024

Cloning a voice signature or timbre may need a bit more for good quality. Then there are idiosyncrasies in one’s voice. In addition to that, there are tiny verbal tics, expressions, cadence, feel, and some more to be able to say you have properly cloned someone’s voice. The two-second sample is like a shallow clone of sorts and is indeed vector space.

BoredPositron · on Aug 11, 2024

The last sentence is hilarious. What do you think "properly" cloned voices are? Not every model is few-shot and not every model relies on its training set for paralanguage anymore. Easiest way to try it out is probably the pro voice cloning from ElevenLabs.

Izkata · on Aug 11, 2024

"Expressions" in particular is about choice of words. A few seconds definitely isn't enough to duplicate that.

BoredPositron · on Aug 11, 2024

The choice of words is usually yours for a TTS model.

tazu · on Aug 11, 2024

There are many models; XTTS [1] is a good one.

[1]: https://coqui.ai/blog/tts/open_xtts

InvertedRhodium · on Aug 11, 2024

I’ve had success with ElevenLabs.