
Presumably text-to-speech systems have signatures: for example, how many milliseconds they take to pronounce each syllable of a particular word, say "watermelon". If the timings match for a whole sentence, this machine would probably be able to recognize that. An easy defeat would be a TTS engine that adds a random number of milliseconds to each syllable.
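To make that concrete, here is a rough sketch of what such a timing check and its defeat might look like. Everything in it is an assumption for illustration: the per-syllable durations for "watermelon" are made-up numbers, the 2 ms tolerance is arbitrary, and the function names are hypothetical. A real detector would have to extract syllable timings from audio, which this skips entirely.

```python
import random

# Hypothetical per-syllable durations (ms) for "wa-ter-mel-on" from some TTS engine.
# These numbers are invented for illustration, not measured.
REFERENCE = [180.0, 140.0, 160.0, 150.0]


def timing_distance(observed, reference):
    """Mean absolute difference in ms between two equal-length duration lists."""
    if len(observed) != len(reference):
        raise ValueError("syllable counts differ")
    return sum(abs(o - r) for o, r in zip(observed, reference)) / len(observed)


def looks_synthetic(observed, reference, tolerance_ms=2.0):
    """Flag the sample if its timings sit within tolerance of the known signature."""
    return timing_distance(observed, reference) <= tolerance_ms


def add_jitter(durations, min_ms=5.0, max_ms=30.0, rng=None):
    """The 'easy defeat': pad each syllable by a random number of milliseconds."""
    rng = rng or random.Random()
    return [d + rng.uniform(min_ms, max_ms) for d in durations]


if __name__ == "__main__":
    print(looks_synthetic(REFERENCE, REFERENCE))              # exact match is flagged
    print(looks_synthetic(add_jitter(REFERENCE), REFERENCE))  # jittered copy is not
```

Note that the jitter here only ever lengthens syllables and always exceeds the tolerance, so the naive check misses it every time. A detector could respond by modelling the jitter distribution instead of exact timings, which is where the arms race starts.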


Thus begins the first TTS arms race.

Yes, this is what will define 2017.



