I love Advanced Voice Mode, but it's so unreliable in my experience. I can't have more than 5 minutes of conversation without it getting stuck and me having to restart it.
I believe it's partly due to it doing Whisper processing remotely (correct me if I'm wrong), which introduces lag and slows things down.
Gemini does voice processing locally through text-to-speech, and voice conversations with Gemini are so much smoother.
The downside is that Gemini does much worse than ChatGPT with uncommon words and when I code-switch between languages. ChatGPT is just excellent at understanding uncommon words and different languages.
Advanced Voice Mode does not do text-to-speech. Speech is one of 4o’s native modes, since it’s multimodal. This is how it can understand the emotional content of your speech.