The Japanese version will be pretty terrible, I'd imagine. Too many homophones and too much ambiguous wording. "sounandayo!" could be そうなんだよ ("that's right!") or 遭難だよ ("it's an accident!").
The pitch on those is quite different, and there is obviously context to rely on too. Whether the software can make use of all that is an open question, but something like Siri does OK with Japanese speech recognition (at least not a lot worse than with other languages, from what I can tell).