Almost all the GPT answers shown in the thread are subtly incorrect, if not outright false. The brainfuck program is utter nonsense. By contrast, I can expect Google's answers to be passable most of the time.
A major leap in accuracy should be possible by allowing it to consult a search engine. Right now it works in "closed-book" mode; there's only so much information you can pack into the weights of the net.
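For what it's worth, the "open-book" idea is easy to sketch: retrieve evidence first, then let the model condition on it instead of answering from its weights alone. Here's a minimal Python outline; search_snippets and complete are hypothetical stand-ins for a real search API and a real model endpoint, not any particular library's interface:

    def search_snippets(query: str, k: int = 3) -> list[str]:
        """Hypothetical search-engine call: return the top-k text snippets for a query."""
        raise NotImplementedError("wire this to a real search API")

    def complete(prompt: str) -> str:
        """Hypothetical model call: return the model's completion for a prompt."""
        raise NotImplementedError("wire this to a real language model")

    def open_book_answer(question: str) -> str:
        # Fetch supporting text first, then ask the model to answer
        # using only that text, rather than whatever is baked into its weights.
        snippets = search_snippets(question)
        context = "\n".join(f"- {s}" for s in snippets)
        prompt = (
            "Answer the question using only the sources below.\n"
            f"Sources:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )
        return complete(prompt)

The prompt-stuffing step is the crude version; fancier setups train the retriever and the model together, but the basic shape is the same.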
I think the main problem is that it doesn't actually have a concept of truth or falsehood; it's just very good at knowing what sounds correct. So, to GPT-3, a subtle error is almost as good as being totally right, whereas in practice there's a huge gulf between correct and incorrect. That's a categorical problem, not something that can be patched.
It isn't like Google never returns the wrong answer, either.