A few days ago the QwQ-32B model was released, and it uses the same kind of reasoning style. So I took one sample and reverse engineered the prompt with Sonnet 3.5. Now I can just paste this prompt into any LLM. It's all about expressing doubt, double-checking, and backtracking on itself. I'm kind of fond of this response style; it seems more genuine and open-ended.
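For anyone who wants to try this programmatically, here's a minimal sketch of what "pasting the prompt into any LLM" looks like, assuming the OpenAI Python client; the prompt text below is only my rough approximation of the style, not the actual reverse-engineered prompt:

```python
# Sketch: injecting a QwQ-style "think out loud, doubt yourself" system prompt
# via the OpenAI Python client (openai>=1.0). The prompt text is an approximation
# of the style described above, not the reverse-engineered original.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REASONING_STYLE_PROMPT = (
    "Think through the problem step by step in a stream-of-consciousness style. "
    "Express doubt when unsure, double-check intermediate results, and explicitly "
    "backtrack ('wait, that can't be right...') before committing to a final answer."
)

response = client.chat.completions.create(
    model="gpt-4o",  # any chat model works; the reasoning models are where it breaks for me
    messages=[
        {"role": "system", "content": REASONING_STYLE_PROMPT},
        {"role": "user", "content": "Is 1009 prime?"},
    ],
)
print(response.choices[0].message.content)
```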
Interestingly, this prompt breaks o1-mini and o1-preview for me (4o works as expected): they immediately jump from "thinking" to "finished thinking" without outputting anything, including the thinking steps.
Maybe it breaks some specific syntax required by the original system prompt? Though you'd think OpenAI would know to prevent this with their function calling API and all, so it might just be triggering some anti-abuse mechanism without going so far as to give a warning.
I tried this with Le Chat (Mistral) and ChatGPT 3.5 (free), and they start responding to "something" in that style, but... without any question having been asked.
Weirdly enough, a few minutes ago I was using o1 via ChatGPT and it started consistently repeating its complete chain of thought back to me for every question I asked, with a 1-1 mapping to the little “thought process” summaries ChatGPT provides for o1’s answers. My system prompt does say something to the effect of “explain your reasoning”, but my understanding was that the model was trained to never output those details even when requested.