Add more humans and LLMs to correct for errors. If any given human occasionally goes crazy and tries to end the world at a rate of 0.1%, requiring two humans to turn two keys synchronously cuts the failure rate to 0.1% × 0.1% = 0.0001%, assuming the failures are independent.
So, to avoid a depressed AI ending the world randomly, keep a stable of multiple AIs with different provenance (one from Anthropic, one from OpenAI, one from Google...) and require majority agreement before acting. Adjust the threshold depending on the criticality of the task at hand.
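The majority-vote scheme above can be sketched in a few lines. This is a minimal illustration, not anyone's production safety system: `quorum_decision` and its `threshold` parameter are hypothetical names, and the snippet assumes each model's answer arrives as a plain string.

```python
from collections import Counter

def quorum_decision(votes, threshold):
    """Return the plurality answer if it clears the threshold, else abstain.

    votes: list of answers, one per independently sourced model.
    threshold: fraction of votes the winning answer must exceed
               (0.5 for simple majority; raise it for critical tasks).
    """
    if not votes:
        return None  # no models answered: refuse to act
    winner, count = Counter(votes).most_common(1)[0]
    return winner if count / len(votes) > threshold else None

# Three models of different provenance vote on an irreversible action.
votes = ["launch", "abort", "abort"]
print(quorum_decision(votes, 0.5))  # simple majority: "abort" wins 2/3
print(quorum_decision(votes, 0.7))  # stricter quorum: no answer clears 70%, so None
```

Raising `threshold` per task implements the "adjust to criticality" idea: a routine query might accept a simple majority, while anything irreversible demands near-unanimity and defaults to doing nothing when the models disagree.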