LLMs are a bad deal when you look at how much power inference takes. A device that could barely run one instance of QwQ-32B at glacial speeds can serve multiple concurrent Kiwix users.
But if you didn't think to ask Hacker News every single thing you'd need to know beforehand, I think you still want the LLM around to answer questions and help you bootstrap.