
If you're parsing JSON or other serialization formats, I expect that can be nontrivial. Yes, the total latency is dominated by the LLM call, but (de)serialization is pure CPU work.
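A quick sketch of why this shows up as CPU cost: parsing a few MB of JSON does measurable work even with no I/O involved. (The payload shape here is synthetic, just loosely modeled on an LLM API response; actual timings are machine-dependent.)

```python
import json
import time

# Synthetic ~2 MB payload, roughly the shape of a chat-completion response.
payload = {"choices": [{"message": {"content": "x" * 200}} for _ in range(10_000)]}
blob = json.dumps(payload)

start = time.perf_counter()
parsed = json.loads(blob)  # pure CPU: no network, no disk
elapsed = time.perf_counter() - start

print(f"parsed {len(blob) / 1e6:.1f} MB in {elapsed * 1000:.2f} ms")
```

On a client handling many concurrent responses, that per-parse cost multiplies, which is why it's worth measuring rather than dismissing.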

Also, ideally your lightweight client logic can run on a small device or server with bounded memory usage. If OpenAI spins up a server for each Codex query, the size of that server matters at scale (and in cost), so shaving off MBs of overhead is worthwhile.


