
My codebase is significantly larger than yours (mine's a mix of mostly C++ and some C), perhaps 10–12 million lines. Clean builds take ~10 minutes; clean builds with ccache take ~2 minutes; incremental builds take milliseconds.

I know this probably won't help with your current project, but you should think of your compiler as an exotic virtual machine: your code is the input program, and the executable is the output. Just like with a "real" CPU, there are ways to write a program that are fast, and ways to write a program that are slow.

To continue the analogy: if you have to sort a list, use `qsort()`, not a hand-written bubble sort.

So, for C/++ we can order the "cost" of various language features, from most-expensive-to-least-expensive:

    1. Deeply nested header-only (templated/inline) "libraries";
    2. Function overloading (especially with templates);
    3. Classes;
    4. Functions & type definitions; and,
    5. Macros & data.

That means that if you were to look at my codebase, you'd see lots and lots of "table-driven" code, where I've encoded huge swathes of business logic as structured arrays of integers, and even more as macros that make such tables. This code compiles at ~100 kloc/s.
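
For anyone who hasn't seen the style, here is a minimal sketch of a macro-generated table. The order types, fees, and every name in it (`ORDER_TYPES`, `OrderRule`, `shipping_fee_cents`) are made up for illustration and are not from my codebase; the point is only that the "logic" is plain data plus one plain function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical rule list: one row per order type (name, fee, signature flag).
// The X-macro keeps the enum and the table in sync from a single list.
#define ORDER_TYPES(X)      \
    X(Standard, 500, 0)     \
    X(Express, 1500, 1)     \
    X(Overnight, 3000, 1)

// Expand the list once to get the enum values...
enum OrderType {
#define X(name, fee_cents, needs_signature) name,
    ORDER_TYPES(X)
#undef X
    OrderTypeCount
};

struct OrderRule {
    int fee_cents;
    int needs_signature;
};

// ...and once more to get the table, indexed by the enum.
static const OrderRule kOrderRules[] = {
#define X(name, fee_cents, needs_signature) {fee_cents, needs_signature},
    ORDER_TYPES(X)
#undef X
};

// A plain function doing a plain array lookup: no templates, no overloads.
int shipping_fee_cents(OrderType type) {
    assert(type < OrderTypeCount);
    return kOrderRules[type].fee_cents;
}
```

Adding a new order type is a one-line change to `ORDER_TYPES`, and the compiler only ever sees an enum, a struct, and an array initializer.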

We don't use function overloading: in one place, removing it cut compile time from 70 hours to 20 seconds. Function overloading requires the compiler to walk a list of candidate functions, perform ADL, and then decide which is best. Functions that are "just C-like" require only a hash lookup. The difference is about a factor of 10,000 in speed. You can do "pretend" function overloading by using a template plus a switch statement, and letting template instantiation sort things out for you.
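
One possible reading of the template-plus-switch trick, as a rough sketch: a single function template takes an enum tag, and the switch on that tag collapses to one branch per instantiation. The `Unit` enum and `to_millimeters` are hypothetical names chosen for the example, and this assumes the tag is known at the call site.

```cpp
#include <cassert>

// Hypothetical tag that would otherwise be expressed as separate
// overloads taking Meters, Centimeters, or Millimeters types.
enum Unit { Meters, Centimeters, Millimeters };

// One name, one candidate: the caller picks the behavior with the
// template argument, so there is no overload set to rank.
template <Unit U>
long to_millimeters(long value) {
    // U is a compile-time constant, so each instantiation keeps
    // only the branch that matches it.
    switch (U) {
        case Meters:      return value * 1000;
        case Centimeters: return value * 10;
        case Millimeters: return value;
    }
    assert(false && "unknown Unit");
    return 0;
}
```

A call then looks like `to_millimeters<Centimeters>(25)` instead of relying on the argument type to select an overload.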

The last thing is that we pretty much never allow "project" header files to include each other. More importantly, templated types must be instantiated once, in one C++ source file, and then `extern`ed everywhere else. This gives all the benefit of a template (write once, reuse), with none of the holy-crap-we're-parsing-this-again issues.
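
A minimal sketch of the instantiate-once pattern, assuming C++11 `extern template`. The `Stack` class and the `stack.h` / `stack.cpp` split are hypothetical; both halves are shown in one block here, but in a real project they would be separate files.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- stack.h (hypothetical header): declarations only ---
template <typename T>
class Stack {
public:
    void push(const T& value);
    T pop();
    std::size_t size() const;
private:
    std::vector<T> items_;
};

// Explicit instantiation declaration: every file that includes the
// header is told that Stack<int> is instantiated somewhere else.
extern template class Stack<int>;

// --- stack.cpp (hypothetical source file): the only place the bodies live ---
template <typename T>
void Stack<T>::push(const T& value) { items_.push_back(value); }

template <typename T>
T Stack<T>::pop() {
    assert(!items_.empty());
    T value = items_.back();
    items_.pop_back();
    return value;
}

template <typename T>
std::size_t Stack<T>::size() const { return items_.size(); }

// Explicit instantiation definition: Stack<int> is compiled exactly once, here.
template class Stack<int>;
```

The trade-off is that only the types listed in the source file are usable; asking for a `Stack<double>` elsewhere would fail at link time until it is added to that list.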



I love your comment and it is 100% spot-on. extern C is the magic sauce for making anything fast.

The only downside is that it adds a ton of boilerplate and a lot of maintenance overhead. You need separate compilation units for everything, and then you need a sub-struct to use the pimpl approach. Fast pimpl (placement new into space reserved in the parent struct itself) gets rid of the heap allocations, but you still have a pointer indirection, and you still prevent the compiler from stripping out unused code across translation units (that’s where LTO comes in these days).
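
For reference, a rough sketch of what I mean by fast pimpl, assuming C++11. `Widget`, `Impl`, and the 64-byte buffer size are hypothetical; the buffer size is a guess that has to be checked against the real `Impl`, which is what the `static_assert` is for.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// --- widget.h (hypothetical): no <string> needed by users of Widget in a real split ---
class Widget {
public:
    Widget();
    ~Widget();
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    void set_name(const char* name);
    std::size_t name_length() const;

private:
    struct Impl;                                  // defined only in widget.cpp
    static const std::size_t kImplSize = 64;      // assumed upper bound on sizeof(Impl)
    alignas(std::max_align_t) unsigned char storage_[kImplSize];
    Impl* impl_;                                  // points into storage_, not the heap
};

// --- widget.cpp (hypothetical) ---
struct Widget::Impl {
    std::string name;
};

Widget::Widget() {
    static_assert(sizeof(Impl) <= kImplSize, "reserved storage too small for Impl");
    static_assert(alignof(Impl) <= alignof(std::max_align_t), "storage under-aligned");
    impl_ = new (storage_) Impl();                // placement new: no heap allocation
}

Widget::~Widget() {
    impl_->~Impl();                               // manual destructor call, no delete
}

void Widget::set_name(const char* name) {
    assert(name != nullptr);
    impl_->name = name;
}

std::size_t Widget::name_length() const {
    return impl_->name.size();
}
```

Every accessor still goes through `impl_`, which is the pointer indirection mentioned above, and the hand-maintained size constant is part of the boilerplate cost.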

Really, the problem is just that it’s a PITA to write compared to sticking everything in the header file.

(It’s ironic that Rust meets the first two rules by design but is still much slower than C++ to compile, though that does imply what’s already known: there’s a lot of room for improvement.)



