
The sensible way to speed up compilation 5x was implemented almost 10 years ago, worked amazingly well, and was completely ignored. I don't expect progress from the standards committees. Here it is if you're interested: https://github.com/yrnkrn/zapcc

The next major advance to be completely ignored by standards committees will be the 100% memory safe C/C++ compiler, which is also implemented and works amazingly well: https://github.com/pizlonator/fil-c



> The sensible way to speed up compilation 5x was implemented almost 10 years ago, worked amazingly well, and was completely ignored. I don't expect progress from the standards committees. Here it is if you're interested: https://github.com/yrnkrn/zapcc

Tools like ccache have been around for over two decades, and all you need to do to onboard them is to install the executable and set an environment flag.

What value do you think something like zapcc brings that tools like ccache haven't been providing already?

https://en.wikipedia.org/wiki/Ccache


> What value do you think something like zapcc brings that tools like ccache haven't been providing already?

It avoids instantiating the same templates over and over in every translation unit, instead caching the first instantiation of each. ccache doesn't do this: it only caches complete object files, and does not avoid the repeated instantiation cost inside each object file.
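
To make this concrete, here's a minimal sketch (file names hypothetical) of the redundancy in question. Both translation units below instantiate the same templates from scratch under a conventional compiler, while zapcc's compiler server keeps the first instantiation in memory and reuses it:

    // common.h (hypothetical example)
    #include <algorithm>
    #include <vector>

    template <typename T>
    T max_of(const std::vector<T>& v) {
        return *std::max_element(v.begin(), v.end());
    }

    // a.cpp -- instantiates max_of<int>, std::vector<int>, etc.
    #include "common.h"
    int max_a(const std::vector<int>& v) { return max_of(v); }

    // b.cpp -- a conventional compiler re-instantiates the very same
    // templates here; zapcc reuses the cached instantiations instead
    #include "common.h"
    int max_b(const std::vector<int>& v) { return max_of(v); }

ccache, by contrast, can only reuse a.o or b.o wholesale when their preprocessed inputs haven't changed.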


> It avoids instantiating the same templates over and over in every translation unit, instead caching the first instantiation of each. ccache doesn't do this: it only caches complete object files, and does not avoid the repeated instantiation cost inside each object file.

I'm afraid this feature is at best a very minor improvement that hardly justifies migrating to a whole new compiler. To be blunt, it's not addressing a problem that exists or even makes sense to think about. I will explain why.

I've been using ccache for years and I never had any problem getting ccache to support template code. Why? Because the concept of templates is orthogonal to compiler caches. It doesn't matter at all, if you understand how compiler caches work. Think about it. You have the source file you are compiling, you have the set of build flags passed to the compiler, and you have the resulting binary.

That's the whole input, and output.

It's irrelevant if the code features templates or not.

Have you checked whether the likes of zapcc are fixing a problem that doesn't actually exist?


> I'm afraid this feature is at best a very minor improvement that hardly justifies migrating to a whole new compiler.

Here are some performance numbers: https://www.phoronix.com/news/Zapcc-Quick-Benchmarks

> To be blunt, it's not addressing a problem that exists or even makes sense to think about. I will explain why.

Do you talk down to people like this IRL as well?

> I've been using ccache for years and I never had any problem getting ccache to support template code.

What I said is that zapcc has a different approach that offers even more performance benefits, answering the question of what zapcc offers that ccache doesn't offer.

> if you understand how compiler caches work. Think about it.

There's no need to use "condescending asshole" as your primary mode of communication, especially when you are wrong, such as in this case.


> Here are some performance numbers: https://www.phoronix.com/news/Zapcc-Quick-Benchmarks

If you look at the benchmarks you just quoted, you see cache-based compilations outperforming zapcc in quite a few tests. I wonder why you missed that.

The tests where ccache fares no better than builds that don't employ caching at all are telling. Either ccache was somehow not used, or there was a critical configuration issue that prevented ccache from caching anything. This typically happens when projects employ other optimization strategies that interfere with ccache, such as pipelined builds or extensive use of precompiled headers.

The good news is that in both cases these issues can be fixed, either by actually configuring ccache or by disabling the other conflicting optimization strategies. To be able to tell, it would be necessary to troubleshoot the build and take a look at the ccache logs.

> Do you talk down to people like this IRL as well?

Your need to resort to personal attacks is not cool. What do you hope to achieve, other than not sounding like an adult?

And do you believe that pointing out critical design flaws is "talking down to people"?

My point is very clear: your baseline compiler cache system, something that has existed for two decades, already supports caching template code. How? Because it was never an issue to begin with. I explained why: a compiler cache fundamentally caches the resulting binary given a cache key, which comprises data such as the source file provided as input (basically the state of the translation unit) and the set of compiler flags used to compile it. What features appear in the translation unit is immaterial. It doesn't matter.
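
In code terms, the model being described is roughly the following (a simplified sketch, not ccache's actual implementation):

    // Simplified sketch of a ccache-style lookup; not ccache's real code.
    #include <cstddef>
    #include <functional>
    #include <map>
    #include <string>

    struct Inputs {
        std::string preprocessed_source; // whole translation unit, post-#include
        std::string flags;               // e.g. "-O2 -std=c++17"
        std::string compiler_id;         // compiler name and version
    };

    using ObjectFile = std::string;      // stand-in for the compiled .o bytes
    static std::map<std::size_t, ObjectFile> cache;

    // The key covers everything that can change the output. Note that whether
    // the source happens to use templates plays no special role here.
    std::size_t cache_key(const Inputs& in) {
        return std::hash<std::string>{}(in.preprocessed_source + '\0' +
                                        in.flags + '\0' + in.compiler_id);
    }

    ObjectFile compile_or_fetch(const Inputs& in,
                                ObjectFile (*run_compiler)(const Inputs&)) {
        const std::size_t key = cache_key(in);
        if (auto it = cache.find(key); it != cache.end())
            return it->second;             // hit: skip compilation entirely
        ObjectFile obj = run_compiler(in); // miss: pay the full compile cost
        cache.emplace(key, obj);
        return obj;
    }

A hit returns the cached object file without ever running the compiler, which is why the feature set of the source code is immaterial to the cache.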

Do you understand why caching template code is a problem that effectively never existed?

> What I said is that zapcc has a different approach that offers even more performance benefits, answering the question of what zapcc offers that ccache doesn't offer.

It's perfectly fine if you personally have a desire to explore whatever idea springs to mind. There is no harm in that.

If you are presenting said pet project as any kind of solution, the very least that is required of you is to review the problem space, and also perform an honest review of the solution space. You might very well discover that your problem effectively does not exist, because some of your key assumptions do not hold.

I repeat: with pretty basic compiler caches, such as ccache, which has existed for over two decades, the only thing you need to do to cache template code is to install ccache and set a flag in your build system. Tools such as CMake already support it out of the box, so the onboarding work is negligible. Benchmarks already show builds with ccache outperforming builds with the likes of zapcc. What does this tell you?


It isn't only about template code. zapcc speeds up compilation when there's no cache. I tried it 10 years ago and it really reduced build times from minutes to seconds. For full builds.


> It isn't only about template code. zapcc speeds up compilation when there's no cache.

Someone else in this thread already pasted benchmarks. The observation was, and I quote:

> Zapcc focuses on super fast compile times albeit the speed of the generated code tends to be comparable with Clang itself, at least based upon last figures.


ccache works at the translation unit level, which means it isn't any better than just make-style incremental rebuilds when you aren't throwing away the build directory: it still needs to rebuild the whole translation unit from scratch if a single line in some header changes.
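
A hypothetical illustration of that invalidation behavior:

    // config.h -- included by, say, 50 .cpp files (hypothetical example)
    constexpr int kMaxRetries = 3;  // change this one line...

    // ...and every translation unit that includes config.h now yields
    // different preprocessed text, so every ccache lookup misses and each
    // of those 50 .cpp files is recompiled from scratch: headers,
    // template instantiations and all.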


> ccache works at the translation unit level, which means it isn't any better than just make-style incremental rebuilds when you aren't throwing away the build directory

You used many words just to say "ccache is a build cache".

> it still needs to rebuild the whole translation unit from scratch if a single line in some header changes.

You are using many words to say "ccache rebuilds a translation unit when it changes".

What point were you trying to make?


Zapcc speeds up incremental builds of a single translation unit. And it speeds up clean-cache builds that include the same headers in multiple translation units. ccache does not. Yes, this is a huge advantage in real situations.

Frankly your attitude in this whole thread has been very condescending. Being condescending and also not understanding what you're talking about is a really bad combination. Reconsider whether your commenting style is consistent with the HN guidelines, please.


From my perspective there is a pretty big difference between a persistent compiler daemon and a simple cache that constantly restarts the compiler over and over again.


What is this sorcery? I've been reading HN for years, and this is the first time I've seen someone bring up a memory-safe C++. How is that not all over the headlines? What's the catch, build times? Do I have to sell my house to get it?

EDIT: Oh, found the tradeoff:

hollerith on Feb 21, 2024:

>Fil-C is currently about 200x slower than legacy C according to my tests


The catch is performance. It's not 200x slower though! 2x-4x is the actual range you can expect. There are many applications where that could be an acceptable tradeoff for achieving absolute memory safety of unmodified C/C++ code.

But also consider that it's one guy's side project! If it were standardized and widely adopted, I'm certain the performance penalty could be reduced with more effort on the implementation. And I'm also sure that for new C/C++ code that's aware of Fil-C's performance characteristics, we could come up with ways to mitigate performance issues.


The design choices that make it slower can't be mitigated, either in theory or in practice. C++ competes against other, far more mature languages that make virtually identical tradeoffs, and they have never closed that performance gap in a meaningful way.

For the high-end performance-engineered use cases that C++ is famously used for, the performance loss may even be understated, since the approach actively interferes with standard performance-engineering techniques.

It may have a role in boring utilities and such, but those are legacy roles for C++. That might be a good use case! Most new C++ code is written for applications where something like Fil-C would be an unacceptable tradeoff.


It won't ever be 1x the runtime, but if you think better than 2x is an insurmountable challenge then we will have to agree to disagree on that one. And for high-end performance engineering, I expect code that is aware of Fil-C could do better than naive code simply compiled with Fil-C.


I mean, if I could accept a 2x-4x performance hit, then I wouldn't be using C++ in the first place. At that point, there are any number of other languages that are miles more pleasant to program in.


There are a whole lot of C/C++ libraries and utilities that would be great to use in a memory safe context without rewriting them. And it's not exactly easy to reach 2x C's runtime in most other languages. But again, I think that performance penalty could be significantly reduced with more effort on the implementation.


For new code, sure. But there is plenty of existing C code that isn’t going to be rewritten and isn’t that performance sensitive.


The latest version of Fil-C with -O1 is only around 50-100% slower than ASAN, which is very acceptable in my book. I'm actually more "bothered" by its compilation time (it took roughly 8x the time of clang with ASAN).


I’ve found Fil-C 0.670 to be around 30x slower than a regular build, with -O1 vs. -O2 not making much difference. Perhaps it is very dependent on the kind of code. IIRC the author of Fil-C (which I think is an incredible project, to be clear) wants it to be possible to run Fil-C builds in production, so I think the comparison to a regular non-ASAN build is relevant.


For debugging, sure. Address sanitizer is itself pretty slow.


> The next major advance to be completely ignored by standards committees will be the 100% memory safe C/C++ compiler, which is also implemented and works amazingly well: https://github.com/pizlonator/fil-c

> It's not even possible to link to unsafe code.

This makes it rather theoretical.


You’d be surprised! “Lots of software packages work in Fil-C with zero or minimal changes, including big ones like openssl, CPython, SQLite, and many others.”

It wraps all of the typical API surface used by Linux code.

I’m told it has found real bugs in well-known packages as well, as it will trap on unsafe but otherwise benign accesses (like reading past the end of a stack buffer).
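
As a hypothetical illustration, code like the following usually runs "fine" under a conventional build, silently reading a stale stack value, whereas a bounds-checking implementation such as Fil-C traps at the bad access:

    #include <stdio.h>

    int main(void) {
        int buf[4] = {1, 2, 3, 4};
        int sum = 0;
        for (int i = 0; i <= 4; i++)   // off-by-one: i == 4 reads past the end
            sum += buf[i];
        printf("%d\n", sum);           // garbage normally; a trap under Fil-C
        return 0;
    }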


> The sensible way to speed up compilation 5x was implemented almost 10 years ago, worked amazingly well, and was completely ignored. I don't expect progress from the standards committees. Here it is if you're interested: https://github.com/yrnkrn/zapcc

Of course it was completely ignored. Did you expect the standards committee to enforce caching in compilers? That's just not its job.

> The next major advance to be completely ignored by standards committees will be the 100% memory safe C/C++ compiler, which is also implemented and works amazingly well: https://github.com/pizlonator/fil-c

Again, do you expect the standards committee to enforce usage of this compiler or what? The standards committee doesn't "standardize" compilers...


Both zapcc and Fil-C could benefit from the involvement of the standards committee. While both are very compatible, there are certain things they can't fully support, and it would be useful to standardize small language changes for their benefit (and for the benefit of other implementations of the same ideas). That would certainly be more useful than anything else the standards committees have done in the past 10 years. They would also benefit from the increased exposure that standardization would bring, and the languages would benefit from actual solutions to the problems of security and compile time that C/C++ developers face every day.


The standards committee's job is not to develop products like software packages for free. It is to ensure that the language meets the requirements of stakeholders, many of whom develop competing commercial products.

Of course, these tools are of interest to the broader C++ community. Thanks for sharing.


I'm not asking the standards committees to develop them for free. They were already developed! I'm saying that the committees should acknowledge their existence, and the fact that they solve some of C/C++'s biggest problems, in some ways better than the committee-blessed solutions. They deserve attention from the committees to direct language evolution in ways that support them better and encourage alternative implementations.


I was referring to this:

>Both zapcc and Fil-C could benefit from the involvement of the standards committee.

What exactly does the standards committee do for these software projects without being involved in their development? I think there is nothing to do here that is within the scope of the language itself. Of course, if the creators of those projects come up with a cool new idea, they can submit it to the standards committee for comment. They can also comment on new standards that would make the tools stop working. But that is help going from the project to the committee, not the other way around.


> I'm not asking the standards committees to develop them for free. They were already developed!

That's great to hear. It sounds like you have everything set to put together a proposal. Do you have any timeline in mind to present something?

> I'm saying that the committees should acknowledge their existence, (...)

Oh, does this mean any of the tools you're praising has already been proposed for inclusion in the standard? Do you mind linking to the draft proposal? It's a mailing list, and all it takes is a single email, so it should be easy to link.

Where's the link?

https://isocpp.org/std/submit-a-proposal


> Both zapcc and Fil-C could benefit from the involvement of the standards committee.

I think there is a hefty dose of ignorance in your comment. A standardization process is not pull-based, it's push-based.

If you feel you have a nice idea that has technical legs to stand on, you write your idea down, put together a proposal, and then get in touch with committee members to present it.

The process is pretty open.

> Certainly more useful than anything else the standards committees have done in the past 10 years.

Do you understand that the "standards committee" is made up of people like you and me, except they got off their rear ends and actually contribute to it? You make it sound like they are a robe-wearing secret society, secluded from the world.

Seriously, spend a few minutes getting acquainted with the process, what it takes to become a member, and what you need to do to propose something.


FWIW, tooling being as important as it is, it has always seemed like a mistake to me that the standards committee doesn't standardize compilers.


> Of course it was completely ignored. Did you expect the standards committee to enforce caching in compilers? That's just not its job.

There are also quite a few compiler cache systems around.

For example, anyone can onboard a tool like ccache by installing it and setting an environment variable.


Why should anyone use zapcc instead of ccache? It certainly sounds expensive to save all of the compiler's internal data, if that is what it does.

As I'm sure you must be aware, these compiler tools do not constitute a language innovation. I'd also imagine that both are not production-ready in any sense, and would be very difficult to debug if they were not working correctly.


Zapcc can, and frequently does, speed up the compilation of single files by 20x or more in the incremental case. ccache can't do that. And that is by far the most common case when compiling iteratively during development. A speedup that large is transformational to your development workflow.


The readme does a really bad job of explaining this. They are also using the wrong metrics. What's really interesting is the time to incrementally recompile after changing one header file in $LARGE_PROJECT. Single-file compilation sounds like an artificial measurement and isn't exactly painful with normal clang today. Full-build measurements are rather irrelevant in a project with a proper incremental setup, or even ccache. Not that I would say no to faster full builds :)


ccache speeds up compilation of single files by quite a lot, by effectively avoiding unnecessary recompilation. There are distributed caches that work like ccache too. Compiling a file for a second or fiftieth time with no changes because of doing a clean build is the absolute most common case. Maybe zapcc does additional caching of compiler internal state, but I would have to look into it to see if it's actually good enough. Also, ccache works with many compilers, whereas zapcc is its own compiler. That is a huge advantage.


ccache does not speed up compilation at all; in fact, it slows it down. It only speeds up re-compilation of the same translation unit (as in, bitwise-identical preprocessed source code), which is often not all that useful, especially when the local development rebuild use case is already covered by make.


ccache hardly slows anything down. It is a thin wrapper around running the compiler. You seem to kind of understand how it works, but it has multiple configurable ways to detect whether a file should be compiled. It does a LOT more than make, which does NOTHING to handle clean rebuilds or multiple builds of similar code in different locations. Unlike make, it does not rely on file timestamps alone to decide whether to rebuild an output.


I have a cpp file that takes 10 seconds to compile. I change one line. How does ccache help me in this case?


When you do a clean build, it will pull that output from the cache if it has been compiled before and has not changed since. It does not technically speed up compilation of files. It bypasses compilation when it can.


ccache only caches the individual object files produced by compiling a .cpp file.

C++ build times are actually dominated by redundant parsing of headers included in multiple .cpp files, and also by redundant template instantiations in different files. This redundancy still exists when using ccache.

By caching the individual language constructs, you eliminate the redundancy entirely.


Why should anyone eat apples instead of oranges?

ccache doesn't add anything over make for a single project build.


Yes, it does a LOT more than make. You can set it up to cache files for every copy of a project in your workspace. So if you kick off 10 related builds, for example, only the files that differ will be compiled twice. Although I guess it depends on what you mean by a "single project build". Even in the case of one copy, a clean rebuild will use the cache to speed things up.


zapcc looks really promising! Unfortunately, it seems to be unmaintained: there haven't been any commits in 5 years.



