It is interesting to read how the compiler behaves when using exceptions, but the number of generated instructions is only one metric. As always, there is nothing better than measuring the runtime of your actual implementation.
I'm writing this because I had a tight-running loop around a call that returned a bool, and in 1 out of 100k or so cases it would return false. It turned out that instead of checking that bool in each loop iteration, it was actually significantly faster to check nothing and throw an exception in that very rare case. Something like this:
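Roughly along these lines (foo, process() and EndOfFile here are just stand-ins, not the actual code):

    // Exception variant: no per-iteration check; read() throws when the input runs out.
    try {
        while (true) {
            process(foo.read());        // throws EndOfFile in the rare case
        }
    } catch (const EndOfFile&) {
        // reached maybe once per ~100k iterations
    }

instead of this:

    // Check variant: eof() is evaluated on every single iteration of the hot loop.
    while (!foo.eof()) {
        process(foo.read());
    }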
This might not be the most C++-like code, but it saved me something like 50% of runtime in one case. Afaik, as long as exceptions are not thrown, there is zero overhead, which would explain this. Then again, the call to eof might have been too complex and long-running; maybe I could have optimized that one instead. But there are abstractions that might hinder such optimizations.
You are making two function calls in the second loop vs. one in the first. If the result of read() could be checked for eof, then you would remove one call. Also, since the branch is so easily predicted, I would be surprised if there were much of a performance hit once the second call is removed.
If you see a 50% speed-up in the above code, I question what read is doing and the implementation of eof(). I would expect read() to dominate the if statement unless eof is extremely expensive. I have written dozens of read()/eof() functions over the years and read() always dominates because the read involves copying memory and/or making OS calls, etc., while eof() is a simple comparison. So if read() dominates runtime, example 1 and 2 above should not be anywhere close to 50% different.
That particular code came from a bit reader class. So, if you are reading only a few bits per call, then the read call becomes very cheap. It might be as "simple" as this pseudocode:
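(Reconstructed sketch; the member and helper names below are illustrative, not the real implementation.)

    // Cheap bit-reader read(): usually just a few shifts and masks.
    uint32_t read(uint8_t bitsWanted) {
        if (bitBufferSize < bitsWanted) {
            refillBitBuffer();          // pulls from the byte buffer or the file;
                                        // this is also where EOF is actually noticed
        }
        const uint32_t result = static_cast<uint32_t>(bitBuffer) & nLowestBitsSet(bitsWanted);
        bitBuffer >>= bitsWanted;
        bitBufferSize -= bitsWanted;
        return result;
    }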
And yes, I think the eof call might have been too complex especially in contrast to the short read call. It probably should only check a flag that might be set by the read call automatically.
At the very least, you are redundantly executing logic in the version without the exception. The check for eof has to be done implicitly inside read anyway, because it has to fill the bit buffer with data from the byte buffer, or the byte buffer with data from the file. And if both fail, then we already know the result of eof, so there is no need to duplicate the eof check in the outer loop that calls read.
Here is the full commit with ad-hoc benchmark results in the commit message:
In this commit, I even replace std::optional with an exception. But I didn't document benchmark results, and the commit message begins with "try", so they probably weren't significantly better, but also not worse, or else I wouldn't have committed it:
Without seeing the assembly it is difficult to say, but for such a large difference the culprit often seems to be the loop µop cache. If you check the alignment of the top of the loop instructions (assuming it inlined away the calls), sometimes you can see an alignment change for some trivial bullshit that has vast performance impacts because it causes a series of other missed optimizations.
That damn loop cache is great for performance when you hit it, but sucks horribly when you don't; you don't even think about it and spend days trying to figure out some performance quandary.
In the exception code, a foo.eof() check needs to be inside foo.read().
So I guess the runtime in theory should be the same, and the implementation just happened to be slow. Maybe eof() did some syscall each time instead of caching an EOF int like in C.
An underrated topic which I haven't seen discussed much is what you should actually do in the real world when you encounter certain exceptions. Too many examples just kinda gloss over that by saying "oh you should handle any errors or exceptions here".
If my memory allocation fails or if some file-related operation fails, what are the chances that I can actually do anything about it? Sometimes the best thing to do is just let it crash and log the event.
It would be great to see a book about common strategies for handling exceptions and other forms of failure in real world applications. I'd imagine that certain strategies for handling exceptions would also require you to write your application in such a way as to make it possible to recover from failures as well.
If something goes wrong, there is not that much you can do. AFAIK the options are 1) retry 2) fall back to something else 3) give up and complain to someone. The options are simple, even if none of them are particularly nice.
There are a lot of variations on these, of course: log it, and to where? keep unfinished work around somewhere or crash and burn? what if even logging the failure fails?
A guy I once worked with had a really good approach to exceptions: don't.
Basically, if your code would throw, then instead of throwing, do something about it. Roll back the filesystem or the transaction so things can retry or the process can exit.
Ideally, do not rely on someone reading a log message.
Instead, do what that person reading the log would need to do.
If that's an escalation to support staff or the CEO, send the email. If the developer reading the log would just press a retry button, press the button for them.
Basically, giving up was just not an option, and it worked pretty well tbh. Sure, we had logs, and they were crucial for debugging failures, but weekend callouts were not a thing. We never had to babysit processes, and never had to worry about filesystem corruption if you wanted to kill -9 a process mid-run.
Sure that works for application code; for library code, you are raising an exception so that the application code can decide the domain-specific thing to do that the person reading the log message would do. Exceptions are raised by code that can be used in multiple contexts, where the person reading the log message would be doing entirely different things.
I agree with part of this. When possible, code should either succeed or revert back to the state before the operation, before throwing. Avoid half-completed work if at all possible. This is in general quite easy: first do the work with the higher chance of failure, and only then connect that work to the rest of the system state. E.g. don't add an item to a list and then do something with it; do it the other way around and only add items to the list once something has already been done with them.
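A sketch of what I mean (Item, Config, prepareItem and connectToBackend are placeholders, defined elsewhere):

    #include <utility>
    #include <vector>

    struct Config { /* ... */ };
    struct Item {
        void connectToBackend();            // may throw
        /* ... */
    };
    Item prepareItem(const Config&);        // may throw

    // Do the failure-prone work first, only then touch shared state.
    void addItem(std::vector<Item>& items, const Config& config) {
        Item item = prepareItem(config);    // if this throws, nothing has been modified yet
        item.connectToBackend();            // if this throws, still nothing to roll back
        items.push_back(std::move(item));   // only now does shared state change; if
                                            // push_back throws, the vector is unchanged
    }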
If you are in a known failure scenario, you can try known resolutions. If some necessary resource like a DB or file system disappeared, try to self-heal as soon as the resource reappears. But if you're in unknown territory, better stop working and complain. Writing a message in a log is good enough, IF some monitoring system will pick up that log and alert the right channel.
I've seen systems trying to self heal, but working on wrong assumptions. Operators now have 2 problems: Fix the system AND stop it from damaging itself. Self-healing can turn to self-damaging quickly.
I’ve written a few systems that were aggressively self-healing and operated them in production. The benefit is as you say. When done well, the systems kind of run themselves and require much less attention than systems that are not designed this way. From an operations perspective it was great. From a software development perspective, not so much, and this largely explains why it is uncommon.
In all typical software architectures, many places in the code do not have enough context to handle exceptional conditions. Single errors may have multiple possible root causes that have to be determined by inference or deduction in the code so that the handling is appropriate. Evaluating some causes requires complex code far outside the purview of the software’s main purpose and possibly skill set of the developers. Appropriate resolution of an exception at a single call site can be context dependent — not only do you have to determine the root cause at runtime, you also have to determine the correct resolution at runtime. A single resolution may need to implement multiple strategies to take into account real-time environmental context that change how that resolution is handled.
Making this logic maintainable requires an architecture that pretty heavily revolves around the software infrastructure required to make this type of exception handling scalable. You’re replacing all of the error handling idioms every software engineer knows with something alien that colors the entire code base. Also, there is little in the way of robust frameworks that do a lot of this grunt work for you so you are usually left writing your own.
The tl;dr is that implementation is quite expensive and difficult in practice, even though it usually has no performance overhead and is great from an operations perspective. While people like the idea, the software development overhead is usually considered too high to justify making operations’ life easier.
I've certainly had success with an application with a very aggressive transactional approach; every button press would commit-or-crash, and in the event that it did crash you could simply resume from another terminal by logging in with a quick key fob tap.
Exceptions only make sense if there are failures to handle. If you don't use exceptions for failure, something else like an 'if not ok goto handle' needs to be in there. This is what C does explicitly, and Rust implicitly if you look through Result.
So a better comparison would be: let construct() return a bool, and do manual error handling if it returns false. I'm pretty sure the compiler will generate much worse code in this case.
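E.g. something along these lines (a sketch modeled on the article's construct()/destruct() example, not taken from it):

    // Error-code style: construct() reports failure via its return value.
    extern bool construct();   // returns false on failure
    extern void destruct();

    bool f() {
        if (!construct()) {                  // a
            return false;
        }
        if (!construct()) {                  // b failed: undo a
            destruct();
            return false;
        }
        if (!construct()) {                  // c failed: undo b, then a
            destruct();
            destruct();
            return false;
        }
        destruct();                          // success path: tear down c, b, a
        destruct();
        destruct();
        return true;
    }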
Yes, I didn't originally know that (I thought constructors and exceptions would have been co-implemented and didn't consider that fallible ctors would have been outright impossible for a decade), and found out after pjc50's reply: https://news.ycombinator.com/item?id=33572348
People used to use a flag in objects to say 'invalid because constructor failed', or make the constructor infallible and have a separate initialise method which could return an error code.
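E.g. a sketch of that old-school pattern (using a POSIX file handle purely as an illustration):

    #include <cerrno>
    #include <fcntl.h>
    #include <unistd.h>

    // Pre-exception style: infallible constructor plus a separate, fallible init().
    class File {
    public:
        File() : fd_(-1) {}                       // cannot fail
        ~File() { if (fd_ >= 0) { close(fd_); } }

        int init(const char* path) {              // 0 on success, errno value on failure
            fd_ = open(path, O_RDONLY);
            return fd_ >= 0 ? 0 : errno;
        }
        bool valid() const { return fd_ >= 0; }   // the "constructor/init failed" flag

    private:
        int fd_;
    };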
I mean... yes, but easy multiple-value-return unpacking is also a fairly new feature. Back in the 90s you had one (1) lvalue, making the "return (thing, error)" pattern miserable to use.
I was going to say that such an alternative history C++ could have had MRVs earlier, or even sum types... but apparently exceptions were added a decade after constructors (if we draw that back to the original "C with classes").
So yeah, you're right that constructors led to exceptions, unless the designers of C++ had been willing to go back to the drawing board and entirely redo constructors.
Indeed. It doesn't make sense to compare code designed to handle errors with code that simply ignores them - yet I see it all the time when people talk about C++ exception handling.
Is this refuting a claim that is actually made? Do people claim exceptions never have overhead on the happy path in C++?
As an analogy, forget exceptions for a moment—just imagine plain function calls in C++ vs. Python. In Python, function calls always have overhead. In C++, though, we say function calls don't add overhead, because function calls can be inlined. But when we make that claim, we obviously don't mean function calls never have overhead in C++—obviously, the statement can't apply if you can't inline the call. Rather, the meaning of the claim is that function calls don't inherently introduce overhead—i.e. if they do, it's because of some other fact (like inability to inline), not the mere existence of the function call itself.
The case seems similar to me here. When people say exceptions don't slow down the happy path, they don't mean this is impossible. They just mean that if this does occur, it's due to additional inherent obstacles beyond the mere enablement of exceptions (e.g., opaque function calls).
A great post. Note that the post is specific to C++ and does not apply to, say, Java. Java is effectively always paying the cost of exception handling, so you might as well use it.
A better way to put it: Java doesn't have destructors (mostly because it has a proper GC, so it doesn't have to spend time piddling around and freeing objects), so you never have to pay most of the costs that C++ pays.
Freeing heap objects in C++ is an expensive operation. You may want to look into how that is implemented in the common allocators.
In Java, an allocation is literally a single instruction (adding a value to the heap pointer), and freeing is zero cost (since you just drop the pointer after you're done with it).
By contrast, in C++ both allocation and freeing of heap memory are costly operations (which is there is so much emphasis on stack allocation).
As it turns out, the time taken by the garbage collector in Java (which doesn't exist in C++) is still going to be less than what is spent on managing the same number of heap objects in C++.
The only reason why C++ can be faster than the Java memory management is thanks to all the stack allocation.
> In Java, an allocation is literally a single instruction (adding a value to the heap pointer), and freeing is zero cost (since you just drop the pointer after you're done with it).
There are other hidden costs in Java GC allocation. While most of the time an allocation will be just two instructions (it cannot just increment a heap pointer; it also has to check whether there's enough free space and jump to a slow path if there isn't, which needs a second instruction), there is also some overhead in the GC tracking of modified references (card marking), and either a short pause in each thread while the GC reads its registers (to get any pointers stored there) or the extra overhead of having to store references on the stack instead of just in registers, and probably some missed optimization opportunities due to having to do things in an order the GC expects.
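For reference, the fast path described above looks roughly like this (a generic bump-allocator sketch, not actual HotSpot code; tlab_top, tlab_end and allocate_slow are made up):

    #include <cstddef>

    extern char* tlab_top;                    // current bump pointer
    extern char* tlab_end;                    // end of the thread-local allocation buffer
    void* allocate_slow(std::size_t size);    // refill the TLAB or trigger a GC

    void* allocate(std::size_t size) {
        char* result = tlab_top;
        if (size > static_cast<std::size_t>(tlab_end - result)) {
            return allocate_slow(size);       // the extra check-and-branch mentioned above
        }
        tlab_top = result + size;             // the "single instruction" bump
        return result;
    }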
The garbage collector itself has some annoying overheads: it can waste cache and memory bandwidth by scanning cold objects, it can cause unnecessary swapping in of swapped-out pages, and most importantly, it uses more memory. In fact, that's the main trick of GC: it trades extra memory usage for allocation/deallocation speed.
> The only reason why C++ can be faster than the Java memory management is thanks to all the stack allocation.
No, the main reason C++ can be faster than Java memory management is that it has value objects, which can avoid a lot of heap allocation (and pointer chasing) even without replacing it with stack allocation. Consider for instance an array of objects: in C++ it can be a single allocation, while Java requires one allocation for each object, plus another allocation for the array itself. Which can also be considered a hidden cost of Java GC allocation: to be able to track objects correctly, the GC requires that references to an object point to the beginning of its allocation, so an object cannot be inlined within another and still be tracked correctly by the Java GC.
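Concretely (sketch):

    #include <vector>

    struct Point { double x, y; };     // a value type

    // C++: one contiguous allocation holding all the Points, no per-element objects.
    std::vector<Point> points(1'000'000);

    // The Java equivalent (a Point[] filled with new Point(...) per element) needs one
    // allocation for the array of references plus one per Point, and every element
    // access chases a pointer.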
By Java 5 the average cost of an allocation in Java was already lower than heap allocation in C++, but greater than the cost of stack allocation. With escape analysis in Java 8 that got even better.
The biggest problem Java had was that 90’s era thinking on GC algorithms (most GC languages in the 90’s were using 80’s era GC algorithms) ran out of steam sometime around 2005, when memory availability created situations where GC pause times were impossible to hide. Cliff Click created a JVM that actually ran as a VM, in order to create a cheaper mark phase via page faulting to implement mark on write. Now we have other options.
Pauseless [0]? It originally used hardware support on Azul machines to trap upon reading an unmarked pointer, but now (in the C4 generational incarnation, as well as in ZGC) this is done in software. But the other collectors use snapshot-at-the-beginning and a write barrier for concurrent marking, which originated in 1990 [1].
And, if one wants to be picky, Pauseless is a variation of Henry Baker's "real-time" copying collector from 1978 [2], although Pauseless uses tags corresponding to various interesting states (marked/unmarked, forwarded/unforwarded) rather than just from-space/to-space.
So, I went to replicate the bottom case, and particularly wanted to do it more "fairly" by actually isolating the resulting code rather than staring at a ton of complex intermediate assembly :/. I'm thereby building a real binary--after carefully separating the compilation stages using -c and avoiding any linker optimization--instead of using -S, and then using objdump -d, and then--because this is the whole point of talking about the happy path--I'm only providing the code of the actual functions and not any subsequent exception landing pads.
Now, first off, I hope people can see that this is a much more fair comparison. I am not sure why the author of this article wanted to use -S, but it seems more than a bit disingenuous to be complaining about how "tidy" the happy path is after pasting a million assembler directives at us as if those actually have a runtime performance cost instead of actually looking at the resulting execution paths through the assembly that results.
(And I want to be very clear here: this isn't because I somehow got a different result. If you carefully read through the assembly presented in the article and mentally skip all the assembler directives, you will see they got something similar, with a call to the constructor followed by three calls to destruct immediately one after the other. That said, I am not sure if the author actually understood what they were looking at, as they didn't provide the code for __ZN1AILj5EvEC2Ev, which is clearly part of the happy path.)
But... yeah: it added the cost of a single extra call indirection for some reason. I'm not entirely sure why, but this frankly doesn't feel like a ridiculous cost. I will also note that it actually generates identical code to the case without exceptions if you merely add a single "inline" keyword to the code on the constructor for A (note: I mean the one on line 9 that is ostensibly empty, not the one on line 16). That said, you also probably wouldn't actually WRITE such a no-op constructor in the first place, and it turns out deleting it also fixes the resulting code to be exactly the same with and without exceptions, so I'm not sure of the applicability here... this just feels like such a corner case :/.
(The actual reasonable complaint to make about "zero cost" exceptions--which this article didn't make, because it seems that it didn't occur to the author to do an actual timing analysis, which is where I was hoping this article would go--is that the exception tables tend to get mixed in with the rest of the normal functions, which massively pads out the code space, which will lead to many more TLB misses and page fetches while executing the happy path. There should be a way to move these landing pads all to the end of the executable to be loaded into memory only if absolutely necessary; but, if that's already possible, I personally don't know how to do that :(.)
In an ELF binary the exception tables are not intermingled with executable code but are coalesced into the .eh_frame section loaded in a separate read-only non-executable segment. The happy path is truly just the happy path standing alone.
Sorry, I don't know all of the correct terminology. Is it the "landing pads"? Here is the code that is generated for main (after removing that constructor so it ends up without the extra call), and this is all in the text segment... and this definitely isn't "the happy path standing alone".
The issue isn't the non-executable code: it is that it generates a lot more executable code that it intermixes with all of the rest of the code next to the code that relies on it, which is great for the performance of quickly being able to jump into the exceptional case but heavily dilutes the code cache and implies many more page faults and TLB misses.
The happy path is 401160 through 401185. The unwinding code no more dilutes the code cache than any other subsequent and possibly unrelated functions in memory.
The unwinding code is not there to make it faster since it's generally reached after walking the stack twice. It's moved out of the happy path to make the happy path faster since it doesn't dilute the instruction cache and doesn't hit the pipeline.
There is much more to this than the superficial code analysis. Especially on Windows, there is a whole framework for how exceptions are handled, etc. To me, exceptions are super expensive; we should avoid them if we can.
The OS should normally have no involvement in C++ exceptions, and that certainly isn't the case on any operating system except Windows, where you can request that exceptions are bridged with structured exception handling; but I honestly don't believe that is common anymore, and even if you do opt into that (which I will argue you should probably never do, though I've sometimes been shocked that people still give a shit about the MSVC ABI :/), I also do not believe it is going to add a direct runtime cost to the happy path on 64-bit Windows (which is all that we are analyzing here anyway, as the legacy ABIs usually used for 32-bit Intel systems definitely had a non-zero cost), as I am quite confident it is also implemented with tables.
For a while, C++ exceptions on MSVC were implemented on top of the OS SEH. People still have scars from catch(...) catching segmentation faults. This changed more than a decade ago though, and has never been the case for 64-bit code.
I think the move to FP cements my view that exception handling is more trouble than it's worth. Golang is good too in that it doesn't use exceptions except in the extreme. Java with checked exceptions was kinda fashionable for a while but is now unloved. Down with exceptions!
I haven't been watching the benchmarks lately, but for quite some time OCaml was notable for being nearly as fast as C. No other functional-friendly language was even close.
I don't see how wrapping everything in Try is significantly different than wrapping things in catch blocks at the end of the day. I see more people complaining about checked exceptions than I actually see them being used anyways.
> Let us now assume that we hide the function definitions:
    extern void construct(), destruct();
> The generated code in that case becomes noticeably longer
This was hard for me to follow, but I think that when the function definitions are removed, the compiler no longer knows that the constructor can't throw an exception, so now it has to add exception handling code?
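Roughly, I think this is the situation (my own sketch, not the article's exact code):

    // The compiler only sees declarations, so it must assume construct() can throw.
    extern void construct();
    extern void destruct();

    struct A {
        A()  { construct(); }   // potentially throwing, as far as the compiler knows
        ~A() { destruct(); }
    };

    void f() {
        A a, b, c;
        // If c's constructor throws, a and b must still be destroyed, so the
        // compiler has to emit the cleanup/landing-pad code it could previously
        // prove unnecessary when the (non-throwing) definitions were visible.
    }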
Even in the first, "most optimised" case, the compiler is still generating more instructions than necessary. It should only need to generate one mov and one ret.
Usermode exception handling performs no system calls. Perhaps you're thinking of user handling of system exceptions on Windows, which are pretty much unrelated to C++ exceptions?