I was really struck by a comment Jonathan Blow made on stream recently: he said he's never written a parallel for loop in his whole career. I seem to recall the implication being that they're often not really necessary for performant code. There's also been some discussion lately about issues with asynchronous code in both Rust and Python. Point being that parallelism still has a ways to go before it's proven its usefulness. However, I agree with you that it would be nice to see more language tooling to make it simpler, since I work on some bits of code that I think could benefit from parallelization, but the amount of work I'd have to put in means it's a very low priority given the savings.
I never wrote a parallel for-loop in 15 years working on Firefox, because it's hard in C++, it's risky and difficult to maintain the thread-safety invariants, and it's not all that useful in most parts of the browser.
I write them quite often in Rust, because Rayon makes it super easy, there is almost no risk because the compiler checks the relevant thread-safety invariants, and I'm working on different problems where data parallelism is much more useful.
I've used them extensively in C++. Doing it manually by managing your own threads is a pain, but simple OpenMP-based parallel loops work really well, and they also support things like building vectors and simple reductions.
When your loop body uses complex library APIs over complex data, it's still hard in C++ to be confident that everything's thread-safe and you're avoiding data races.
Maybe it's not so hard if you're in a domain like HPC where the libraries you use are designed specifically to be used with data parallelism. But when you're pulling together code from different sources that may or may not have been used in an aggressively parallel application before...
I think it's less about libraries and more about the general approach to programming.
In the HPC world, software is usually doing one thing at a time. Most of the time it's either single-threaded, or there are multiple threads doing the same thing for independent chunks of data. There may be shared immutable data and private mutable data but very little shared mutable data. You avoid situations where the behavior of a thread depends on what the other threads are doing. Ideally, there is a single critical section doing simple things in a single place, which should make thread-safety immediately obvious.
You try to avoid being clever. You avoid complex control flows. You avoid the weird middle ground where things are not obviously thread-safe and not obviously unsafe. If you are unsure about an external library, you spend more time familiarizing yourself with it or you only use it in single-threaded contexts. Or you throw it away and reinvent the wheel.
If the APIs that you're interacting with are side-effect free then it's easy. If they are full of side effects, then they aren't written with multithreading in mind and you wouldn't be able to even compile it in Rust. C++ just takes off the training wheels.
It's a bit more complicated than that, because code can be thread-safe but not side-effect-free, but basically you're just restating what I said. C++ makes it hard to be sure code is really safe to use across threads, which means in practice developers should be more reluctant to do so.
The world is full of highly parallel programs getting useful work done. Most graphics, AI and compression libraries (picking 3 easy examples I've worked on) parallelize well, and can usually make use of all the cores you can throw at them.
Jonathan Blow makes good games, but chooses not to make particularly CPU intensive ones. That's fine, but that's also his choice.
He's also currently building one of the fastest compilers around. It's hard to believe he's never encountered use cases where parallelism makes sense.
Indeed, he wasn’t saying parallelism is not useful, just that the specific construct of a parallel for loop was not in his wheelhouse for certain reasons.
My impression of Jon's work is that he wants low-level enough access to his hardware that he's the one making the decisions about where and what runs. A language-level parallel for is definitely not that. :D
Parallelism and asynchronous code are not the same, and in the case of Rust they are very much not the same. Parallel for provides massive advantages for many things including game programming (from experience) so with all due respect I think this says more about Jonathan Blow than it does anything about "parallelism still needing to prove itself."
I wrote a parallel iteration (map-reduce) last week in some CPU-heavy code; it took 5 minutes with Rayon. It sped my code up by around 10x on a 12-core machine, with an example benchmark going from 7 seconds to 700 milliseconds. It's serious business.