It's lock contention that slows things down more than anything.
But it's really an 'it depends' situation.
The fastest algorithms will smartly divide up the shared data being operated on in a way that avoids contention. For example, if working on a matrix, then dividing that matrix into tiles that are concurrently processed.
> It's lock contention that slows things down more than anything.
It's all flavors of the same thing. Lock contention is slow because sharing mutable state between cores is slow. It's all ~MOESI.
> The fastest algorithms will smartly divide up the shared data being operated on in a way that avoids contention. For example, if working on a matrix, then dividing that matrix into tiles that are concurrently processed.
Yes. Aka shared nothing, or read-only shared state.
But it's really an 'it depends' situation.
The fastest algorithms will smartly divide up the shared data being operated on in a way that avoids contention. For example, if working on a matrix, then dividing that matrix into tiles that are concurrently processed.