"Spice shows subpar scalability: The speed-up of using 16 threads was merely ~11x"
If that is true, then "Spice" is suitable only for small tasks that finish within milliseconds at most, where its low overhead actually matters. For anything bigger, something that scales better should be used.
IMO this is the least convincing part of the benchmark though, since it's uninterpretable without an optimal baseline. You don't know how much of this is because of Spice and how much is because of how the task scales. (This is acknowledged as future work.)
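To put rough numbers on that ambiguity, here is a minimal sketch. It assumes Amdahl's law is a fair model for this workload, which is my assumption and not something the benchmark claims, and the function names are made up for illustration. It asks what inherently serial fraction of the task would by itself explain ~11x on 16 threads, with a perfect scheduler:

```python
def parallel_efficiency(speedup: float, threads: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / threads


def amdahl_serial_fraction(speedup: float, threads: int) -> float:
    """Serial fraction s implied by Amdahl's law.

    Amdahl's law: speedup = 1 / (s + (1 - s) / threads).
    Solving for s gives s = (threads / speedup - 1) / (threads - 1).
    Assumes zero scheduling overhead, i.e. a hypothetical ideal runtime.
    """
    if threads <= 1 or speedup <= 0:
        raise ValueError("need threads > 1 and speedup > 0")
    return (threads / speedup - 1) / (threads - 1)


if __name__ == "__main__":
    # Figures quoted in this thread: ~11x speed-up on 16 threads.
    s = amdahl_serial_fraction(11, 16)
    print(f"efficiency: {parallel_efficiency(11, 16):.1%}")
    print(f"implied serial fraction: {s:.1%}")
    print(f"speed-up ceiling with unlimited threads: {1 / s:.0f}x")
```

Under that assumption, about 3% of non-parallelizable work in the task would be enough to cap 16 threads at 11x, no matter how good the scheduler is. Without a baseline you can't tell that apart from overhead added by Spice.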
If and only if (1-thread Spice time - non-parallelized baseline time) > 1 ns, and their tests back up their claims.
https://github.com/judofyr/spice/tree/main/bench