I think we will get there in the end but it will be slow.
When I was doing performances stuff: Intel was our main platform and memory consistency was stronger there. We would try to write platform-agnostic multi threading code (in our case, typically spin-locks) but without testing properly on the other platforms, we would make mistakes and end up with race conditions, accessing unsynchronized data, etc.
I think Python will be the same deal. With Python being GIL'd through most of its life cycle bits and pieces won't work properly, multi-threaded until we fix them.
When I was doing performances stuff: Intel was our main platform and memory consistency was stronger there. We would try to write platform-agnostic multi threading code (in our case, typically spin-locks) but without testing properly on the other platforms, we would make mistakes and end up with race conditions, accessing unsynchronized data, etc.
I think Python will be the same deal. With Python being GIL'd through most of its life cycle bits and pieces won't work properly, multi-threaded until we fix them.