https://peps.python.org/pep-0703/#backwards-compatibility
That section is short and I encourage you to read it in full. Here's what it says about how removing the GIL affects existing Python code:
• Destructors and weak reference callbacks for code objects and top-level function objects are delayed until the next cyclic garbage collection due to the use of deferred reference counting.
• Destructors for some objects accessed by multiple threads may be delayed slightly due to biased reference counting. This is rare: most objects, even those accessed by multiple threads, are destroyed immediately as soon as their reference counts are zero. Two places in the Python standard library tests required gc.collect() calls to continue to pass.
That's it with respect to Python code.
Removing the GIL will require a new ABI, so existing C-API extensions will minimally need to be rebuilt and may also require other changes. Updating C-API extensions is where the majority of work will be if the PEP is accepted.
This will be an opt-in feature by building the interpreter using `--disable-gil`. The PEP is currently targeted at Python 3.13.
This is nothing like the Python 2 > 3 transition.
> existing C-API extensions will minimally need to be rebuilt and may also require other changes.
This vastly understates the work involved. Most C extensions and embeddings will require major structural changes or even rewrites. These things are everywhere and are a major reason for python's popularity. A typical financial institution, for example, will have a whole bunch of them, with a morass of python code built on top. Many of these companies took ages to transition to python 3 (and many still have pockets that haven't!). For them to remove GIL reliance from their C extensions is a much bigger ask than migrating to 3 was, and their response to being asked to do it is likely to be blunt.
The fact that only a small minority of developers involved in python directly use the C API (and fewer understand the consequences of GIL removal for them) means the issue tends to be overlooked in discussions like this, but it's the reason a GILectomy will be worse than 2 to 3. In practice it's likely to end up being either a fork or a mode you have to switch on/off, both of which would be miserable for everyone.
The GIL is part of python's success story. It made it easy to write extensions, and the extension ecosystem made the language popular. Every language doesn't have to converge to the same endpoint. Different tools are suited to different jobs. Let it be.
First, if the extension isn’t being used in a multithreaded environment, nothing should change. Yes, it isn’t thread safe, but it doesn’t really matter in that context. And given how badly GIL-Python works with threads, I doubt the majority of extensions are written for multithreaded applications.
And the ones that are written for multithreaded use are probably already releasing the GIL for long-running computations, so they should be written with at least a little bit of thread safety in mind. And in the worst case it shouldn’t be too difficult to hack the code to include a GIL-like lock that needs to be held by library users in order to ensure thread safety, without really changing the architecture of the extension.
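For illustration, a minimal sketch of that kind of GIL-like wrapper lock, assuming a hypothetical non-thread-safe extension module named `legacy_ext` (the module name and its `process()` function are made up for the example):

```python
import threading

import legacy_ext  # hypothetical C extension written assuming GIL-style serialization

# One coarse lock standing in for the GIL: callers go through this wrapper
# instead of calling legacy_ext directly.
_ext_lock = threading.Lock()

def process(data):
    # Only one thread is ever inside the extension at a time, so its original
    # (non-thread-safe) assumptions still hold.
    with _ext_lock:
        return legacy_ext.process(data)
```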
There is no way of knowing whether a Python module is used in multithreaded environments or not. The point is that C extensions that previously worked fine in multithreaded environments probably will not work fine in multithreaded environments with the GIL disabled.
I'm sure you know this, but as a general psa: threads are the lowest common denominator way of doing this kind of thing. Where possible, it is nicer to use epoll/kqueue/whatever the windows equivalent is. For a library though I totally get how annoying it would be to interface with the platform event system on each os. Indeed, if there isn't already a python library that can handle subprocesses for you on a single thread then it might be something worth building (I don't use python personally so I don't know the ecosystem).
> Most C extensions and embeddings will require major structural changes or even rewrites.
A lot of those important "extensions" are actually C and C++ libraries designed for parallelism in their native use which have been made available in Python via bindings (e.g. Pytorch). The cores of these libraries are fine, only the binding layers (often automatically/semi-automatically generated) may need to be updated. I suspect only the C-API extensions designed from scratch to only be extensions are going to have major problems with a no-GIL world.
> Every language doesn't have to converge to the same endpoint. Different tools are suited to different jobs.
People have decided they want to use Python frontends. Plenty of alternative languages could have won out, which would have provided native threading, faster runtimes etc, but they didn't; we are stuck with Python. The continued existence of the GIL is incredibly restrictive on how you can leverage parallelism both in Python itself and in extensions. Only in the simple cases can you make the Python<->extension boundary clean: the second you want to be able to call Python from extension code (e.g. via callbacks or inheritance) the GIL stands in your way.
For every popular, well-engineered extension, I suspect there are a dozen hacked-together ones which will break the moment the GIL guarantees disappear.
I'm sure there are a lot of hacky extensions out there (particularly hiding in proprietary codebases), but letting the entire future of Python be held hostage to the poor SWE choices of third parties who almost certainly contribute nothing back is not a sustainable path.
But it is a programming language. You do not break backwards compatibility lightly. Nobody has ever chosen Python for its runtime performance. Unfortunately, sometimes you have to accept that the technical debt cannot be escaped.
> Nobody has ever chosen Python for its runtime performance.
No, they choose it for the ease of using its performant extensions. And those extensions are fundamentally limited in performance by the existence of the GIL. And the authors of those extensions (and their employers) are behind the work to get rid of it.
There are three groups here:
1. "Pure" Python users, whose code makes little/no use of extensions. GIL removal is an immediate win for these uses.
2. "Good" extension users/authors, whose code will support and benefit from no-GIL. GIL removal is an immediate win for these uses.
3. "Bad" extension users/authors, whose code doesn't/can't support no-GIL. GIL removal probably doesn't break existing code, but makes new uses potentially unsafe.
Maintaining the technical debt of the GIL indefinitely so that group 3 never needs to address its own technical debt is not a good tradeoff for groups 1 and 2 which actively want to move the language forwards.
The above arguments could apply to any breaking change that a host wants to implement: there are those who don't directly consume the change, those who can adapt, and everyone who is fine with the status quo.
Python is free to do as it pleases, but this breaking change is going to result in a lot of churn.
> GIL removal is an immediate win for these uses.
As a minor aside, the GIL free Python version was actually a performance regression, somewhere between 5-10% impact. https://peps.python.org/pep-0703/#performance . So, an immediate loss for nearly everyone.
No, this is false. Relying on the GIL isn't bad practice in a python extension. It's something you have no choice but to do. The GIL is the thread safety guarantee given to the extension by the python interpreter. It's the contract you have and you have no choice but to code to it. Removal of the GIL requires an alternative contract representing a finer-grained set of thread-safety guarantees for people to code against. Programmers who coded against the contract that existed rather than somehow divining the future and coding against a different contract that hadn't been invented yet (while also managing to make it compatible with the existing one) weren't doing something wrong.
But besides being incorrect, the idea that extensions that break on removal of the GIL are "bad" is irrelevant. The question is what the ecosystem will bear. The people who own the codebases I'm talking about are not pushing for GIL removal. They have large, working codebases they're using in production and the cost of redesigning large chunks of them would be real. GILectomy is being pushed by people with varied motives but they certainly don't speak for everyone and their case isn't going to be made successfully by casting aspersions on those with different priorities.
Relying on a published guarantee of CPython is not technical debt!
What will happen is that extensions that don't work with the new behavior will be assimilated by the proponents. Support for the new behavior will be hacked in, probably with many mistakes.
The bad results will then be marketed as progress.
it is now! just like that bill you didn't know you owed until it came in the mail, technical debt builds up regardless of whether you know it's owed.
keeping a gil when entering and exiting c extensions seems like an obvious steppingstone.
That doesn't contradict what the GP said. Those modules are largely written in C, not Python. And that is because nobody would choose Python for its performance.
> And that is because nobody would choose Python for its performance.
Python the runtime may not be the most performant by itself, but python the ecosystem only took off because of performance hacks like those popular modules, PyPy, Cython, ...
> Those modules are largely written in C, not Python.
Yet people choose C to implement those modules because Python is not performant enough, and isn't that a nice thought? Let's do something good for humanity and drive another stake into that old abomination's heart by making it less relevant.
What? Pypy and cython are hardly the reason for python "taking off".
The issue with python is that the ecosystem is so big, everyone thinks their niche is the one python was "built for". Data science, web backends, sysadmin, app scripting, superclusters, embedding... They all think they're the biggest dog, when the reality is that they're more or less all the same. Pypy and cython are probably popular in some of those niches, but python is not just about them.
The reality is that python "took off" because of two things:
- the syntax
- the ease of interfacing with other worlds, typically (but not limited to) C/C++
Removing the GIL seriously threatens the second reason, because integrations that used to be fairly easy will have to be fundamentally rearchitected. Moving from Python 2 to 3 was trivial in comparison, and it took a decade; this transition might take much longer, or never really happen, with massive losses for the ecosystem. And all for what, some purist crusade that interests only a fraction of the whole community...?
Come now, how many people compile python to use it?
How many people will be surprised / upset that they have to do that once `--disable-gil` is mainlined into the default builds on python.org?
How long before 'build it yourself and use --disable-gil' becomes 'we've enabled --disable-gil by default for Great Victory for Most Users'?
I think it's very very naive to think that the flag will remain an obscure 'use if you want to build python yourself' if the PEP is successful.
> The global interpreter lock will remain the default for CPython builds and python.org downloads.
Is a joke. It means, for now. For the work in this PEP.
I mean, you've literally got the folk from numpy saying:
> Coordinating on APIs and design decisions to control parallelism is still a major amount of work, and one of the harder challenges across the PyData ecosystem. It would have looked a lot different (better, easier) without a GIL.
That isn't 'for a few people who are vaguely interested in obscure technical stuff'; this is the entire python data ecosystem, saying "this will be the default in the future because the GIL is a massive pain in the ass".
> If it doesn't work for you, use the GIL enabled.
Sure, it can be opt-out; but you're fooling yourself if you think this is going to be 'opt-in'.
It will be the default once it's completed, sooner or later, imo.
...and when the 'default' build of python breaks c-extensions, that is a breaking change, even if technically you can build your own copy of python with different compile flags.
Long term? I doubt it. Who’s going to build those packages?
Two major parallel incompatible implementations of Python, maintained at the same time, duplicating all efforts at testing, dev, build, release train, etc. while both are maintained.
Does it sound familiar?
Even ignoring the obvious parallels to Python 3000, there’s a limited amount of time people will be bothered maintaining both.
Probably in the long term the two build modes will be merged and there will only be a "nogil" build mode.
Which doesn't mean there will be no GIL. Contrary to popular belief, the goal of PEP 703 is not to remove the GIL, but only to be able to disable it at runtime if you need free threading, so it is an opt-in feature.
In the interim, it seems that Conda has volunteered to build the extensions that are frequently used in the scientific community.
Could you elaborate on how the Python ecosystem will suffer if these financial institutions get left behind? What sorts of contributions come from these users in particular?
Have you seen PIP, conda, asyncio from 3.4 to 3.8, the standard library documentation, the great 2 to 3 migration, type "annotations" and mypy, the variable scoping "rules" or the multiprocessing module?
We python users are more than ready for whatever pain we have to deal with in the future, we are used to misery we live it every day.
True Pythonistas love misery, almost as much as we love kvetching. Without the misery, we'd have nothing to talk about. Don't get me started on typing.
> Most C extensions and embeddings will require major structural changes or even rewrites.
Sam claims otherwise: "Most C API extensions don’t require any changes, and for those that do require changes, the changes are small. For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython."
I'd also like to point out that the GIL does not inherently make C extensions thread safe. It only protects the interpreter itself. For example, calling in to the C API can already release the GIL as I explained in reply to /u/bjourne here:
The standard build that you get at python.org will work exactly as now (so with a GIL).
The disable-gil build (which will most likely be available through conda or similar) will have the option to run either with the GIL or without. It will automatically detect extensions that are not compatible and switch to GIL mode, but you can override this with an environment variable.
> This vastly understates the work involved. Most C extensions and embeddings will require major structural changes or even rewrites.
Why?
Once you have a threaded Python, you can allocate one thread to run with a GIL and the other threads without it. Old extensions can access one thread with the old GIL API and new extensions can access all the things with the GIL-less API.
People who want the performance will rewrite their extension. People who don't, won't.
I don't think it is possible to activate the GIL only for one thread; it has to be activated process-wide.
But, if you want free-threading and you have to use an old extension that is not GIL-less, then you still can, by guarding all access to this extension with a lock and setting the PYTHONGIL variable to 0.
But you will need to build the extension against the --disable-gil interpreter, or obtain such a build from somewhere if the extension author doesn't provide one, as the ABI of "standard" python and "--disable-gil" python is different.
How --disable-gil builds of python and of extensions will be distributed is not yet totally clear. I don't think PyPI has a clear idea yet of how to handle those extensions, for instance.
That's not quite good enough - people use threads to e.g. permit multiple blocking i/o operations to be in flight at once, which works perfectly well right now. So you're going to want to support multiple GIL threads if you want to be able to support existing code safely.
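To make the blocking-I/O point concrete, a small sketch (the URLs are placeholders): several downloads can overlap today because CPython releases the GIL while a thread waits on the network.

```python
import threading
from urllib.request import urlopen

def fetch(url, out, i):
    # The blocking read releases the GIL while waiting on the network,
    # so multiple requests proceed concurrently even with the GIL in place.
    with urlopen(url) as resp:
        out[i] = resp.read()

urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
results = [None] * len(urls)
threads = [threading.Thread(target=fetch, args=(u, results, i)) for i, u in enumerate(urls)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```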
Sadly, you can't just trivially bolt on thread safety to code that was not designed with it in mind.
However, if you go the other way and introduce e.g. an @nogil decorator, similar to that seen in some cpython alternatives, people will have a straightforward path to opt in incrementally as they fix and verify the critical parts of their code, while preserving known working behaviour elsewhere without having to throw entire complex systems over the wall at once.
would a fork be so bad? maybe I'm off but I feel like most people have no real need for GILless and would leave it off if they were even aware it existed. Those who would need it the most are the most equipped to handle the cost of adoption.
And the probability of a given piece of C code that was written for a thread safe environment working when called from multiple threads at once is pretty low for anything not 100% purely functional.
Honestly, that’s not my biggest worry. My bigger worry is that now Python variables your extension is reading can be changed by another thread in real time. Before if you held the GIL, you knew nothing would change while you poked some Python datastructures.
That's not the implicitness I'm talking about. While it sounds like loading old C modules will just re-enable the GIL, the problem is that they will never be updated to not rely on the old Python concurrency model. All that C code was written implicitly assuming that certain blocks of code were surrounded by a GIL.
It could be a real headache for any hoped-for transition to nogil Python if lots of GIL-reliant C code is floating around where there's little hope of updating it without having to worry about subtle bugs popping up. And even if the conversion was risk-free (which I doubt), many organizations will still not want to dig into their legacy C codebases and make significant changes.
I'm the author of a Python C extension, and having the GIL gone will be a lot of work. Code currently looks like this:
* C function called with GIL held
* Extract data needed to do work
* Release GIL
* Do work
* Reacquire GIL
* Modify data, build result
* Return
As an example of the changes, a list could be passed in. I would need some form of locking while processing that list so that mutations while processing won't crash the code.
The GIL does currently result in robust code by default because data can't mutate underneath you. Without the GIL the code will appear to work, but it will be trivial for an attacker to use mutations to crash the code. Expect huge numbers of CVEs.
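For a rough pure-Python illustration of the "mutations while processing" hazard: here the symptom is an IndexError rather than a crash, since unsynchronized C code doing the equivalent can corrupt memory instead. Whether the race actually fires on a given run depends on thread scheduling.

```python
import threading

shared = list(range(100_000))

def reader():
    try:
        # Walk the list by index; the length is captured once, so if another
        # thread shrinks the list mid-walk, the access raises IndexError.
        for i in range(len(shared)):
            _ = shared[i]
    except IndexError:
        print("list mutated underneath the reader")

def writer():
    del shared[:]  # shrink the list while the reader may still be walking it

t1 = threading.Thread(target=reader)
t2 = threading.Thread(target=writer)
t1.start(); t2.start()
t1.join(); t2.join()
```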
Nothing between Release GIL and Reacquire GIL needs to change. Depending upon your extension, possibly nothing needs to change for the other steps either. Per Sam Gross:
> Most C API extensions don’t require any changes, and for those that do require changes, the changes are small. For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython.
If you're using the GIL to protect access to non-Python objects, that will need to change.
The PEP mentions a future HOWTO on updating existing extensions. I wish that were already written.
There's disagreement among Python maintainers in the discussion thread on the PEP about how much work will be involved, and I don't expect to resolve it here.
As far as I know PEP703 includes provisions to make operations on containers thread-safe. That should at least avoid most crashes.
Borrowed references can be more problematic, but it seems most cases could be fixed by replacing GetItem with FetchItem calls.
Overall, as another C Python extension author, I don’t really think it’s going to be that much of a pain. In fact, I could even get away with no changes (other than build fixes and such) if, for example, I guarantee that each “main object instance” is only accessed by one thread, and I’d still get a lot of benefits from the nogil.
Whether the individual operations are thread-safe is irrelevant. For the code to be thread-safe it needs to acquire lst's lock before the if statement and release it afterwards.
Your example is not thread safe because the GIL does not protect critical sections like that. The GIL only protects the Python internal interpreter state. PyList_SetItem discards a reference to the item being replaced. If the ref count of the item becomes zero, the item's destructor is run. The destructor can release the GIL.
IOW, your example is not much different than the equivalent Python code:
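(The snippet under discussion isn't quoted in this thread; the following is a hypothetical stand-in for the general check-then-set shape being described.)

```python
def replace_first(lst, new_value):
    # Hypothetical stand-in for the check-then-set pattern under discussion.
    if lst:                  # check
        lst[0] = new_value   # set: drops a reference to the old item; if that
                             # item defines __del__, arbitrary Python code (and a
                             # thread switch) can run here, so the check above is
                             # no longer guaranteed to hold
```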
The "Defining Extension Types: Tutorial" also includes this warning in an example that uses Py_XDECREF:
> But this would be risky. Our type doesn’t restrict the type of the first member, so it could be any kind of object. It could have a destructor that causes code to be executed that tries to access the first member; or that destructor could release the Global interpreter Lock and let arbitrary code run in other threads that accesses and modifies our object.
Yes, you are absolutely right, but that doesn't change the fact that many extensions are written as if innocuous function calls like PyList_SetItem won't context switch. It mostly works fine since custom destructors releasing the GIL are extremely rare.
I dunno, I'd rather have a reliable bug than a heisenbug.
Any class with a __del__() method is going to release the GIL. That's not uncommon. My intuition is that threaded Python is rarer than classes with a __del__() method.
I don't know about the complexity of your extension, but the PEP provides per-container locks ("This PEP proposes using per-object locks to provide many of the same protections that the GIL provides. For example, every list, dictionary, and set will have an associated lightweight lock. All operations that modify the object must hold the object’s lock").
This is automatic via the PyList_GetItem/SetItem API, so I guess the error you're talking about is that you read a list y = [A, ...], your code reads A and copies the data to (new) A2, and then you iterate over it again and see that A != A2 because another thread has modified y?
> This is nothing like the Python 2 > 3 transition.
Well, what py2to3 promised was also not what happened, and it was by far one of the worst migration disasters in the history of open source, so you can't blame people for being skeptical.
Heh, is the Perl 5 to 6 migration already forgotten? I think this says something about how well it went, because at the time Perl was more popular than Python.
There was no Perl 5 to Perl 6 migration. Perl 6 was announced, a bunch of design work happened on it, and then it became a different language run by different people rather than a version of Perl. People are still writing Perl code extensively, and Perl 5 is still maintained.
I would expect Python to follow a similar language split. People were disappointed at how few breaking changes occurred in 2->3. If there were ever a 3->4 migration, there would be a large number of proposals to correct other deficiencies in the language.
... I mean, if you want to compare: it had exactly the same problem, so it's more evidence that this isn't something specific to Python; it's just an utterly terrible way to go about it.
The only difference is that they noticed it was a bad idea midway and went back to developing Perl 5.
And I can still write "use v5.8" in my Perl script and write a script that works on a CentOS 5 instance from 2007, if I needed to do something on a legacy system, and it will work the same on the latest Perl.
> because at the time Perl was more popular than Python.
I don't believe that was true at the time, as P6 took like a decade to decide on semantics and only then did the implementations start to happen. More existing code, sure, but much of web dev moved to PHP/Ruby and many other things to Python.
And unlike Py3 it was initially also much slower than P5, making migration kinda worthless.
Maybe I am off base here, but big migration efforts feel like they would be significantly easier in a compiled language. Potentially not a fair comparison when the tooling can automatically migrate so much code without errors.
> Posting a Wikipedia link without commentary or context is essentially a text meme. It doesn’t invite discussion.
You literally just replied, and we are engaging in discussion; empirically, the wiki link above has done the opposite of what you are saying! At any rate, the intended relevance of the wiki link regarding loss aversion is the following:
When something bad happens to someone, they tend to overestimate how bad it is relative to how good an objectively equivalent gain feels; it is thus important to be aware of such a bias, in addition to using regular old critical thinking.
In this case, critical thinking could look like the following: “although this is a ‘migration’, what do we mean when we say that? and is it really the same thing as py2 to py3?”
I would tentatively propose “no” to the latter question; and would additionally propose that calling this a “migration” is not terribly useful as although it’s not incorrect, it’s insufficiently specific and seems to invite a category error.
In addition to the loss aversion bias mentioned above.
> Destructors and weak reference callbacks for code objects and top-level function objects are delayed until the next cyclic garbage collection due to the use of deferred reference counting.
this actually does "break" a lot of things, as you would be surprised how much code implicitly relies upon CPython's behavior of immediately calling weakref callbacks when an object's reference count drops to zero. This is why keeping test suites running on PyPy can be difficult, because it has the delayed behavior.
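A small sketch of that implicit reliance: under CPython's reference counting the callback fires as soon as the last reference goes away, whereas a tracing collector such as PyPy's may not run it until a later GC pass.

```python
import weakref

class Resource:
    pass

r = Resource()
ref = weakref.ref(r, lambda _: print("weakref callback ran"))

del r  # CPython today: the callback fires here, immediately.
       # Under a tracing/deferred scheme (e.g. PyPy), it may only fire
       # after a later GC pass or an explicit gc.collect().
print("after del")
```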
> Removing the GIL will require a new ABI, so existing C-API extensions will minimally need to be rebuilt and may also require other changes. Updating C-API extensions is where the majority of work will be if the PEP is accepted.
as you note, there will be *two* versions of the Python interpreter.
that means every C extension has to be built *twice*, against both versions of the interpreter. Go look at how many files one must have available when publishing binary wheels: https://pypi.org/project/SQLAlchemy/#files . The number of files for py3.13 now doubles. It's not clear if we actually have to have both Python builds present, so would I have something like /opt/python3.13.0.gil and /opt/python3.13.0.nogil? If the GIL removal changes almost nothing, why have two versions of Python?
I don't see why you couldn't use updated "No GIL" extensions with the GIL interpreter. The "No GIL" interpreter mode will simply abort if you load an unsupported C extension.
It's worth highlighting that the GIL will still be available even when compiled with --disable-gil.
> The --disable-gil builds of CPython will still support optionally running with the GIL enabled at runtime (see PYTHONGIL Environment Variable and Py_mod_gil Slot).
You've conveniently described python behavior but lots of C code relies on the gil implicitly and will need to add locking to be correct in a nogil world.
I'm not saying for sure this is bad! I do think it is dishonest though about the potential impact. Lots of critical libraries are written in C.
This is why the proposal is to, by default, reenable the GIL at runtime (and print a warning to stderr) whenever a C extension is loaded, unless that extension explicitly advertises that it does not rely on the GIL.
Dishonest? I didn't hide that fact: "Removing the GIL will require a new ABI, so existing C-API extensions will minimally need to be rebuilt and may also require other changes. Updating C-API extensions is where the majority of work will be if the PEP is accepted."
Yeah, but everyone knows that is obviously the biggest issue. Nobody was really concerned about the Python code, which is the part you said will go swimmingly; that's table stakes for the change to happen at all. We already assumed the pure Python code would upgrade easily; a change would have no chance whatsoever of being accepted otherwise. Everyone is worried exclusively about the C extensions, and always has been. This seems to have been presented as a new approach to GIL removal that fixes the problem of C extension breakage, but it's just the same old approach we've always been considering, which breaks the C extensions. No-GIL ain't done until all the C extensions run, especially NumPy.
Most of the other comments here at the time were similarly from people who clearly hadn't read the PEP saying it would break all existing Python code.
So I did my best to represent the backwards compatibility section of the PEP. I told folks to go read it and linked to it. I cited the portion relevant to Python code. There were too many bullet points for the C-API, so I summarized it with the disclaimer "Updating C-API extensions is where the majority of work will be if the PEP is accepted."
I also read the PEP discussion thread, where there was disagreement among Python maintainers on how much work would be required of C extension authors, but most of the folks stating it would be a lot of work didn't seem to have actually tried to port anything to the new API. Meanwhile Sam had asserted that:
> Most C API extensions don’t require any changes, and for those that do require changes, the changes are small. For example, for “nogil” Python I’m providing binary wheels for ~35 extensions that are slow or difficult to build from source and only about seven projects required code changes (PyTorch, pybind11, Cython, numpy, scikit-learn, viztracer, pyo3). For four of those projects, the code changes have already been contributed and adopted upstream. For comparison, many of those same projects also frequently require changes to support minor releases of CPython.
I don't think I've misrepresented anything.
> No-GIL ain't done until all the C extensions run, especially NumPy.
I think the word "minimally" makes it sound like the changes to existing libraries are "to an extremely small extent; negligibly." "At a minimum" would fit better, because the minimum of a set of things can still be very large whereas minimally implies the quantity is very small.
But it was downplayed in the comment when it was always the #1 reason for keeping the GIL. Python level changes were never a serious part of the argument.
What do you propose as an alternative? I write code in a lot of languages and I can't think of a single one where I don't have to consider the version. This applies to C, node, ruby, swift, gradle/groovy and java at least. Even bash. When developing for Android and iOS, I have to consider API versions.
Almost every third party ML model I look at seems to have different versions, different dependencies, and requires deliberate trial and error when creating container images. It's a mess.
Having interpreters and packages strewn across the machine is a nightmare. The lack of standard tooling has created a lawlessly dangerous wild west. There are no maps, no guardrails, and you have to beware of the hidden snakes. It goes against the zen of python.
As a counter example, Rust packs everything in hermetically from the start. Python4 [1] could use this as inspiration. Cargo is what package and version management should be, and other languages should adopt its lessons.
[1] Let's make a clean break from Python3 even if we don't need a new version right now.
The ML community has horrendous engineering practices. Everyone knows this. This isn’t the fault of Python, nor should Python cater to people who build shoddy scaffolding around their black boxes.
I mean, you're not entirely wrong but Python really really doesn't make it easy.
Consider R, which is filled with the same kind of people. There's one package repository and if your package doesn't build cleanly under the latest version of R, it's removed from the repo.
Don't get me wrong, this has other problems but at least it means that all packages will work with a single language version.
> I mean, you're not entirely wrong but Python really really doesn't make it easy.
That's a vast exaggeration. It is not "really really" hard to spin up a venv and specify your requirements. People just don't do it, and blame the tools for what are bad engineering practices agnostic to any language.
"Really really" not easy would be handling C, C++, etc. dependencies.
Generally that is a straightforward process of compiling, reading the error message, googling “$dist install $dirname of missing dep”, running the apt-get / emerge / yum command, and then repeating the compile command. Sometimes people will depend on a rare and not-bundled dep, but not that often. Worst case you need to upgrade the automake toolchain or rebuild boost or something.
Maybe more time than getting python deps to work but more deterministic and takes less cleverness.
I work in data science in python (and the parent was about ML) and basically everything in that space has C and Fortran level dependencies and this is where Python is really really bad, so no it is not as simple as you're making out.
I really really wish it was, as then I wouldn't have had to learn Docker.
Python is a much older and more generalist language than R, so yes, while it would be great to impose this kind of order on things, it’s not practical for its current extent of use.
That being said, after two decades of using Python professionally, the only real problems I’ve ever encountered are “package doesn’t support this version for {reasons}” and “ML library is doing something undocumented and/or dumb that requires a specific Python version.” The former is normally because the package author is no longer maintaining their package, and the latter is because, again, the ML community is among the absolute worst at creating solid tooling.
I don't disagree that Python's place in the ecosystem ("generalist" - i.e. load-bearing distro fossilization in everything from old binary linux distros, container layers, SIEM/SOAR products, serverless runtimes...) leads to much packaging complexity that R just doesn't have
However, Python (1991) is only 2 yrs older than R (1993)
Rust and Node (via nvm) feel good. The worst I run into is “this version of node isn’t installed” and then I just add it. And I don’t have to worry about where dependencies are being found. Python likes to grab them from all over my OS.
I use direnv and pyenv. When I cd to a repo/directory, the .envrc selects the correct Python and the directory has its own virtual environment into which I install any dependencies. I don't find that Python grabs packages from all over the OS.
pyenv works locally, no matter what the project opts to use. The only thing it needs for a project 'to be managed' is a .python-version file, which you can throw in .gitignore
It doesn't matter what you do. The vast majority of code I'm using from other people doesn't. Even my personal python methodology differs from yours.
Plus, you now have to teach and evangelize your method versus the dozens of others out there. It's crazy town.
The negative thoughts and feelings I once had for PHP are now directed mostly at Python. PHP fixed a lot of its problems over the last decade. Python has picked up considerable baggage in that time. It needs to take the time to do the same cleanup and standardization.
I was describing a workflow that works for me to someone who didn't seem to have found an effective Python workflow in hopes that it can work for them too. I work across a variety of languages and none that I've worked with doesn't have some issue that I can't complain about[1]. I personally don't find Python all that painful to work with (and I've been working with it since 1.5.2), but I understand my experience is not universal.
[1] If it's not the language, it's the dependency manager. If it's not the dependency manager, it's the error handling. If it's not the error handling, it's the build process. If it's not the build process, it's the community. If not the community, the tooling. Etc. I have some languages I like more and some less. Mostly it comes down to taste. I'm not here to apologize for or defend Python. I'm only here to describe how I use it effectively, and to correct what I thought were inaccuracies with respect to removing the GIL.
I use direnv because I work with many languages and repos and I don't want each language's version manager linked into my shell's profile. As well, direnv lets me control things besides the language version. Finally, direnv means I don't have to explicitly run any commands to set things up. I just cd to a directory.
FWIW, I don't think it's nice that rustup fetches and installs new versions without prompting, but I suppose that other users like it or get used to it. Fortunately most Rust projects work on any recently stable version.
> rustup fetches and installs new versions without prompting
I don't think that's true. rustup installs a new version only when you run `rustup update`. What the parent is talking about is pinning a particular rustc version in Cargo.toml, which allows rustup to download that version of rustc to build that particular project/crate.
rustup will automatically download that version when you interact with that project, though, and that's what I mean. It doesn't sit right with me, comes as a surprise, but I guess it's not the biggest issue in the world.
Node does allow you to declare which Node version is supported in your package.json. The declaration is there, but there isn't any tool that reads it and switches versions accordingly. I feel it is somewhat half-assed. But it could also be caused by the fact that the entity that distributes the packages (npm) and the Node binaries (various Linux repositories) isn't the same group of people. So there isn't really anyone who can do anything about it unless we get something like corepack someday. (Probably someone should name it 'corenode'?)
Isn't this all handled by pip typically? Even though most models don't necessarily put it in the readme, the user should be using some sort of env manager.
I mean, Java seems like a pretty good alternative? Obviously it's trivially true that programmers have to care about versions, but they've done miracles in the VM without breaking compatibility.
Pragmas seem like the correct way to have done the Python 2->3 migration. Does anyone know of some technical limitation as to why they weren't used? It is a very obvious solution in hindsight, but I wasn't there.
I saw some people mentioning changes like changing print statement to print function. That was actually one of the most trivial changes and you could import print_function from __future__ which worked like pragmas.
A similar problem existed with the changed division behavior (which actually was more challenging), but similarly you could enable it per file with a __future__ import.
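For reference, this is roughly what the per-file opt-in looked like; the same file runs unchanged on Python 3, where these behaviours are simply the default.

```python
# Per-file opt-in to Python 3 semantics, usable on Python 2.6+.
from __future__ import print_function, division

print("ratio:", 1 / 2)  # true division: prints 0.5 instead of Python 2's 0
```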
The main problem with the migration, though, was the addition of Unicode. You can't just enable it on a file-by-file basis, because once you enable the new behavior in a single file, you will start passing Unicode arguments to code in other files, and if that code wasn't adapted, it will break.
And it was even worse than that, because the problem extended to your dependencies as well. Ideally dependencies should be updated first, then your application, but since python 2 was still supported (for a decade after python 3 was released) there was no motivation to do it.
And if that wasn't enough, python 2 already had Unicode support added, but that implementation was incorrect, so even if you imported unicode_literals from __future__ you potentially broke compatibility with existing python 2 dependencies, without any guarantee that your code would work on python 3.
IMO that particular change couldn't be done with pragmas; the core issue is that python 3 put a clear separation between text and binary data, while Python 2 mangled them together. That was still true even when you used Unicode in python 2.
The proper way to perform the migration, IMO, would have been to type-annotate the code and then run a mypy check in python 3 mode.
Back when Python 3 was initially conceived, the language just wasn’t that widely used, and mostly by enthusiasts. Some breakage wasn’t considered a big deal - it was expected users would easily update their code.
But during the time it took to design and deliver Python 3, the language exploded in popularity and reached a much wider audience, and 3rd party libraries like numpy became crucial. So when Python 3 was ready it was a completely different ecosystem which was much harder to migrate. But I don’t think the core team really realized that before it was too late.
Asdf makes all of this pretty easy. For consulting I often need multiple versions of everything to match client projects:
Just install them with asdf and put a .tool-versions file in the project folder with the desired tooling builds.
You would not need separate packages to do that (in fact, you can't do this with separate packages because dpkg will complain if two packages provide the same file).
The PEP actually states that the nogil version would also have env variable allowing to temporarily enable GIL. Although I guess in practice they might still build separate versions.
As for managing Python library dependencies, I use poetry (https://python-poetry.org), though unfortunately both it and pipenv seem to progressively break functionality over time for some reason.
pyenv is a third party tool that makes some of this easier, notably around creating more than just a virtual env in that you also choose the Python version.
Python is not hard to deal with in this regard; I think people are just uninformed.
If you’re installing a Python package into the global site packages directory (ie, into the system Python) you might need sudo. That’s how permissions work.
I don’t know the -u flag on pip; never used it, can’t find it in the docs.
With a virtual environment sudo is not needed. Assuming you created it, and/or it is owned by you.
Virtual environments are just directories on disk. They are not complex.
I don’t use conda because it’s never felt even remotely necessary to me.
how about when you are authoring a script under your own user, but then want to schedule it for cron to run periodically?
I often find myself working under my user on a remote server, but then I want to schedule a cron job, and I run into all sorts of permission bugs and missing packages.
especially when multiple machines need to run this script, and I don't want to involve containers to run a simple 20-line python script.
this is why Golang is so popular - you can just scp a single binary across machines and it will just work.
Are you kidding me? The horrendous way Python does dependency management and virtual environments, and the fractured ecosystem around those, is one of its biggest pain points, often covered by core CPython developers and prominent Python third party developers, hardly "misinformed" people.
That comic is very old. In the days of 2.x it was a little hairier, but nothing like people make it out to be.
The literal only thing you need to understand is “sys.path”. If you inspect this in a shell you will know what you’re up against. Python is all literally just directories. It’s so easy and yet people get so bent out of shape over it.
Create a venv, activate it, and use pip as normal. If you ever run into issues, look at sys.path. That’s it.
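For example, a quick way to see what you're up against is to print sys.path inside and outside the activated venv and compare:

```python
import sys
import pprint

# The import search path: inside an activated venv this includes the venv's
# own site-packages directory rather than the global one.
pprint.pprint(sys.path)
```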
Which is irrelevant. We're talking about the dependencies/packaging/virtual environments situation, not whether "it's easy to be a Python developer" in general.
And you can disagree all you want, but it's simply wrong that Python's packaging/venv ecosystem is "just fine".
There are options to do things other ways, but most of the time I just use venvs and pip for everything.
Is it because people have to use venvs that people complain about it?
I’ll admit being able to install via an OS package manager, vs pip, vs anaconda etc etc can be confusing, but is any of that really Python (the language)’s fault?
I think the primary problem with removing the GIL is the performance hit from degraded garbage collection. Quoting the issue with the implication that it isn't a big deal because it's not many words doesn't make the case for me. The delay in when garbage collection happens is, to my mind, the only reason why Java sucks balls as a memory hog vs C-based languages. Process performance is in large part based on when critical lowest-latency memory resources can be freed for the top-level working object set.