Python libraries shipped by distributions are so old that this mechanism is mostly useless for python development.
This also applies to many other programming languages that have their own packaging systems.
While Python packaging is indeed messy, the needs of traditional Linux distributions are by far the least important. Python packaging needs to better serve the needs of Python developers, not those of sysadmins.
Having a "linux distro" interpose itself between you and your libraries is a fundamentally broken system not worth fixing. All decisions related to dependency choices fundamentally belongs with upstream.
Perhaps the distributions would be more inclined to include up-to-date versions if the standard in the Python community were not to break everything all the time.
There are distributions that keep up-to-date, though, e.g., archlinux.
Serving the needs of python developers vs. sysadmins is a false dichotomy. Python developers develop on a system that they need to admin. One great thing about linux is that everything on a system can be kept up-to-date using just one software tool (a package manager). You are not going to convince me in a million years that this is a bad idea.
Now, for things that are needed in development you sometimes need different versions, in particular if you happen to have an OS that ships very old versions. For that purpose there should be some sort of tool and not a gazillion tools that are all incompatible and behave in slightly (or not so slightly) different ways. 'python -m venv' is different from 'virtualenv'? WTF?
Also, if your distribution is up-to-date (e.g., archlinux) and you are pinning to older versions, there is a problem as well. As the article puts it, 'pin their dependencies to 10 versions and 6 vulnerabilities ago'. And if you actually want to maintain your software in the future, you have to move to the new version at some point anyway.
>Serving the needs of python developers vs. sysadmins is a false dichotomy. Python developers develop on a system that they need to admin.
I couldn't disagree more. Even if the same person is admining a system they develop on, they almost certainly aren't going to admin the systems their users deploy on. The admin role and developer role should be completely separate with different goals and requirements.
The system Python should only be used for system scripts. The end. Nowadays ideally it shouldn't even be visible to non-admin user accounts, and users certainly shouldn't be installing packages into the python being used by system scripts, not even using virtual envs.
The python your system uses and how it's configured should be decided by the distro. If they want to break it up into weird packages and funky paths, whatever. That's their problem because they and maybe (_maybe_) system admins should be the only people using it.
As a developer you should be making your own decision about which version of Python you are using, what modules get installed into it, what venvs you have and how you manage them. This should all be considered with one eye firmly fixed on the target deployment environment and how your Python will be configured there. The right answer, of course, being that you should be packaging the required python along with the application.
This is exactly the approach RHEL has taken with version 8. Typing ‘python’ results in ‘command not found’. All the system tools that use python are set to use a custom path dedicated for those tools.
You may not be able to hide it completely, but basically, if users can write Python scripts and run them using the system Python, that's a side effect. Its purpose is to run system scripts, and any other use must not interfere or compete with that in any way.
To be honest, that wasn't always the case. In the early days Python was provided on these systems for users just as much as administrators, and back then there weren't any system scripts shipped as part of the OS that used it. However, now there are, and the packaging and configuration of the system Python has in some cases even been mangled somewhat to suit the needs of the distro. It's time to make a clean split between system Python and user/developer Python.
Don't put the system Python in $PATH. Any command that needs the system Python can be launched through a shim/wrapper script that sets the PATH. (Or, even better, execute that Python directly by its full path so the system Python never has to appear in PATH at all.)
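A minimal sketch of that idea, assuming a hypothetical dedicated interpreter path (RHEL 8 does something similar with /usr/libexec/platform-python): the system tool's shebang names the interpreter directly, so nothing depends on a "python" in $PATH.

    #!/usr/libexec/system-python3
    # Hypothetical system tool: the shebang points straight at the distro's
    # dedicated interpreter, so this script runs the same way regardless of
    # which "python" (if any) is on the user's PATH.
    import sys

    def main() -> int:
        print(f"running under {sys.executable}")
        return 0

    if __name__ == "__main__":
        sys.exit(main())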
> Perhaps the distributions would be more inclined to include up-to-date version if the standard in the python community was not to break everything all the time.
Yep... Perl is a godsend... take code from 20 years ago, run it on a modern system, and everything works.
Python? Three different software versions need three different versions of the same library, and new library versions are not backwards compatible with their older versions... Even the Python 2.x -> 3.x transition was a pain in the ass, and even within minor versions you sometimes get breaking changes.
> Perl is a godsend... take a code from 20 years ago, run it on a modern system, and everything works.
That's more or less like "take a VB6 binary from 20 years ago, run it on some modern Windows, and everything works" - that's just because the ecosystem is effectively dead, so supporting it on new releases just means carrying over some stuff that worked 20 years ago.
But it works... it's universally supported on pretty much every os, even by default on most *nix based ones, and new versions and new CPAN modules are still written.
In 10 years, with Python 4, or maybe even 5, everything will still be broken, you'll still be claiming Perl is dead, and I'll still be using the same stuff I use now, which worked years ago, works now, and will work then.
And people will still be shipping code faster and more efficiently than they would do in Perl. Because in the end, the advantages of using Python, in terms of readability and productivity, easily offset a bit of packaging pain - whereas the disadvantages of using Perl don't offset whatever marginal gain you get by using old infrastructure.
> And people will still be shipping code faster and more efficiently than they would do in Perl.
I feel that quite a bit of software written these days is designed to have a short shelf life. I wonder how many of the "Show HN" posts will work, or even be useful, in five years? How much will be maintainable?
I've been thinking a lot about Matthew Crawford's writings on mechanical things[0] and how they might apply to my own craft as a software developer. The work, as described generally on HN, is still all about moving fast and breaking things. It doesn't matter if the Python code I crank out today works in five years (or is maintainable in five years) -- I just need to get my product to market. Some of us work in other domains where sustainability and repair-ability are important: if my software is going to be in the field, in users' hands, for ten years, I need to consider how reasonable it might be to fix bugs. If I have to fight just to get the software to run, I've already lost that battle. In these cases, the dependability of something like Perl is really great. The complaint against Python here isn't about the language itself (I think many people would agree it is quite nice to use), but rather the larger ecosystem, which makes it very hard to maintain software over the long haul.
[0] See Shop Class as Soulcraft, The World Beyond Your Head, and Why We Drive.
> I feel that quite a bit of software written these days is designed to have a short shelf life. I wonder how many of the "Show HN" posts will work, or even be useful, in five years? How much will be maintainable?
Yep, all the python2 code from 5 years ago doesn't work anymore... It's sad how getting python2 to work on newish distros is becoming a great pain in the ass.
HN has been around for at least 5 years, right? Probably some interesting stats on how many 'Show HN's are live, dead or decomposed. Live: updated in the last year. Dead: not updated in the last year, but still available. Decomposed: not findable/available.
But that code won't work in five years, because all the distros will remove python3 in favour of python4. (Also, the code from 5 years ago written for Python 2.x doesn't work now, because distros removed python2.)
Do you really want to rewrite all your stuff every 5 years?
As an end user, I care more about code quality and stability than I do speed of shipping code. I don't want to use some buggy code that was pushed through production too quickly just because the devs are lazy!
Yep... if a thing works, don't fix it.... Modern devs "fix it" until it's broken. Just look at google and their communication platforms... every few years a new one that kills off the previous one, with practically no added value.
There is a lot of software, from 7zip, total commander, putty, vlc, windirstat, etc., that does one (or a few) things; people have been using them for decades, and pretty much all the features have been there for that long, without an artificial need to "ship something new, fast" every few weeks.
Or instead use another modern language? I avoid any python that I can’t apt-get/brew and haven’t considered writing anything with it for a very long time. There is always go/ruby/c# which seem to generally avoid the problems discussed here
This is the sad reality of modern development... new project? Why take some stable tech, when you can take a 2month old framework and an alpha version of a library, and do two rewrites, before the project fails due to breaking changes and abandoned software. It seems as if people actively avoid anything stable.
use Lingua::Romana::Perligata;
adnota Illud Cribrum Eratothenis
maximum tum val inquementum tum biguttam tum stadium egresso scribe.
da meo maximo vestibulo perlegementum.
maximum comementum tum novumversum egresso scribe.
meis listis conscribementa II tum maximum da.
dum damentum nexto listis decapitamentum fac
sic
lista sic hoc tum nextum recidementum cis vannementa listis da.
dictum sic deinde cis tum biguttam tum stadium tum cum nextum
comementum tum novumversum scribe egresso.
cis
That's a really weak gotcha: Perl 4 was deprecated 28 years ago and Perl 6 is a completely different language that's been called Raku since 2019. There are no dependencies to resolve between them. It's like worrying about dependency resolution between Javascript and Java. They're not even the same platform.
True. But Raku (https://raku.org #rakulang) does have an Inline::Perl5 module, which allows you to use 99.9% of modules on CPAN (basically, only the ones using source-filters, and ones that are really, really deep into the Perl internals). So there *can* be dependencies between Perl and Raku.
I'm sorry, doesn't pip already work there? Otherwise, there's pyinstaller, which is great, but it requires an entrypoint, so it won't do for a pure lib that doesn't expose any script at all.
Instead of assigning blame we should interpret the tension as a sign of an important need having no good solution. It's quite challenging to overlay packaging/versioning systems... but theoretically it's an interesting and general question. Maybe people should model and solve the whole issue (if there's a solution at all, which I hope).
The fundamental problem is the difference between Curated and Uncurated packaging systems:
- Linux distros have curated packaging, often outdated but stable.
- PyPI, npm, and others are uncurated: anyone can publish, and packages could be unstable or insecure.
It's down to the developer to decide which route they want to take, and at the moment most want to move quickly with the latest tools. To do that you have to go the uncurated route. It only becomes an issue for a developer if their software is published by a Linux package management system, but 99.9% of developers will never have that.
> The fundamental problem is the difference between Curated and Uncurated packaging systems
Exactly this! It is expected from linux distros that they have curated packaging. I think that is good and I really expect it to stay that way.
Whether you want curated or uncurated packages depends on the use case.
As the user of a program, I definitely want curated packages.
As the developer of a program, I want to specify myself which version of a dependency I want to develop against and I don't want to be hindered by the linux distro in doing that. I do think that developers in that context are not always supported that well in linux distros. And on the other hand, I do think that tools supported by the programming language can assist in that scenario. (for example installing multiple versions of the compiler and runtime in the users home directory and being able to easily switch between those versions on a project basis...)
> Python packaging needs to better serve the needs of python developers, not those of sysadmins.
Please substitute s/sysadmins/users/, and realise your developers are users too.
I've been doing Python for almost 15 years; and I'm getting really fed up with some things. Packaging is a mess. Distribution is a mess (for servers/IoT - Docker saves the day; for desktop - I feel like giving up). Managing the installations is a mess; upgrades can be impossible - I'm hard-stuck on Python3.6 on one project!
I find myself rewriting many smaller tools in Go or Rust, just because I can upgrade the toolchain at any time, and/or ship a static binary. But Rust has a very high barrier to entry, and Go tends to be simplistic.
I'd fully jump ship today, but Python has just too much momentum behind it.
I agree to some extent, but I believe they were mostly trying to convey that much of the work is done by volunteers who are largely only motivated/have the resources to test a small subset of potential deployment scenarios. Those contributions are still valuable and contributors themselves don't owe you anything. The best way to fix this issue would be to get involved (or switch languages as you suggested, that's fine too).
Language specific package managers are the antithesis of package management. There is no "management" in "install multiple versions of stuff in this 10 levels deep dependency tree". Upstreams can pick their dependencies, but they cannot control what packages depend on them and make it into your app, and neither can you. That's a job for your distro.
There are different cadences and roles in play here.
Cadences:
Distribution - e.g. Ubuntu 21.10, 22.04; RHEL 8.4, 8.5, ..., 9
Library - e.g. simplejson 3.17.4, 3.17.5, ..., 3.18.0
Language - e.g. Python 3.8.0, 3.9.0, 3.10.0 (This makes Python particularly annoying because for a C project it shouldn't matter if you used gcc 8 or gcc 9 when producing binaries, but with Python it very much matters which version of Python you run with)
Application - e.g. Gimp 2.8.22, ..., 2.9.0
Roles:
As a sysadmin you want to use the distro package manager to install tools for administering your system so that you can connect to wifi, monitor resources, etc.
As an application developer you want to talk to the language specific package manager so that you can use the latest versions of libraries so that bugs are fixed, and you're not shackled to people using RHEL 6 or 7.
As a Debian/Fedora packager you want to talk to the distro because that is required if you want to submit an application upstream to support users who want everything in the distro package manager.
As an application packager you may want to talk to the distro because of the above, but you could also target flatpak (or snap) so that you can use all the latest libraries without worrying about packaging for slow moving distributions.
As an end user you want to use the distro package manager because that's your embedded mental model and workflow. But you should definitely consider using flatpak (or snap if that floats your boat) so that you can use the release stream wherein the libraries are unpinned from the underlying distribution's cadence.
As an end user you should never need to deal with the language specific package manager to install applications or libraries. If you need to install using the language package manager (npm, gems, rocks, cargo, pip) then the application has not really been 'distributed' or 'packaged' imo. If you're doing this, you're off-piste trying out new stuff.
There are more details and a great conversation that can flow from this, but language specific package managers are not the antithesis of package management; they supplement it -- just not for the 'end user' role. Well, yes for the 'end user' role, just not directly.
Wouldn't it be better if the Python programmers did a bit of planning? Or are things so boring that every Python minor version must be incompatible with other Python minor versions?
Features get added. So you can't use a match statement in 3.9 but you can in 3.10. If you write a library you need to be conservative in which features you use, or applications can't use your library.
Why not just upgrade applications ASAP? Sometimes things are removed over three or so versions. So you can't move from e.g. 3.9 to 3.10 without making sure you're importing types from the correct location:
> Aliases to Abstract Base Classes in the collections module, like collections.Mapping alias to collections.abc.Mapping, are kept for one last release for backward compatibility. They will be removed from Python 3.10.
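As a hedged illustration of the kind of change that forces, library code usually switches to the collections.abc location and, only if it still has to support very old interpreters, falls back:

    # Forward-compatible import: the plain collections.Mapping alias is gone
    # in Python 3.10+, so prefer the collections.abc location and only fall
    # back on interpreters old enough to lack it.
    try:
        from collections.abc import Mapping
    except ImportError:
        from collections import Mapping

    def is_mapping(obj):
        """Tiny usage example: accept any dict-like object."""
        return isinstance(obj, Mapping)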
These things are planned. It's just much much harder to handle this because you need the interpreter at runtime while in compiled languages you just use `-std=C99` and you can compile the code and then link.
What happens if your project needs an update, but it will hose the OS? Do you expect someone working in Python on Windows who wants to distribute a datetime package to package it for RPM and APT?
I used to think OS package managers should be the end-all be-all, but the use cases for OS package managers are very different from language runtimes. While different Linux distros were fighting between themselves, they completely ignored use cases for projects and language runtimes. Sadly, the result is a mess for everyone.
I think people need to figure out what approach works best for them. I'm of the opinion that any core technology to your business needs to be decoupled from the OS. It makes OS updates too messy and you're tethered to whatever the OS supports. My field uses a lot of Python and every company figured out quickly they need to run their own binaries in addition to packages.
At one company, they packaged up custom RPMs. It wouldn't be a problem to package up and distribute Python libraries. Others had their own package system (no OS or runtime fit their needs). It seems like most people use something like virtualenv.
Regrettably, this means there's no easy answer for people new to Python and the right choice will probably change as you grow. But I really think the answer is Python should come up with something that works for Python and let OSes do their thing.
This was 10+ years ago, but I remember something like the installed Python package had a caching bug that was affecting production. The update wasn't compatible with some OS scripts so the machine would no longer boot using the updated package. Problems like that came up fairly often, but I remember that was the most egregious.
Another scenario is that the OS shipped with Python 2.5 (supported until May 2011), but we had third-party tools that required Python 2.7 (shipped July 2010). Switching OSes (where things like monitor or hardware drivers weren't yet supported) was a ridiculous pain to test and certify. Decoupling OS and Python+package versioning was a huge relief for everyone, but won't make sense for everyone.
> Absolutely not. I expect a serious user of that library on an apt based system to package it and submit the package to their distro.
I try my best to personally do this and push for a work culture that does this, but even if this was done I can't fathom waiting on an OS update for existing code to percolate down. The risk tolerance, scope of concern, and agility between an OS and whatever project pays the bills are very different.
> All decisions related to dependency choices fundamentally belongs with upstream.
No. As a user I want dependency management (and all of software distribution, to be honest) to be handled by the party that's best able to keep things working while at the same time keeping them secure. Linux distributions have a much, much better track record at that than most upstreams.
I really doubt that the Python libraries packaged by Debian are any more secure or stable than the latest release of those libraries. At best they just limit breaking updates to once every few years, when they update them.
It’s essentially like version locking packages except some random Debian maintainer decides when it’s time to update.
> I really doubt that the python libraries packaged by Debian are any more secure or stable than the latest release of those libraries.
They are more stable because I can keep using the same version for two years, and I'm not being pushed to the latest version that has (intentional or unintentional) breaking changes every two months. Yes, there might be a bug or two in there that have since been fixed, but I very much prefer the failure I know over unexpected failures.
They are secure because Debian (and distros like it) backport security fixes to their packages. You can argue about whether they do a good enough job keeping up with vulnerabilities, but at least I know that once I install the update from Debian, my machine is secure, and I don't have to wait for the upstream authors of all software on my machine to release updates that upgrade their dependency.
> It’s essentially like version locking packages except some random Debian maintainer decides when it’s time to update.
Yes, but version locking isn't my problem. The crucial difference is that distros pick a version and support those for years, while upstreams usually force you to use the latest version all the time to get security support. With distros _I_ get to decide when I upgrade, and the reduced frequency is a nice bonus. Having a single entity for all software on the system is also valuable, as there's just one tool to learn and one place to check for updates.
That is not "provably more secure or stable". I think it's pretty safe to assume that maintainers backporting security fixes is more secure than just not updating at all, but even that isn't proven. It being more secure than updating is much more questionable, and is probably going to vary greatly between packages.
Not only Debian, but Red Hat, Suse, Canonical and others provide both free and paid security updates and many large companies are happy to pay quite a pretty penny for that.
Yes, exactly this. Only I view it from the other way round: to try to do development with distro-supplied language packages is a category error.
Distro-supplied interpreters and their associated libraries are there for the applications supplied and supported by the distribution. Unless you are developing something to be part of the distro, they are not for you.
Do not let the distro get between you and your libraries. Supply your own.
>Unless you are developing something to be part of the distro, they are not for you.
. . . forgetting the original purpose and mindset behind Linux in the first place. From a 1990s point of view, it was always intended to be a hobbyist OS, and in most cases one had to compile all the binaries and the kernel oneself.
There was no such thing, really, as a "user" or "admin" - everyone was considered, and expected to be a developer.
Yes, and distributions arose to solve the problem that plagued early Linux: how to have a set of applications and libraries with consistent and mutually compatible versions of everything. If you're compiling everything yourself, building it so it all works once is an achievement. Keeping on top of as many moving targets as there are binaries in the system so that everything keeps working over time is not practical once the number of moving parts gets high enough.
Distros solve a real problem, but the trade-off is that some parts of the system must exist to serve itself.
Exactly agreed on that.
I don't understand what value distributions are providing by repackaging Python libs; they're always way too old to be usable, and they're global, while I work on many projects with their own incompatible requirements.
Maybe I am dumb, but I exclusively use virtualenv and pip..
> I don't understand what value distribution is providing by repackaging python libs...
I want to easily and safely use some app my distribution ships. I want to receive security updates automatically for all such apps. I don't care what language it's written in or what its dependencies are.
These app packages provided by the distribution have dependencies that are also packaged by the distribution so that dependency resolution works.
Since the point of a distribution is that it can run apps, the value is that a distribution works at all.
I agree with this for pure Python libraries. However, as soon as you get into things that bind to C libraries (Numpy, GDAL, etc.) it quickly becomes much easier to use the package from your distro.
Everyone complained, no one addressed the fact that for years the whole of python packaging was handled by like 2.5 people.
But as a rant this, like most of the ‘but just fix it’ rants, fails to acknowledge the hugely diverging needs of different users. I could not live without conda, since it’s the only sane way to get a working recent geospatial stack. Others need to run embedded environments, or portable ones, some need long term stability while others need bleeding edge packages that haven’t been released yet. Solving for a single case is straightforward; solving for all of them, not so much.
> Everyone complained, no one addressed the fact that for years the whole of python packaging was handled by like 2.5 people.
If I can make an observation - it has nothing to do with the number of maintainers. The problem is deep, cultural and occupies a difficult space where it might be a bug or a feature.
The root cause here is that the Python project, and surrounding community, have little real respect for backwards compatibility. The complexity of Python setups is driven by the need to run multiple - potentially even mutually incompatible in the case of 2.x v. 3.x - versions of Python.
All languages have packaging problems, but Python is unique in my experience in the sheer number of Python installs that I need to manage simultaneously. I still have C code that works from around the time that I learned C. I'd need another Python environment installed to say the same thing about Python.
Python 2 was supported for 20 years. Python 2.7 (largely compatible with other Python 2 versions, and including backported changes from Python 3.1) was supported for 12 years.
The Python project went absolutely above and beyond to support users who wanted to drag out making not particularly complex changes to their codebase for over a decade.
During this same time they improved Python 3 in response to feedback and, among other effects, made it less different from, and easier to port from, Python 2.
Many common packages supported 2.7 up to its end-of-life as well.
One of the reasons that Python is often a source of compatibility errors is that both distros and large standalone applications embraced Python in the early 2000s, became dependent on a particular version, and then refused to work with newer versions.
Python is not responsible for all the engineering decisions everyone writing in the language has ever made.
Yes. That is what a culture of not supporting backwards compatibility looks like. If they valued backwards compatibility they wouldn't have to support multiple versions of Python over long stretches of time. They would be supporting 1 version that was backwards compatible with code written over 20 years ago.
It really does have to do with the number of maintainers. A huge part of the work is not just building a package manager, but coordinating many stakeholders to all use it. That would also mean building a package manager that covers all of those use cases from day one - what's the point of trying to make people use something that doesn't meet their needs. These things are a huge amount of work. Unsurprisingly, other widely used languages have the same situation - dozens of build and packaging systems for C/C++, Java, Javascript...
And yet, other languages ship package managers [1] that are widely loved within their ecosystems and cover everything from microcontrollers to server applications.
I think what it makes it more difficult in the case of Python is that it has decades of legacy to deal with. No consistent semantic versioning, packages that expect that they can modify their package path in-place (this is a nightmare for immutable systems, like NixOS or OSTree-based system like Silverblue), a wide variety of build systems that sometimes hook into make, etc. Solving this is a hard problem.
This is why an authority like PSF has to step in and say: this is how it is going to be done from here onwards.
Rust has an “advantage” here in that it’s not generally shipped with your distro’s package manager. I think my biggest problem with this article is that the distros put Python there in the first place, and all of them apply patches to make Python work how they want it. And when it doesn’t work, it’s Python’s fault…? I mean yeah, Python can definitely do something to make distributing Python easier, but it can only do so much without distros’ direct involvement.
I’ll add that most of Linux distro packaging contributors are generally very nice people, understand the problem at hand, and are very open to collaboration. But sometimes you see this kind of “it’s all your fault” complaints and it’s doing exactly the opposite of helping the cause.
Cargo is a good example of a survivorship bias of sorts. It's so deeply integrated into rust that if you can't stomach it you just leave the ecosystem and everyone that remains likes it ;)
Is it fair to compare an interpreted language and its package manager to Rust and Cargo? Python packages ship their source (in most cases), depend on a locally installed interpreter (with semantics possibly changing by version).
Yes, Python packages often make poor assumptions about what setup.py can do (i.e., _anything_), and so you end up choosing between "tested, supported by the author, and old" or "untested, unsupported, but up to date".
Rust almost does the exact same thing, so I'd say it's fair. Dependencies (crates) are grabbed in source format and compiled locally as part of the build process, and installing Rust programs through Cargo also compiles them (and their dependencies) locally.
Some crates have the same issue where build scripts rely on outside tooling being installed, but it's definitely not common (unless you're relying on compiling C/C++ code for FFI, for example, in which case it's somewhat frequent).
I think there needs to be a distinction between fetching crates as source at _build time_ and what happens with Python. Python's "build time" will still require the source to be present on the target machine it's deployed to -- unfortunately these tools are complex because "packaging" is only half of the issue, it's also the distribution and _deployment_ which makes things messy.
Consider that, even if you want to use packages only from your application's virtualenv, the default (footgun warning!) is that Python will still use the "system" packages -- this means you may have installed Ansible or some other tool that relies on Python and many packages from the distro package manager. But your app could pick up one of those dependencies! At best, this will work fine. But in the worse cases, perhaps it subtly behaves differently or simply does not work at all.
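A small sketch of how to check for that on a given machine (output and paths will obviously differ):

    import sys, sysconfig

    # In a venv, sys.prefix points at the environment, while sys.base_prefix
    # still points at the interpreter the venv was created from.
    print("in a virtualenv:", sys.prefix != sys.base_prefix)

    # sys.path lists every directory imports can be satisfied from; if
    # distro-owned site-packages directories appear here, distro-installed
    # packages can leak into the application.
    for entry in sys.path:
        print("import search path:", entry)

    # Where pip would install pure-Python packages right now.
    print("purelib:", sysconfig.get_paths()["purelib"])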
My understanding is that Rust will, by default, statically link all of these dependencies. This, in Python, would be like a "pex" or "par" (or one of the many other options :^)), which does make the distribution aspect much simpler. (At the cost of build-time complexity, slowdowns, and occasional incompatibility.)
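For the single-file flavor of that idea, the standard library's zipapp module gives a rough sketch (pure-Python dependencies only; pex/shiv go further). Directory and entry-point names here are made up:

    import zipapp

    # Bundle a directory tree (your code plus pure-Python deps copied into it)
    # into a single executable .pyz archive.
    zipapp.create_archive(
        "build/myapp",                     # hypothetical source directory
        target="dist/myapp.pyz",
        interpreter="/usr/bin/env python3",
        main="myapp.cli:main",             # hypothetical entry point
    )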
Python is just a more fractured community, for reasons that have never really been that clear to me.
But it doesn't harm the point I was getting at, which is that being a dynamic language doesn't give Python an excuse compared to Rust in itself when there are highly successful adjacent examples. Obviously there are other factors that come into play, or it would be ancient history by now.
Worth noting that the PSF has no authority to tell people how to do their packaging.
Rust's Cargo is 20 years newer than Python and benefits from those decades of experience. It's a very different proposition to start a new system from fresh than to try to migrate a huge and diverse community towards it.
This is actually where a BDFL should step in. "We're doing X for Python 5, we're using it to manage the stdlib as well as community packages, everyone get ready because I have decided."
They added the ensurepip module so users could add it after the fact.
Distros ripped it out too.
Arguably the advantage newer languages like Rust and Go have in this regard is they don't even consider the distro use case - you're going to get static linking and you'd better like it. Whereas Python is from an older era and tries to fit in with the local customs and so gets hammered for its inconsistency
Kinda different: Rust was like 3 years old when it started getting popular. Python was 13+ years old when it started getting very popular.
You cannot compare a legacy solution to a modern one -.-
Perl is 33 years old, and perl 5.x is 27 years old.
I can literally take a 20+yo book and all the examples still work. CPAN still works. I literally have 20+yo scripts still copied from server to server, from laptop to laptop, without any changes.
Maven was launched 9 years after Java got popular. Everybody was using Ant, everybody decided Ant sucked and moved to the new solution. While Gradle is the new kid on the block, it keeps the infrastructure of dependency management and is thus compatible with Maven to that extent. It can be done.
No, but ivy was commonly used. That said, the bigger point is that the standard way to do things can be changed. For some reason, despite (or because of?) PEP, the python community seems unable to coalesce around standard ways of packaging.
Perhaps I have lived too long out in the provinces, but Maven was my first experience of dependency management in Java. After Ant it was a no-brainer, because Ant didn't do dependencies. I didn't come across references to Ivy for some years, and personally I have never seen it used in the wild.
That said, I didn't actually like early Maven that much, it was pretty inflexible and often required the AntRun plugin to do something novel. However, it is still top dog everywhere I work and very much the standard way to do things.
Which is exactly what my second paragraph said :). This is a common struggle for many older programming language ecosystems; e.g., I think the same is true to some extent for C and C++.
As one of the sibling commenters mentions, there are good examples showing that it is possible to standardize packaging better. Maven replaced IDE-driven builds and Ant in the Java ecosystem and added proper package management. Additionally, it required that projects start conforming to standardized layouts, by taking convention over configuration and being largely declarative. I think the Maven success story lost some of its shine with Gradle, but that's another story.
I'm not saying it solves every problem for everyone, but have you looked at Nix? I've found it to be great for portability (at least on other Linux distros), and you can easily install anything from stable nixpkgs packages to Git repos. I'm currently using it to manage OS packages and development dependencies for some of the projects I work with (a couple at Work™).
This is my decision flow and I rarely have an issue:
Are you the end user of the Python code?
Yes -> Is it available in your distro?
Yes->Use package manager
No->Use pip install in user mode
No -> Create virtualenv with the Python version you want (including pypy!) and do your pip thing there
Some extreme use cases may benefit from anaconda, but personally I've never needed to use it. My only pain point is dealing with legacy code that relies on PYTHONPATH. Nothing good ever starts setting PYTHONPATH.
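A minimal sketch of that last branch using only the standard library ('requests' is just an example dependency):

    import subprocess, venv
    from pathlib import Path

    env_dir = Path(".venv")

    # Create the environment (run this with the interpreter you want in it,
    # e.g. python3.10 or pypy3) and make sure pip is available inside it.
    venv.EnvBuilder(with_pip=True).create(env_dir)

    # Install into the environment by calling *its* pip, so nothing touches
    # the interpreter running this script or the system site-packages.
    pip = env_dir / "bin" / "pip"          # Scripts\pip.exe on Windows
    subprocess.run([str(pip), "install", "requests"], check=True)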
Pipx is good for the case of "I want to run a standalone Python application that is available through Pip, but not my system's package repo." This is a more common case than you might think.
It's a sensible alternative to `pip install --user`, and having self-contained deps for a tool is a bit like `npm install --global` or even `volta install`.
It doesn't address the greater issue tho: that it's getting harder and harder for distributions to package things right and provide packages for their users (evidenced by the fact that you need a second package manager just for Python stuff).
How is Pip any different than Cpanm, NPM, Gem, Luarocks, Nimble, Go's thing, Cargo, whatever JVM people use, whatever Haskell people use, etc. in that regard?
Distros have a hard job, but at the same time programming language tooling devs have more "customers" than just distro maintainers.
This is a great recipe for disaster. Whatever you install in user mode will shadow anything installed system-wide, so when you try to run some system-wide project, it may now fail. I'm also not a fan of how it drops scripts into `~/.local/bin`, since that's where I keep my own scripts, and it is version controlled.
The installation will also be frozen and never get updated -- unless you remember to do it manually.
Finally, and worst of all, this leaves you in a dead end if your packages have conflicting dependencies, which is too often the case in Python-land.
I used to just use pip to install to the system. Months/years later I would try to untangle the mess of packages I was just playing with, what the OS wanted/needed, I got those conflicting dependencies you mention, etc. I usually ended up reinstalling the OS. At the time I may not have been as knowledgeable about where the OS package manager keeps packages vs pip--but the whole thing wasn't very user-friendly either.
For years I've been installing into user knowing I can just blow it away. I've dabbled with virtualenv, but it's such a pain to set up and activate. If I have a few projects with similar libraries it's more of a pain to set them all up and switch around. If I end up using a script for something important, I just spend the extra time at that point to "package" it.
This is one of the reasons people use Anaconda/miniconda for non-data science work: conda environments are self-contained Python installs, so if you conda/pip install packages into those environments, they will not break each other. This design requirement arose from the specific needs of numerical computing (which always drags in a ton of system-level C/C++/FORTRAN dependencies), but is a generically useful design construct.
Anaconda is a distro, and conda is a package manager, that works across OS platforms and hardware architectures, and installs cleanly into userland without requiring admin privileges. The only way we achieve this difficult goal is by creating a distro and build system that creates "portable" packages that can be relocated/relinked at install-time.
Ultimately, Python's challenges in this department come from the fact that it has such great integration with low-level C/C++ libraries. This gives it super powers as duct tape/glue language, but it also drags it down into the packaging tech debt of C/C++. Hmm... maybe I should write that blog post: "Python Packaging Isn't The Problem; C/C++ Is." :-)
I was slow to get to grips with venv. It sounds like you are on the same path. This note tries to be constructive advice -
* Some distro software uses python. Let the package manager take care of dependencies for that.
* For everything else, use a dedicated virtualenv for each codebase you are working with.
> I used to just use pip to install to the system
Never do this, for the reasons you cite.
> I've dabbled with virtualenv, but it's such a pain to set up and activate
Setup for virtualenv: "python3 -B -m venv venv". Have a shell alias 'alias v=". venv/bin/activate"' that allows you to activate it if you need to install libraries or access a shell. "pip install blah" for library install. That should be all you need.
> If I have a few projects with similar libraries
> it's more of a pain to set them all up and switch around
Have a think about why you feel this way, and whether you could mitigate the problems.
Here is what I do. Once my libraries are installed for the current project, I rarely activate venv in the current shell. Rather, for each python project, I have a bash script "app" in the root of the project, and a dedicated "venv" directory.
The app script does the following: (1) sources the local venv; (2) does pip freeze > requirements.txt to capture any dependency changes; (3) launches the project. Often I will have multiple launchers in that script, with all of them commented out except for the active one. Be in a habit of always launching from that app script.
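Roughly, as a sketch (the real thing is a bash script; 'myproject' and the venv layout are placeholders):

    #!/usr/bin/env python3
    # app.py -- hypothetical launcher: always run the project from its venv.
    import os, subprocess, sys
    from pathlib import Path

    ROOT = Path(__file__).resolve().parent
    VENV = ROOT / "venv"

    # (1) "source the local venv": if we're not already running inside it,
    # re-exec this script under the venv's interpreter.
    if Path(sys.prefix).resolve() != VENV.resolve():
        py = VENV / "bin" / "python"
        os.execv(str(py), [str(py), *sys.argv])

    # (2) capture any dependency changes.
    with open(ROOT / "requirements.txt", "w") as fh:
        subprocess.run([sys.executable, "-m", "pip", "freeze"], stdout=fh, check=True)

    # (3) launch the project; keep alternative launchers commented out.
    subprocess.run([sys.executable, "-m", "myproject"], check=True)  # hypothetical
    # subprocess.run([sys.executable, "manage.py", "runserver"], check=True)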
To reiterate the approach above, whenever you sit down to write some python code, ensure that you have a dedicated venv for it, and that you are only ever launching code from that local venv.
I have spoken to developers who get upset at the extra hard disk overhead. You don't need to optimise for hard disk usage. Hard disk space is almost free.
I don't bother creating setup.py files, except for the odd occasion that I want to publish code to pip. Good luck.
That sounds like the general approach I take for "projects", even toy projects. My day jobs have never fit the virtualenv use-case, so at home I often have to look up how to use it. It's so rare that when I make an alias I even forget those.
Most new things are one-off scripts: move or rename some files, extract data from something, or pull from a resource. Something that requires libraries or is too big for a shell script. For example, the last one I see in my bin is a web scraper for appointments. It pulls a website, fills out a form, and gets the result a few times -- about 70 lines. What's annoying is sourcing some environment just to run this one tool.
Most people have a directory of scripts (a mix of shell, Perl, or Python) they use if they spend a lot of time at a commandline. It's quite a pain to source the environment just to run a quick script. That's generally the libraries I install into user. I don't care much about the version and troubleshoot things as they come up.
It's completely global, shared by all Python interpreters of all versions.
I set PYTHONPATH, but the code in that directory is solely small debugging utils of mine that I want available in every Python interpreter, and I make sure not to put anything more complex in there.
That'll prevent it leaking to most things, but not to subprocesses of your application.
For example your application might interact with command-line tools written in Python, and unless you delete PYTHONPATH from the environment variables prior to launching any subprocesses, they'll inherit it. This could lead to subtle and confusing breakage.
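A hedged sketch of that precaution (the tool name is made up):

    import os, subprocess

    # Copy the environment and drop PYTHONPATH so child Python processes
    # (command-line tools, hooks, etc.) don't inherit our import-path hacks.
    env = os.environ.copy()
    env.pop("PYTHONPATH", None)

    subprocess.run(["some-python-cli", "--help"], env=env, check=True)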
The only justified situation I can find is when you are working on two (or more) independent components at the same time.
My pain point in particular with PYTHONPATH (or playing with sys.path) is that people tend to use it with the only purpose of making import lines shorter, which brings naming collisions of all sorts when you aren't creative enough.
And agreed, there are two separate use cases: development and using the software.
But, are distros creating too much work for themselves by trying to package every itty-bitty python library (and for that matter, every npm library)? Are distros doing anything more than scanning CVE databases with the library versions, or are they _actually_ auditing the versions they choose? (Not that there's much choice, since python also has a shitty story when it comes to backwards compatibility; if you're going with 3.10, there's possibly only one version of a given library that will work.)
Java has a commonly used "fat jar" approach which rolls up all dependencies into a single file. It's excellent. In the python world, this doesn't exist, because virtualenvs aren't portable. If that can be fixed (perhaps a specific section in requirements.txt that captures anything that needs to compile C for the platform) then a distributable virtualenv would become possible. Distros would then scan the application for vulnerabilities (via requirement.txt's manifest), build the distributable-virtualenv, and ship _that_. Python library maintainers don't have to do anything different (except, of course, use the standard way to declare dependencies).
> Are distros doing anything more than scanning CVE databases with the library versions, or are they _actually_ auditing the versions they choose?
Debian Developer here. Part of packaging work, for Python libraries or anything else, is to verify the reliability of the upstream developers, audit the code, set hardening flags, add sandboxing and so on.
I have spotted and reported vulnerabilities myself, and it's not uncommon.
Please don't take it the wrong way, I have a lot of respect for distros packagers and maintainers. I donate to debian, and I report bugs. I think you are basically heroes of the FOSS world, because without your invisible (and frankly thankless) work, mine wouldn't exist.
But come on, there are 300k entries on PyPI, 200k more for Perl and 160k for Ruby. I'm not even counting the whopping 1.3M on npm because I assume this is considered taboo at this point.
You cannot package even 0.0001% of that, not to mention updates.
And unless distros make it as easy to package and distribute deb/rpm/etc. as it is to use the language's native package manager, distro packaging will never be attractive to most users because:
- they don't have access to most packages
- the provided packages are obsolete
- package authors have no way to easily provide an update to the users
- it's very hard to isolate projects with various versions of the same libs, or incompatible libs
And that's not even mentioning that:
- package authors may not have the right to use apt/dnf on their machine.
- libs may be just a bunch of files in a git repo, which pip/gem/npm support installing from
- this is not compatible with anaconda, which has a huge corporate presence
- this is not compatible with heroku/databricks/pythonanywhere, etc.
- this is hard to make it work with jupyter kernels
Now let's say all those issues are solved. A miracle happens. Sorry, 47 miracles happen.
That would force the users to create a special case for each distro, then for Mac, then for Windows. I have a limited amount of time and energy; I'm not going to learn and code for 10 different packaging systems.
It's not that we want to screw over Linux distros. It's that it's not practical, nor economically viable, to use the provided infra. The world is not suddenly going to slow down, vulnerabilities will not stop creeping up, managers will not stop asking you to use $last_weird_things. This ship has sailed. We won't stick with stuff published only 5 years ago, with delays of months for every update.
Thanks for your work. And I have to say that for internal development, when there is no need for the latest features, it's much easier to develop based on a Debian release as much as possible. A stable distribution provides an easy to track baseline not only for Python libraries, but any other tools that may be needed.
I'm always confused by these sorts of posts because they happen often so there is clearly a problem but for some reason I've never had much of an issue. I've been using and developing with Python for about 15 years. In that time I've worked on Python projects large (OpenStack) and small (gabbi) and taken over maintenance for some old standbys (wsgi-intercept, paste to name two). Dealt with the 2->3 transition. Released a whole bunch of things to PyPI and relied on far more things that I've pip installed from there. It's been fine.
So there's a few types of projects you can write in Python:
1. Server applications that run in a dedicated environment.
2. Tools you write and run just on your machine (or some virtualenv, whatever).
3. Redistributable cli or desktop applications which end users will install and use.
For the first two types, you should never have any issues with Python and its dependency situation. You pin everything, and that's it.
For the third kind tho, it's a complete pain. Different distros ship different Python versions, so you need to support all of them. You also have to consider that dependencies can't be an EXACT version; you have to support a range of them, and a variety of combinations.
And then, one dependency has a version that works on Python 3.6, and another for 3.9. But they had an API change, so which one do you use? It'll break for half your users either way. Or maybe just put some `if version <= 3.6` all over the place, like we did during the py2->py3 transition?
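A hedged sketch of what those guards end up looking like (module and function names are made up):

    import sys

    # Version-gated import: pick whichever API the interpreter/dependency
    # combination provides -- and accept that both branches must keep working.
    if sys.version_info >= (3, 7):
        from somedep import new_api as fetch   # hypothetical post-change name
    else:
        from somedep import old_api as fetch   # hypothetical pre-change name

    def load(url):
        return fetch(url)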
#3 is the exact case where you also want to just pin everything. For an end-user desktop application, you ship a properly tested bundle, instead of trying to support all the different versions; as the end-user (unlike a developer using a lib on their own machine) should not ever have to interpret compatibility issues and should get a package that's been tested to work as a whole.
If a distro ships python 3.6 and the app wants to use 3.7, then the end result must include python 3.7 as well, either by distro being capable of having both versions at the same time or the app needs to ignore the distro-python and ship its own version in the package.
Distros, please stop screwing over Python packaging. It is incredible that Debian/Ubuntu pick Python apart and put modules into different packages. They even create a fake venv command that tells you to please install python-venv.
What they should just do is offer a bunch of packages like python3.7, python3.8 that install the official python package wholesale into /usr/python or someplace and then symlink one of them to `python`.
If I would get to redesign package management (both for Linux distros and for languages), I would have one package manager that installs everything into a global package cache, and then pick the correct version of libraries at run time (for Python: at import time). Get rid of the requirement that there is only one stable (minor) version of a package in the distribution at one time. This has become unworkable. Instead, make it easy to get bleeding edge versions into the repositories. They can be installed side by side and only picked up by the things that actually use them.
The problem arises when non-Python packages depend on Python modules.
> If I would get to redesign package management (both for Linux distros and for languages), I would have one package manager that installs everything into a global package cache, and then pick the correct version of libraries at run time (for Python: at import time). Get rid of the requirement that there is only one stable (minor) version of a package in the distribution at one time. This has become unworkable. Instead, make it easy to get bleeding edge versions into the repositories. They can be installed side by side and only picked up by the things that actually use them.
You may want to check out Guix and Nix - their approach is pretty close to what you're describing.
A common solution to this, if you still want to run traditional distros, is to just run "bare infra" (whatever that means) on the host OS and everything else in containers or Nix.
> Get rid of the requirement that there is only one stable (minor) version of a package in the distribution at one time.
I think this requirement made sense when disk space was scarce.
I think this requirement makes sense if you trust that your distro is always better at choosing the 'best' version of a dependency that some software should use than the software author.
Nowadays, I think neither is generally true. Disk is plentiful, distro packages are almost always far more out of date than the software's original source, and allowing authors to ship software with specific pinned dependency versions reduces bugs caused by dependency changes and makes providing support for software (and especially reproducing end-user's issues) significantly easier.
Isolating dependencies by application, with linking to avoid outright duplication of identical versions (a la pnpm's approach for JS: https://pnpm.io/) is the way to go I think. Honestly, it feels like the way it's already gone, and it's just that the distros are fighting it tooth & nail whilst that approach takes over regardless.
Ah JS, how many days has it been since the last weekly "compromised npm package infecting everything" problem? If you are upholding that as the gold standard, you have to be the world's laziest black hat.
> Disk is plentiful,
I recently had to install a Chrome snap because it is the new IE6 and everyone is all over Chrome-exclusive APIs as if they were the new ActiveX. Over a gigabyte of dependencies for one application, and given the trend of browser-based desktop applications? I would like to have space left for my data after installing the programs I need for work.
Distros assume responsibility for fixing major bugs and security vulnerabilities in the packages they ship. Old versions often contain bugs and vulnerabilities that new versions don't. Distros have two choices here: either ship the new version and remove the old version, or backport the fix to the old version.
Continuing to ship the old version without the fix is not an option -- even if you also ship the new version -- because some programs will inevitably use the old version and then the distro will be on the hook for any resulting hacks. Backporting every fix to every version that ever shipped is also not a realistic project.
Here in the startup world we often forget that there's a whole other market where many people would gladly accept 3-year-old versions in exchange for a guarantee of security fixes for 5-10 years. Someone needs to cater to this market, and the (non-rolling) distros perform that thankless task because individual developers won't.
> Distros assume responsibility for fixing major bugs and security vulnerabilities in the packages they ship.
I think they should just ship Python programs, not libraries. They could check whether the libraries a given Python program uses are safe in the versions it uses them in.
And just not care whether each Python program has a separate copy of the libraries, or whether a particular version of a particular library is shared between Python programs by the Python environment.
Distributions might just give up responsibility for sharing Python packages between Python programs without giving up the responsibility for security of those programs.
Why not? It's cheap resource-wise, whereas dependency hell is potentially debilitating. For some reason many proponents of the package management status quo are blind to this. Having multiple versions of a dependency is only bad insofar as it's "messy". It isn't objectively bad. But having a system that breaks applications because two or more can't agree on a package version is objectively bad. It's arguing aesthetics versus getting the job done. A poor position.
Windows, for all its faults, doesn't have this problem. It will happily accommodate multiple versions of say, .Net as needed.
Disk is cheap. RAM is cheap. Man-hours are not. Distros are maintained by people, who are often volunteers. You are asking them to do extra work (i.e. porting the same patch to multiple versions of the same library) so that someone else can have it easy. But why should they? Why not the other way around?
It's not just aesthetics. If a new vulnerability is found in, say, libjpeg, then every Windows app that uses libjpeg needs to be updated separately. Tough luck if your favorite image manipulation tool takes a while to catch up. On the Linux side of the fence, the distro patches libjpeg and every app is automatically fixed. This is a huge win in terms of protecting the user. Why should we give up that benefit just because some developer wants to use his own special snowflake version of a library?
Not managing that would require less work, not more. Their position is making more work for themselves. The point was to prevent dependency hell through matching the wrong package versions, something that can occasionally happen in Windows. The problem is that the current form of management causes the far worse form of dependency hell: applications requiring conflicting versions.
I have maybe had Windows get confused about dependency versions twice ever and both times it was a driver inf for a virtual device. I will grant that fixing the problem required a fair amount of work by Windows standards but frankly not all that much by the standards of some of the more hands on distros.
I have had Linux tell me I can't install an application because it wanted a different version of Lib-whatever than what something else wants many many times.
"Why should we give up that benefit just because some developer wants to use his own special snowflake version of a library?"
Odd that you claim major distros are built by a small group of volunteers but the maintainers of much smaller and less well supported applications need to suck it up and use whatever version the Distro maintainers decide on.
Most major distros are not volunteer-run and haven't been for ages. Ubuntu, RHEL, SUSE, Pop!_OS, the list goes on. These are commercial products with full-time paid developers. In the case of Ubuntu they are providing a major chunk of the work back upstream to Debian, and in the case of RHEL they are the upstream. Most minor distros are downstream beneficiaries of the big players.
Contrary to that, it's still common for many FOSS apps and utilities to be one-man jobs. Maybe the guy doesn't have the resources to keep up with the breakneck pace of some update cycles. What if they decided to go with an LTS build intentionally? What if it's a simple package that doesn't have security issues yet gets updated for other reasons? What if the version they are using has core functionality that was EOL'd in a newer release, so they can't move on without major rework that they can't manage?
There are a million reasons why a project may want to stick with an older version. Also, allowing for the ability to update all packages does not require draconian control over which packages can be installed. This runs against the whole notion of user control. If the user wants multiple concurrent versions on their system, who are you to say they can't? FOSS means freedom.
You would patch, as whole units, the two dozen Python programs that use a vulnerable version of a library. Treat each Python program as if it were a single executable file about which you only know which versions of libraries it has inside. If any of those is known to be vulnerable, treat the whole program as a security threat.
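To make the idea concrete, here is a minimal sketch of what such a per-program audit could look like; the venv path and the advisory table are made up for illustration, this is not any distro's actual tooling:

    import importlib.metadata as md

    ADVISORIES = {                     # hypothetical advisory feed: name -> bad versions
        "urllib3": {"1.25.8"},
        "pyyaml": {"5.3"},
    }

    def audit_app_env(site_packages):
        """Return 'name==version' for vulnerable packages bundled with one program."""
        flagged = []
        for dist in md.distributions(path=[site_packages]):
            name = dist.metadata["Name"].lower()
            if dist.version in ADVISORIES.get(name, set()):
                flagged.append(f"{name}=={dist.version}")
        return flagged

    # If anything is flagged, the whole bundled program is treated as the security risk:
    if audit_app_env("/opt/someapp/venv/lib/python3.9/site-packages"):
        print("someapp ships a vulnerable dependency; rebuild and re-release the whole app")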
What makes you think so? SSDs aren't exactly stellar in the cost-per-TB department, as will be the case with each new higher-performance storage technology. Plenty of people cannot afford the prices of new Western tech either, what about them?
> SSDs aren't exactly stellar in the cost-per-TB department
First of all, 1TB for binaries and libraries may as well be infinite. Secondly, you can get a 1TB SSD for under $100, which is pretty damned inexpensive when you consider it took until 2009 to get HDDs that affordable.
It's plentiful relative to the size of compiled or source code. E.g., the biggest .so file on this system right now is a <150MB libxul.so. That's only used by one piece of software anyway, and the drop-off is pretty steep after that. A 64GB drive (tiny these days) can fit more than four hundred copies of that unusually large file.
Not if they pull in all of their dependencies: PyQt would have a complete copy of all the Qt binaries and a complete Chromium install, because of course Qt includes a browser-based HTML viewer. Python packages are gigabytes.
What distro is pulling PyQt as a dependency of Python? There is a difference between "dependency" and "every package which has the word python in the description".
PyQt only contains the bindings. You share the same Qt environment across your system (hence qmake needs to be in your path). The python package itself is not that big (~10 MB).
The comment on top of this chain is about letting every package specify its own versioned dependencies. So how would that global version work out when python needs 5.1 and some other software specifies 5.2?
That guarantee only applies to Qt itself, I would expect that the newer Qt binary was also compiled against all the other newest versions of its own dependencies. Good luck finding a backwards compatibility promise for all of them.
> Unless you are trying to save a buck it seems 1TB is the standard today.
I suppose part of the problem is that while getting a 1 TB SSD instead of a 512 GB or even 256 GB one may not be overly expensive (for a middle-class person in a wealthy country anyway), due to the way OEM laptop product lines are often stratified, you may need to either buy your 1 TB SSD separately or get an altogether higher-specced model than you perhaps otherwise would. The latter especially isn't cheap.
There might be some customization options but sometimes little customization is available. That's probably one of the ways people end up with relatively small-capacity SSDs.
It's kind of similar with RAM: a higher capacity isn't that much more expensive in theory, but in practice it may be.
This is irrelevant for custom-built desktops, but lots of people are running only laptops nowadays. I'd like to see better customizability for the builds, as well as upgradability and replaceability, but the options are often limited.
> Unless you are trying to save a buck it seems 1TB is the standard today.
Buying a 1TB external SSD would more than double the cost of a Raspberry Pi 4. That, and my ancient BeagleBoard does fine running from 32 GB.
> My primary desktop has 4.
Those are rookie numbers for a primary system. Of course, my office system next to it is a lot lower-specced, with the test system next to it even lower.
Embedded systems really shouldn't be brought into play here, but even then a 256 GB microSD card for the Pi is $25 and itself far overkill. My entire primary desktop OS, firmware, DEs, and very extensive package set fits in 15 GB. Multiplying my primary system by 10 and sticking it on a Pi taking $25 of storage with plenty to spare is still not an argument against binary sizes, especially since there are niche distro spins used for that niche space anyway.
The eMMC on a Beaglebone Black is 4GB. Sure you can boot off an SD card but that's less robust (though, I guess you can use the SD card for all your virtualenvs...).
> I think this requirement made sense when disk space was scarce.
No, the main reason is security. I need a distro to guarantee me that the libraries I use are going to stay the same for the next 3 to 5 years, while also receiving small targeted patches.
> I think this requirement makes sense if you trust that your distro is always better at choosing the 'best' version of a dependency that some software should use than the software author.
No, it just has to be better at choosing versions than picking them randomly using pip.
Furthermore, when thousands of developers use the same combination of libraries from a distro the stack receives a ton of testing.
The same cannot be said when each developer picks a random set of versions.
One case where disk space is somewhat scarce is on shared academic computing clusters (which often provide many versions of things via some module system, but your $HOME can have a quota that's just 30GB).
Homebrew does this very well with its "cellar" system. Every version of every package gets installed to its own root tree, e.g. `/usr/local/Cellar/python/3.9.7/`. The currently-active version is then symlinked into `/usr/local/opt/python` and from there into `/usr/local`.
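In Python terms, switching the active version under that kind of layout is basically just re-pointing one symlink. A tiny sketch, using the paths from the comment above (this is not Homebrew's actual implementation):

    import os

    CELLAR = "/usr/local/Cellar/python"
    OPT_LINK = "/usr/local/opt/python"

    def activate(version):
        """Point the 'currently active' link at one keg, e.g. activate('3.9.7')."""
        target = os.path.join(CELLAR, version)
        tmp = OPT_LINK + ".new"
        os.symlink(target, tmp)        # build the new link first...
        os.replace(tmp, OPT_LINK)      # ...then swap it in atomically

    activate("3.9.7")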
I'll remember that "Homebrew does this very well" the next time I have to fix a bunch of shit because it has updated the currently-active version, or removed this or that bugfix release, as part of a general upgrade. After the third time this happened, I started using pyenv - which is another mountain of brokenness, I grant you, but at least I have some degree of control on what happens when.
pyenv is pretty good for working around this problem. I've recently switched to asdf-vm which I like even more, since it handles versions for multiple languages and tools.
I have been meaning to try out ASDF-VM. Currently my shell initialization script has at least 4 of these "version managers". While I don't really mind them (and they are mostly well-behaved), it might be nice to have something a bit more centralized.
In addition to Python version issues, I was also running into JVM and Gradle version compatibility issues, which I was handling with jenv and some aliases that would swap the JAVA_HOME environment variable as needed. asdf-vm cleaned all that up in a very clean way, and I like the way you can set a .tool-versions file for a project and share it with other asdf-vm users.
hands down the best way to manage python and its packages.
Agreed, especially on Windows.
It just works.
This is pushing it. It's not hard to break conda or put yourself in a situation where the updater/dependency checker gets stuck and doesn't know what to do, especially once you start adding conda-forge packages. But it does do a better job than anything else I've tried (although poetry + pyenv on Linux is getting much better).
FWIW, we are soon going to be releasing a much faster dependency resolver. We are also thinking hard about how best to address the "growing ecosystem" problem, in a future-proof way.
IIRC GoboLinux was the first distro to do things this way. Sadly, it didn't catch on and the Linux world doubled down on labor intensive volunteer package maintenance.
Great shout out. I still ought to try using it one of these days! It seems like a good option for people who want a better file system hierarchy without the extra complexity of Nix/Guix.
Homebrew does this kinda poorly compared to Nix and Guix. They are a different breed.
For starters, there is no /usr/local symlinking process. It's also possible to have multiple versions of e.g. python installed and active. Homebrew is like a poor-man's Nix.
I hate that Homebrew uses /usr/local. At least on M1 they had to move it to /opt but I always install it to my home directory in ~/.brew. I can override the paths and not have to worry about file/directory permissions.
This is the real legacy problem: Python comes from a world where having only one version of a package seemed like the right way to do things.
I do not have a good answer, but other ecosystems evolved much more sanely in the realm of packaging. While not ideal, Go has done a fairly good job, and the "module" operations are instant, which they should be.
OK, but Go compiles statically. While you can do something similar with PyInstaller, I don't think that's really comparable, since we're talking about deployment there.
Static binaries are a different story. Go has dependencies as any other modern language and they had a bad story in the past and have a better story today.
Sketch for Python: Create a ~/.cache/python/packages directory. Manage all dependencies there. Make the Python interpreter "package aware", so that required dependencies are read from a file in the current project (e.g. "py.mod") and the "system path" is adjusted accordingly and transparently. Or something along those lines.
No extra tool, a single location, an easy to explain workflow (add a py.mod file, add deps there with versions, etc).
I'm just thinking out loud, but it does not need to be hard.
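A rough illustration of that sketch, assuming a made-up py.mod format of one "name==version" per line and a cache laid out as ~/.cache/python/packages/<name>/<version>/ (none of this exists today):

    import os
    import sys
    from pathlib import Path

    CACHE = Path.home() / ".cache" / "python" / "packages"

    def activate_py_mod(project_dir="."):
        """Read the project's py.mod and put the pinned versions on sys.path."""
        mod_file = Path(project_dir) / "py.mod"
        if not mod_file.exists():
            return
        for line in mod_file.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            name, _, version = line.partition("==")      # e.g. "requests==2.26.0"
            pkg_dir = CACHE / name / version
            if pkg_dir.is_dir():
                sys.path.insert(0, str(pkg_dir))

    activate_py_mod(os.getcwd())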
Despite the downvotes, the argument stands: Linux distributions are having a hard time handling the number of tiny libraries and the conflicts in versioning, and many maintainers have voiced their concerns in recent years.
The point echoed in this discussion multiple times is that distributions should not handle the tiny Python libraries and attempt to solve the dependency version issues, but should instead treat an application with all its dependencies included as a single package. If a dependency needs to be bumped a version, e.g. for security purposes, then the app obviously wasn't tested with the new version (which didn't exist at the time) and needs to be retested, repackaged and re-released for the update. This would cut down on the number of packages to be maintained, as the vast majority of Python libraries would be excluded from the direct packaging process.
> If a dependency needs to be bumped a version for e.g. security purposes, then the app obviously wasn't tested with the new version (which didn't exist at the time) and needs to be retested
The burden of updating multiple copies of the same library across many packages grows exponentially and is simply untenable for distributions.
If you can find an army of volunteers to do that, distributions would love their contributions.
This hasn't happened in the last 20 years. I'd love to be proven wrong.
Since simply updating the dependency can easily break the resulting package, the need to re-test is not something that can be avoided by making some other packaging choice, e.g. the current one. It's not adding a new burden; it's acknowledging a burden that already exists (indeed, IMHO, much of what the original article complains about). If there are no resources to carry that burden, then the only option seems to be to wait for an updated release from upstream, whenever that arrives.
I wrote "The burden of updating". Testing still needs to be done but there's a lot of automation to minimize the workload.
> If there are no resources to carry that burden, then the only option seems to be to wait for an updated release from the upstream, whenever that arrives.
No, most upstreams do not backport security fixes. And switching to a newer release is not an option if you want to provide stability to users.
That sounds similar to what I do in macOS. I hate installing homebrew to /usr/local so I started installing it to ~/.brew and I hate using the python from homebrew so I always use pyenv.
This behavior is why things like Snaps and Flatpaks have become so popular. Package managers operate under a draconian and outdated mindset that gets in the way more than it helps at this stage.
You can both allow different versions of the same packages to coexist while also managing updates and installation/removal of software. It doesn't have to be this way. Software should be able to ship with its dependencies included and work and not rely on the whims of the OS getting it right.
> Get rid of the requirement that there is only one stable (minor) version of a package in the distribution at one time.
I’m not exactly sure how it works, but I think I’ve heard that newer releases of Enterprise Linux (EL8+) support multiple streams of the same package (module streams) or something similar.
Interesting idea: if we could hook in before the `sys.modules` cache, or keep one such cache for each module, then we should be able to produce this.
However, I thought the point was helping distro package management, which, to my knowledge, is not really built to support multiple installed versions of a package at a time: `dnf upgrade`, for example, will upgrade the single instance of each package to its newer release.
Actually you can already override __import__ and implement this, but then you still need an installer, and distro support for multiple instances of the same package.
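For what it's worth, a toy version of that hook is not much code. The sketch below routes imports of chosen top-level packages to a version-specific directory via a meta-path finder; the PINNED table and directory layout are invented, and a real solution would still need the installer side:

    import importlib.abc
    import importlib.machinery
    import sys

    PINNED = {
        # top-level package -> directory holding exactly the wanted version
        "requests": "/opt/pymods/requests/2.26.0",
    }

    class VersionPinFinder(importlib.abc.MetaPathFinder):
        def find_spec(self, fullname, path=None, target=None):
            pin_dir = PINNED.get(fullname)     # only top-level packages are handled;
            if pin_dir is None:                # submodules then follow the package's
                return None                    # own __path__ automatically
            return importlib.machinery.PathFinder.find_spec(fullname, [pin_dir])

    sys.meta_path.insert(0, VersionPinFinder())
    import requests  # resolved from the pinned directory when it exists there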
Actually, explaining how to get Python working on Windows is far, far easier than on either linux (modulo various distros) or the Mac. That's because there is one obvious distribution of Python to use, the official one, and new versions of it are always consistent and play well together.
Yes, you can use Anaconda if you want, and people who do that are probably data scientists or something and know what they want to do and why. It's well documented and has its own robust ecosystem.
I say this as someone who's been on Macs at home since 2007 and works professionally on Linux, but I started with Python on Windows back in 2002.
Unfortunately, Windows has plenty of problems too. First, the system PATH will trip you up if you have more than one Python installed, so the official installer does not add to it by default; hence the python command doesn't work after an install. Instead, you get the py launcher, but it's not provided if you installed Python from the app store or with Anaconda.
". . . and then he installed cygwin, and decided to manage and run python through the bash environment. . . " (fun fact: the git client's bash shell is actually cygwin. Also; MobaXTerm has cygwin bundled-in as well).
Sorry, but to be frank, I think you're a bit ignorant here. Let me explain, starting from the bottom:
> I would have one package manager that installs everything into a global package cache, […]
There is exactly one package manager. If you're on Debian or Ubuntu, it's dpkg. If you're on RedHat, it's rpm. If you're on Arch, it's pacman. Yes, some of the BSDs have two (base packages + ports tree), but they're the odd ones out.
pip, cargo, go*, etc. are not the same thing. I know they're called that but they don't perform the same function: none of them can create a working system installation. Let's call them module managers to have a distinct label.
> and then pick the correct version of libraries at run time (for Python: at import time).
That's easy for Python, and incredibly hard for a whole lot of other things. A module manager can do that. A package manager needs to work for a variety of code and ecosystems. Could it try to do it where possible? Maybe. But then the behavior is not uniform and made harder for users to understand. Could it still be worth it? Sure. But not obviously so.
I would also say that this is just giving up on trying to keep a reasonable ecosystem. It's not impossible to reduce the dependency hell that some things have devolved into. It just needs interest in doing so, and discipline while effecting it. I'd really prefer not giving up on this.
> Get rid of the requirement that there is only one stable (minor) version of a package in the distribution at one time. This has become unworkable.
This is to some degree why distros are breaking apart Python. Some bits are easy to install in parallel, some aren't. There can only be one "python". Worse, there can only be one "libpython3.9.so.1.0".
> Distros, please stop screwing over Python packaging. It is incredible that Debian/Ubuntu pick Python apart and put modules into different packages. They even create a fake venv command that tells you to please install python-venv.
They're trying to achieve the very goals you're describing. Trying to give you a working python without having to download and "install" some weird thing somewhere else. And at the same time trying to keep the module managers working when they're replacing some module but not all of them.
On a subjective level, it's obvious you have a strong distaste for this ("they even create a fake") — but could you please make objective arguments how and why this breaks things? If you're getting an incomplete Python installation, that seems like a packaging bug the distro needs to fix. Is that it? Or are there other issues?
> If I would get to redesign package management (both for Linux distros and for languages),
And, I'm sorry to say this, but your post does not convey to me the existence of any essential C codebase packaging knowledge on your end. I don't know about other ecosystems, but I have done packaging work on C codebases (with Python bindings no less), and you don't seem to be aware of very basic issues like header/library mismatches and runtime portability.
If you are interested in this topic, please familiarize yourself with the world of existing package managers, the problems they run into, and how they solve them. There's a lot to learn there, and they're quite distinct from each other on some fronts too. Some problems are still unsolved even.
I am not one to install python packages using my distro's package manager, but I totally agree with the sentiment that we need a more standard build/dependency management system in python. I like poetry, and I think most people are heading that way, but it doesn't seem to play super nice with pyenv (which is a critical tool) a lot of the time, and I think that a first party endorsement of the "one true build system" a la golang or rust would be a huge step.
I landed on poetry as well. The issue with Python dependency management, in my book, is that it is incredibly (and needlessly) hard to learn how to do it properly. It took me years to figure out how things should work, and there are still issues now and then that cost me a few hours to figure out and restore things.
Meanwhile when I use Rust all these things are taken care of in cargo. It is part of the language. There is one right way to do things and the way is supported by comfortable tooling, that works so well that you literally don't even consider thinking about anything else.
The way Python does dependencies is totally unpythonic. The fact that it is 2021 and this isn't fixed, or at least the number one priority of things that need fixing, casts a dark shadow on the whole language – a language that I like to use.
Poetry is good. But it isn't as good as cargo, because it also has to deal with all the legacy cruft. To run code developed with poetry on a non-poetry system you have to figure out all the ways of dealing with envs, paths and such.
Issues like these get me a little fired up, because the collective brainpower wasted on something that should have been elegantly solved in one place is gigantic.
To run code developed with Poetry, you shouldn’t even know it’s developed with Poetry. It should be released as a source distribution or a wheel on PyPI or as a conda package if it contains nontrivial binary extensions. These distributions should be then either packaged by the OS or they should be installed with all their dependencies into a separate virtual environment in /opt.
Sure, if I release the code I tend to do just that. However, a lot of the code I have been writing was to be used on servers where installing Poetry was not an option. That means there was no straightforward, well-documented or otherwise easy way to copy the project onto the server and "just have it work". And this was not due to me not taking care of my dev environment.
Now I know how to do this, so this is not a problem. My complaint was mostly, that this was a waste of time.
If installing Poetry wasn’t an option, deploying a frozen set of dependencies should be still relatively easy with plain ”pip freeze” and “pip install -r”.
I’m not a Python dev. I do need to occasionally run things written in Python. I made the mistake of trying to get pip + Conda + pyenv (or whatever) to install a fairly simple tool. I have no idea how the dev got their setup working, but it was totally and utterly unreproducible, even after they sat down at my computer for several hours. In that amount of time, I could have probably rewritten it in PHP (that’s actually exactly what I ended up doing while they attempted to get a Docker container running).
Needless to say, I will only use the distro package manager these days. I know the versions are (probably) compatible, maintainers will usually backport security patches, etc. You get none of that using whatever flavor of the week python package manager.
I think this actually gets to the crux of the matter. The existing python dependency management tools (especially the new shiny ones like poetry) are very much designed by python devs, for python devs. What you're describing is a totally different use case, which is running released software in a sensible way.
The distro package managers are probably the best place for that, but bridging the gap between them and the python ecosystem is an obvious challenge.
It's not really the fault of Python that somebody fucked their environment beyond belief. I've always used pip and almost never had a problem. A "fairly simple" application should not require the tools you mentioned.
That, my friend, is called a filter bubble. And your comment is not productive btw. I for one never "fucked my environment", I only ever used distro packages and I still couldn't make a simple script even start without it spitting cryptic error messages about modules or paths.
I'm not a Python dev; I just needed to run a tool that didn't have any other alternatives.
Meanwhile, I can download scripts written in a range of other languages and just fire 1-2 commands and the thing will work.
Same BTW. And I started having a Docker image for each Python script that I need to run (I'm not a Python dev). Took me a while but I've mostly tamed the beast. And at least I can run the occasional script that has no alternatives.
I had to learn this by FUBAR'ing my system a long time ago, but my setup process for working with a python package from PyPI (i.e. not installable by or updated enough in OS's package manager) nowadays is:
- sudo apt install python3-pip
- pip3 install --user --upgrade pipenv
(In workdir):
- pipenv install --three package
- pipenv run package --option
Works like a charm and doesn't mess with my system.
This is especially frustrating to read because one of the main selling points of Conda is reproducibility. In data science teams, I've found it indispensable for making sure people can all run each others' project code.
So for anyone reading this in the future: don't try to use Pyenv to install Conda. Pyenv tries to set up shims for every binary in the Conda env, which will likely break your PATH.
Pyenv supports installing Conda because Anaconda used to be "just" a Python distribution.
They can otherwise coexist without trouble on the same system.
My understanding of the problem is that Pyenv attempts to detect the contents of "/bin" relative to the top level of every Python installation that it manages.
It does this so that it can set up its shim to handle any executable that gets installed in any Pyenv-managed environment.
This is how Pyenv creates the "foobar is not available in your current environment, but is available in x.y.z" message. It's also a much more reliable solution than trying to explicitly whitelist every possible script that might get installed.
The problem is that this was only designed to work for Python executables and scripts installed by Pip. Conda environments can contain a lot more than that; it's not hard to end up with an entire C compiler toolchain in there (possibly even both GCC and LLVM) or even Coreutils.
If Pyenv detects `bin/gcc` in a Conda env, it will set up a system-wide shim for GCC, which no longer passes the `gcc` command along to the OS, but intercepts it, only to inform you that no such command exists in the current env!
So it's not that Pyenv hoses Conda envs. It's that Pyenv can hose PATH if you have it manage a Conda installation, and if that Conda installation ends up with non-Python stuff in `/bin`.
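To illustrate the mechanism (this is just the shape of the behaviour described above, not pyenv's actual code): every executable found under a managed installation's bin/ gets a shim name, regardless of what it is.

    from pathlib import Path

    PYENV_ROOT = Path.home() / ".pyenv"

    def shim_names(root=PYENV_ROOT):
        """Collect one shim name per executable found in any managed env's bin/."""
        names = set()
        for bin_dir in root.glob("versions/*/bin"):
            for exe in bin_dir.iterdir():
                if exe.is_file():
                    names.add(exe.name)   # a Conda env here can add gcc, ld, coreutils...
        return names

    # Every returned name gets a shim placed on PATH, which is how a toolchain inside
    # versions/miniconda3-*/bin ends up shadowing the system's own gcc.
    print(sorted(shim_names()))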
Obviously I don't know what exactly was broken when you tried to set up that application. But this particular adverse interaction bit me at work a few years ago, and ever since then I have insisted that Pyenv should never manage a Conda installation.
I think that's a reasonable policy anyway, in light of the facts that:
1) Conda isn't really a "Python distribution" anymore.
2) The Pyenv installer just runs the opaque Conda installer script and there's basically no way to control the version that gets installed.
3) They are different tools that serve different purposes and it doesn't make sense to have one manage the other anyway.
4) You probably shouldn't use the Python that's installed in the base Conda environment anyway. You need that to run Conda itself, and you want to keep the list of requirements small to make sure that updates can progress cleanly. It's basically the same as any Linux package manager like APT. Except of course, those tools don't generally support "environments" other than chroot.
Both Conda and pyenv are for managing (their own) Python installations to run different Python versions in parallel, it doesn't really compute to try and use both at the same time.
Yes. I tried to use Gitpod (a cloud development env platform) which ships with pyenv. Poetry would very stubbornly only use the system python and completely ignore whatever the pyenv shim was pointing to.
Golang is a great model to follow. I think the fundamental thing that python (and ruby, when I still used it) has taught me is to run a mile if a language is without a robust dependency management system. The pain is just not worth it, even for a nice language with an otherwise robust ecosystem.
With golang, it's not just the dependency management side, gofmt (and various bits of other tooling) is also an incredible blessing. Yes, prescriptivism can sometimes feel weird to us computer nerds, but sometimes it's just much better to have a canonical way of doing common tasks. It means onboarding people is much easier, and infrastructure/code/whatever is more reusable.
Go is terrible for distributions to package due to the poor library versioning. Also, static linking makes security updates really expensive to build at scale.
> Every one of these package managers is designed for a reckless world in which programmers chuck packages wholesale into ~/.pip, set up virtualenvs and pin their dependencies to 10 versions and 6 vulnerabilities ago, and ship their computers directly into production in Docker containers which aim to do the minimum amount necessary to make their user’s private data as insecure as possible.
Is this any different from any other programming language ecosystem? Is Python really doing worse than Node, Ruby, Perl, Lua, Go, or Haskell in this regard?
Python files go in `/usr/lib/pythonx.y`, Python finds said files, programs run.
Yes, Python build and packaging in general is messy. But I am curious why, specifically from a distro perspective, it's any worse than anything else.
- python is more active than most alternatives, you have new packages created every day.
- python is massively used outside of the web, unlike JS, Ruby or PHP, which are 99% web. You get Python in GIS software, data analytics, automation, pen testing, sysadmin, biology, etc. It's a huge graph.
- python is used by the distros themselves to code features of the OS. E.g., if you remove Python, there is no yum.
- python has a rich compiled extensions ecosystem, produced from c, c++, fortran, and assembly. It's very complicated to ship them.
- it's much more common to have several Python installed than for other dynamic languages. So isolation matters even more.
So the difference is the sheer size of the problem.
> Aside from the variety of languages in the extension ecosystem
That's a big one to omit. But alright, I'll play.
It has far fewer packages, it's almost never used on Windows, it stopped being popular 10 years ago, it doesn't have Anaconda because it's not a "corporate tech", you rarely install several versions of Perl on the same system, you don't start enough projects with Perl to justify one isolated environment per project, nobody moved to Perl 6 so the whole CPAN transition never had to happen, Perl is not used to script DBs/GIS systems/3D engines/IDEs, and Perl is not used by millions of non-coders (geographers, mathematicians, physicists, bankers, etc.) who have no idea how their machine works.
But to be honest, the nail in the coffin is that distros decided not to split perl and cpan into separate packages. In fact, in some distros, perl and cpan are already installed and ready to be used.
So for linux:
- python: you must decide among several Pythons, then install the right packages to use pip and venv, and isolate your install with venv. The procedure is different for Windows. Also, 2.7 is a thing.
- perl: you have one perl that hasn't changed for years, no new packages, it's already installed, cpan is installed as well, and it's not gonna break your system if you use it. You don't care about Windows. Perl 5 forever.
The distro devs live in their own world; they don't have to work on a FastAPI service connected to an SAP DB that must run on Windows with no admin rights for dev but must be deployed on CentOS 7.
The notable language ecosystem which is different is "C on Unix", which grew up with "the system provides the dependencies, it's often easier to avoid a dependency if you can, the program usually needs to be able to cope with whatever version of the dependency the system has, rather than pinning to a specific one". That's the primary ecosystem that most Linux distro package managers developed with as their platonic-ideal-shape-of-an-application, I think.
Once upon a time, I did "apt upgrade python-pip3" or something like this (was it just "python-pip"? Or maybe it was "apt upgrade"? It was a couple years ago). Anyway, what I do remember is that it quite literally killed apt: invoking it with any command would lead to a dump of a stack trace with ImportError coming from pip. Apparently, apt uses system-wide pip internally so if you touch it, everything breaks? Don't know, don't much care: since it was just a VM so I simply rolled to the previous snapshot and forgot about the details.
Edit: Ah, apparently the steps to reproduce are: do "apt install python-pip3"; do "apt install python3.8"; when pip3 complains that it's outdated, update it with the command it itself suggests.
When deploying the developed application on some server, all the exact dependencies get installed there. The main reason for the existence of the server and its configuration is to run the application, so the server adapts to the needs of the application and gets the dependency versions preferred by the app, instead of the application trying to adapt to the server and trying to make do with the libraries already existing there.
I'd say it's a heavy-handed approach to mitigate more fundamental issues with how Python packages are maintained: if everybody wants to pin different versions, then we're going to have to install different versions of everything, which is what npm does, and I consider that heavier.
Again, it's all a question of point of view: what we see as a package manager problem, and what causes us to keep reinventing package managers, might actually be a problem with how we maintain our packages, my point of view being the latter. But I'm digressing.
When it comes to installing on "another machine", you don't know what Python they have, you don't know what libc they have, and so on, that is exactly what containers attempt to mitigate, so that seems exactly like the tool to use for this problem.
I think it's a fundamental problem with managing dependencies. On one hand, any given application usually knows which versions of its dependencies it actually supports, so it makes sense for the application to simply bundle those in: in the most extreme cases it's static linking/binary embedding, or (usually) putting the dependencies in subdirectories of the application's directory ― in cases where the application has a "directory where it lives in" instead of being thinly spread all over the system (e.g. over /bin, /etc, /usr/bin, /usr/lib, etc.).
On the other hand, the users/sysadmins sometimes want to force the application to use a different version of a dependency, so the application may provide for that by somehow referencing the dependency from the ambient environment: usually it's done by either looking for the dependency in a well-known/hard-coded path, or getting that path from a well-known/hard-coded env var, or from a config file (which you also have to get from somewhere), or from some other injection/locator mechanism, thousands of those.
And all this stuff is bloody fractal: we have system-level packaging, then Python's own packaging on top of that, and then some particular Python application may decide to have its plugin-distribution system of sort (I've seen that), and that too goes on top of all of that, not to mention all the stuff happening in parallel (Qt sublibraries, GNOME modules, npm ecosystem)... well, you get the picture. It kinda reminds me of "Turing tarpit" and I doubt collapsing all this into one single layer of system-level packaging, and nothing on top, is really practical or even possible.
My usual approach is to let the distro install the packages it wants and then, for each thing I'm working on (I sometimes write Python code for a living), have a separate environment with different module versions. This makes sense when my artifacts are deployed via Docker images that are generated with `pip install`, but not as much if you plan to install them on your machine. The only thing I build like this for public consumption is pip-chill, which is not intended to be used with the bare Python environment of the distro (I probably should add something that makes it refuse to install that way).
Even when the things are not Python-based, I often use a virtual environment, managed with virtualenvwrapper (installed on the distro level). One example is a Terraform config that relies on some Python tools to manage deployments - the Python tools are local to that virtual environment and not usable anywhere else.
If I need to develop something that'll need to run with the distro directly (something that could be distributed as distro packages), I'd use a virtual environment and tailor the package versions in requirements to the ones available in the distro. This way, multiple distros can be addressed with multiple environments pointing to the same source directory and multiple requirements files for tests.
> What is it about Linux distros that makes our use-case unimportant? Have we offered no value to Python over the past 30 years?
Indeed you haven't. Worse, you've actively damaged Python's efforts to improve. I mostly work on the JVM these days, and I think one of the main reasons dependency management there is so gloriously simple and effective is that the Debian packagers weren't around to fuck it up.
There are now, but for a long time they had to live in contrib because Java wasn't open-source, and so apt didn't deeply integrate and customize their package management the way it did with perl or python.
This is a very weird post to read, as someone who does (occasional) distro Python work - I feel like Python absolutely is listening to us! It's just that it's hard work, and distro packaging is mostly done by volunteers, and there are a whole lot of things to work through.
Here's some work I and others did earlier this year, which I thought was a great example of folks from the core Python packaging world and folks from distros working together: https://www.python.org/dev/peps/pep-0668/ See the massive table of use cases for all the things we had to think about.
You might note that one of the things it needs is additional participation on the Discourse thread from the authors (like myself) and from other distros. Again. It's mostly work done by volunteers, and there's a lot to work through. There's no magic to it.
I can tell you that the following (from TFA) will absolutely make things worse for distros, though:
> I call on the PSF to sit down for some serious, sober engineering work to fix this problem. Draw up a list of the use-cases you need to support, pick the most promising initiative, and put in the hours to make it work properly, today and tomorrow. Design something you can stick with and make stable for the next 30 years. If you have to break some hearts, fine. [...] These PEPs are designed to tolerate the proliferation of build systems, which is exactly what needs to stop. Python ought to stop trying to avoid hurting anyone’s feelings and pick one.
If you want the PSF to fund some engineering work on its own that it can finish much faster than any volunteer packager can even read the proposal, break some hearts, stop proliferating build systems, and hurt people's feelings, they will absolutely do that and say "Distro packaging is not supported, we only support virtualenvs. Users should make virtualenvs. Distro software should ship virtualenvs. Installing a Python package systemwide is meaningless." That's clearly the best-supported option right now, and it's a surprisingly technically defensible answer, but it's not going to make you happy.
"use virtualenvs and install your dependencies from an unfiltered, unsupervised, untrusted source" is certainly the solution most aligned with the rest of the industry but I struggle to see any technical benefit from it. Other than perhaps "move faster and break more things".
Virtualenvs don't require using unfiltered/unsupervised/untrusted sources. They're a place to install things into, not a place to get things from.
The specific model that distros could adopt, if we went this route, is that each Python package builds into a .whl, they're build-dependencies of applications, and applications install .whls into a virtualenv at build time. You'd still restrict packages to come from the distro with the usual policies (built from source code, compliant with licensing, not contacting the network at build time, etc. etc.).
So, for instance, installing "python-somelib" would get you a /usr/share/python-wheels/somelib-1.0.whl, built from source. The build process of "someapp" would create a /usr/share/someapp/venv, pip install that wheel into that virtualenv, and then symlink /usr/bin/someapp to /usr/share/someapp/venv/bin/someapp.
The PEP goes into more details about why virtualenvs are recommended, and I can give you a whole host of subtle reasons, but that's not really the point - the point is that it's defensible, not that it's perfect, and that it's very easy to implement with what works today. So if you ask the PSF to come up with something that magically solves the problems and makes people sad if necessary, this is literally what they're going to come up with. If you don't like that outcome (and there are good reasons not to like it!), then you shouldn't ask for them to arbitrarily pick an outcome some people won't like.
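As a rough sketch of the wheel-into-venv build step described above (paths taken from the example; the wheel names and the someapp entry point are hypothetical, and a real distro helper would do more), using the stdlib venv module plus pip:

    import os
    import subprocess
    import venv

    WHEELS = [
        "/usr/share/python-wheels/somelib-1.0.whl",     # distro-built dependency wheel
        "dist/someapp-1.0-py3-none-any.whl",            # hypothetical wheel of the app itself
    ]
    APP_VENV = "/usr/share/someapp/venv"

    def build_someapp():
        venv.EnvBuilder(with_pip=True).create(APP_VENV)
        pip = os.path.join(APP_VENV, "bin", "pip")
        # --no-index keeps the build offline: only the listed wheels can be installed.
        subprocess.run([pip, "install", "--no-index", *WHEELS], check=True)
        os.symlink(os.path.join(APP_VENV, "bin", "someapp"), "/usr/bin/someapp")

    build_someapp()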
The problem here is that distros only supply one version of the package which means you gain nothing by having that same version installed in multiple virtualenvs.
So I guess the problem actually lies in the python library ecosystem that's becoming a npm like dependency hell.
The relevant question here is probably whether the library ecosystem is like that because the packaging tooling sucks or is it the other way around?
- Almost all distros don't just supply one version of a package. They try to avoid it, but I can't recall one with a hard rule against it. For instance, Debian packages multiple versions of autoconf https://packages.debian.org/search?keywords=autoconf2 , the Linux kernel, etc.
- One reason that distros try not to install multiple versions of a package is that it's hard to specify which one you want. If you have, say, requests 1.0 and 2.0 installed, which one does "import requests" get you? The virtualenv approach, where "import requests" simply does not work in an un-virtualenv'd Python, avoids this problem entirely. Each application can independently build-depend on python-requests-1 or python-requests-2, and each user can create their own virtualenv and pip install /usr/share/wheels/requests-1.whl or requests-2.whl as they prefer.
- Even if you do not have different versions of the same library, it may well be the case that you have two different libraries with the same importable name. As a great example, see CJ Wright's talk from PackagingCon last week "Will the Real Slugify Please Stand Up" https://pretalx.com/packagingcon-2021/talk/P3983F/ (I expect they'll post videos online soon). tl;dr there are three packages you can get via "import slugify", and they expose different APIs.
- The possibility you haven't accounted for is "The library ecosystem is like that because things are fast-moving, because people have actual problems they want to solve, and upgrading dependencies and sorting out conflicts is work." It would be great for that work to be done, but we go back to the problem I mentioned at the top - limited volunteer time. In the absence of time to engineer things perfectly, your options are to ship something that's engineered imperfectly or decide not to ship it. We went through the dark ages of "We'll ship the next Debian release when all the bugs are solved" over a decade ago, and it turns out that this doesn't actually help users in any way.
> The possibility you haven't accounted for is "The library ecosystem is like that because things are fast-moving, because people have actual problems they want to solve, and upgrading dependencies and sorting out conflicts is work."
Sorry, I was overly aggressive in my previous comments. All I want is the Python ecosystem to acknowledge some people prefer stable over fast moving, recognize the importance of that and work towards a solution that isn't horrible for their use case.
We all depend on someone somewhere deep in our stack caring for stability, even when we don't realize it.
They have acknowledged it, noted the use case, and decided that they are unable to support it. Python has this in common with most other programming languages.
Providing long-term backwards compatibility with the ability to share libraries and update them system-wide is difficult and imposes very high costs on an ecosystem. Only a small number of tech stacks have ever done this (for example, C and Perl) - and not only are they so old that it wasn’t a conscious choice, they have stagnated as a result.
I have a lot of (possibly controversial) issues about this post, for various reasons:
- Debian/Ubuntu Python packaging is actually pretty comprehensive, and Ubuntu LTS ships with reasonably updated (although admittedly not latest revision) third-party packages that are usable out of the box, and that I could just list in Ansible to have usable environments spun up.
- Python packaging is hardly a mess. I have been using pip for ages without any issues other than forcing wheel downloads for unusual distros (like Alpine, where musl makes it chancy to use some low-level stuff). But if you're on a mainstream distro with modern pip, wheels just work.
- If you're not on Linux (or not on the mainstream), pyenv also just works. I have been using it across several years of macOS releases without any significant issues, other than knowing to pass it the required build flags to build out the Cocoa bindings now and then (which is easy to do with the brew pyenv).
And, finally, I'm constantly shocked at the number of people who just don't get virtualenvs, or who don't know how to switch python interpreters by using environment variables.
I've never looked back, and they were instrumental in bridging the gap between 2.7 and 3.5 while I converted some code across (I never really found that the switch to Python 3 was as dramatic as many people made it, perhaps because the code I handled worked after a single pass with 2to3 and minor tweaks).
This is my experience with using any python thing on OSX:
- brew install: fails probably
- pip or pip3 or whatever: fails probably, if it succeeds, breaks something else
- look for program-specific install programs/instructions (like the aws cli for example): maybe works, probably bombs
Python's reputation (and sales pitch) among developers is a language for people that don't want to program or learn software development. It's basically the new BASIC.
Their packaging and installers only reinforce this reputation.
Not that it is sweet roses in javaland or many other language ecosystems. Dependency graphs are complicated, because they are graphs when people want them to be simple trees.
As for virtualenvs, why does a user of software that happens to be python need to know that?
I have the opposite experience. But you are conflating brew with pip, and they are largely independent even if you use pyenv from brew (which you should).
Again, I don’t see a lot of logic or understanding of the toolchain in that comment.
Distro maintainers say that language ecosystem packaging makes it hard for them.
Language ecosystem packaging maintainers say that distro package managers make it hard for them.
The year is 2021 and there is no work towards synthesis. It will be endless, fruitless yelling from each side.
Maybe 2022 will bring change. I'm not holding my breath.
To anyone who finds themselves on a single "side" in this argument: if you have ever said "why do you need to do that" as an accusation instead of with curiosity, you exemplify the problem.
I've had success restricting myself to Python packages provided by the OS (Debian) in production. The only down-side is there is only one Python (well, 3.* or 2.7) against which to test, so I drop back to a pip install of the same packages for CI, and that's where 90% of my Python pain comes from, every maintainer of every Python infrastructure package feels entitled to shout "DEPRECATION, do THIS, do THAT, use THIS" at me. It's ill-mannered and unpleasant.
The tools in use are completely irrelevant and only add noise to the discussions around packaging. So what actually is the problem here? Are projects delivering broken outputs (ie. bad packages)? Are they not delivering outputs at all? Those are real problems but pointing at the number of tools that exist is not helping.
Python itself confuses sources and outputs in that not every package has consumable source tarballs— for some packages the only way to get static metadata on dependencies is by starting a build as if to get a binary output.
Native dependencies are not really handled except in an ad-hoc way, either.
If you want to import Python packages into a real packaging system, you are confronted with the tools whether you want to be or not.
> pin their dependencies to 10 versions and 6 vulnerabilities ago
That is the real problem IMHO. Most Python users like to pin dependencies so that "their program doesn't break", and that's also the reason why so much effort is put into what they call "correct" dependency resolution, resulting in the creation of new Python package managers that all do "more correct" dependency resolution, make programs "break less", and at the same time require less effort from maintainers to actually maintain their code by doing dependency upgrades. And then one day you want to upgrade a dependency and you realize you're 10 releases behind on 10 dependencies, and what was supposed to be a quick maintenance task is now 100 maintenance tasks.
If you don't pin, your program will break one day or another, a user might open an issue with the traceback, or, your program will break directly in CI where you will see the traceback. Upgrade it, or contribute to the dependency, but just go ahead and fix it, instead of being defensive and trying to have dependency resolution that "doesn't break". At the same time you'll be adopting the actual practice of "Continuous Integration", of your dependencies, which has a better cost/benefit ratio.
I always avoid pinning dependencies and try to make pip just install the latest version of everything. I am willing to contribute upgrade fixes to any Python package I use, but some dependencies do pin, which breaks my own aggressive continuous integration practice; heck, I'd even want an option for pip to ignore version resolution entirely so that I can make all my contributions to upgrade everything I use. I even remember when I had a CI test matrix with all combinations of versions of everything; I don't do that anymore, I just support the latest of everything, since we can always have the latest Python with containers anyway, so that's not even a blocker. If you're not a "techbro using containers", it'd be fine too, because you should then be able to make your distro packages at any point in time and expect all of them to work together, minus the delta of the handful of upgrades that are pending release here and there.
This is a very quick route to a maintenance nightmare imo.
If you have totally unpinned dependencies, and you come back to a project after a year untouched, or 5 years, and it no longer works - which dependency update broke it?
I don't agree that using an outdated package is necessarily a problem at all. Some versions are done! You don't need the latest version of every possible package. You don't necessarily need to update _ever_ (which is why this differs from CI). These updates are often entirely unnecessary churn.
There absolutely are vulnerabilities in some old versions, and those updates are necessary (but tooling & notifications to easily handle this have dramatically improved in recent years, especially on GitHub). There will also be vulnerabilities in new packages though, which may be unknown, and will often not exist in older much simpler versions.
Using a well-tested version of a dependency that does exactly what you need is not less secure than chasing the latest version at all times without a specific reason.
I've found that manually updating packages on the rare occasions where relevant vulnerabilities arise, and using existing working versions without changes the rest of the time, has been perfectly effective over many years now, and avoiding the shifting sands of external dependencies wherever possible means that a project that worked 5 years ago still works _exactly_ the same today.
I'd rather see more software go in this direction, valuing reproducibility & known correctness (i.e. with isolated pinned dependencies, in some form) over 'always be latest' dependency updates and the complex & hard-to-reproduce bugs that those shifting dependency interactions can create.
> If you have totally unpinned dependencies, and you come back to a project after a year untouched, or 5 years, and it no longer works - which dependency update broke it?
In this case, it doesn't matter to me which dependency update broke it, what matters to me is to have the tests passing again with all dependencies.
> I don't agree that using an outdated package is necessarily a problem at all. Some versions are done!
If a version is done then why is there a new release? Using old versions is tech debt that will one day blow up and cost much more to correct than if it had been corrected over the time, not to mention the security risk.
> You don't necessarily need to update _ever_
Another dependency might decide to use that dependency in a newer version, in which case aren't we all better off using the latest versions of everything? The cost is some effort, the benefit is more features, security, performance, less bugs, basically a better program.
> There will also be vulnerabilities in new packages though, which may be unknown, and will often not exist in older much simpler versions.
Then why upgrade at all when you have a dependency with a security issue? After all, in your upgrade you might be adding even more unknown security issues that might be even more dangerous.
> Using a well-tested version of a dependency that does exactly what you need is not less secure than chasing the latest version at all times without a specific reason.
If I can test well that a newer version of a dependency works for me, why not upgrade it? There might be performance, security or other bugfixes, and I'm allowing other maintainers of other dependencies to also use that newer version.
> I rather see more software go in this direction, valuing reproducibility & known correctness (i.e. with isolated pinned dependencies, in some form) over 'always be latest' dependency updates and the complex & hard to reproduce bugs that those shifting dependency interactions can create.
Ok, but then again, some version might fix a security bug that has not been backported to your old version, especially if it's 5 years old.
Basically you're advocating against continuous integration (I'm talking about the "practice", not talking about "the tool that runs automated tests that people call CI").
> In this case, it doesn't matter to me which dependency update broke it, what matters to me is to have the tests passing again with all dependencies.
Knowing which change (or changes) broke it can make resolving the issue much faster.
> If a version is done then why is there a new release? Using old versions is tech debt that will one day blow up and cost much more to correct than if it had been corrected over the time
Because <new feature> was added, that is totally irrelevant to your use case.
> not to mention the security risk.
As parent mentioned, there is automated tooling for this. Your tooling yells that you are using a version of a package with a vulnerability, so you update.
> Another dependency might decide to use that dependency in a newer version, in which case aren't we all better off using the latest versions of everything?
If you don't update A, it doesn't matter if a newer version of A wants a newer version of B.
> The cost is some effort, the benefit is more features, security, performance, less bugs, basically a better program.
This assumes bugs and vulnerabilities decrease monotonically over time. This isn't true.
> Then why upgrade at all when you have a dependency with a security issue? After all, in your upgrade you might be adding even more unknown security issues that might be even more dangerous.
Because a known (to the world) security flaw is orders of magnitude more dangerous than an unknown (to the world) one, all else being equal. If there is a CVE for it, there are likely large-scale attempts at exploiting it anywhere it can be found.
> If I can test well that a newer version of a dependency works for me, why not upgrade it? There might be performance, security or other bugfixes, and I'm allowing other maintainers of other dependencies to also use that newer version.
Large amounts of labor. Furthermore, "well-tested" may include "battle tested". Some bugs make it through to deployment, and get caught and fixed. Updating dependencies without a good reason means more potential bugs slipping through, which means more bugs being discovered in deployment and a worse experience for the end user.
The day packaging for distros is easy, we will use it. Right now, making a deb is hard. Isolating 2 projects with different deb versions is hard. Distributing debs is hard. Upgrading your OS but not your python debs is hard.
Then rinse and repeat for Red Hat, Arch, Nix, macOS and Windows?
Since Nix packages are distribution-independent, once you have it packaged with Nix, you could theoretically skip packaging for Ubuntu etc., but of course that may raise the bar of entry for your users.
It doesn't work on Windows, so you lose half of your users. And WSL is not the answer to that.
Also, unless you repackage thousands of compiled C extensions, play well with Anaconda, and can plug into the entire python ecosystem of platforms such as Heroku, PythonAnywhere, Databricks and so on, you then lose 90% of the rest.
As a "self taught and still learning developer-lite", I love the python language, but the ecosystem drives me nuts. I feel a lot of the pain expressed in the article, and it pretty much speaks to my current conclusion of "I'm trying to do things the 'right way' but there doesn't seem to be a 'right way'".
I've seen a few comments here about how Nix/NixOS fixes the whole python binary/library mess, but I'm having trouble understanding how. Does anyone have any insight to share about that?
Additionally, the whole thing kind of makes me want to move away from python wholesale. I was wondering if there are other languages that are great general languages like python that don't suffer from this whole packaging and versioning mess. Ruby? Go? Something else? I'm looking for something high level, somewhat easy to learn, and with good library support for things like working with databases and tabular data. Though I don't know much about them, I just feel like I don't want something like Java or C++ or anything like that. I want to "get things done" and not have to worry about tons of boilerplate or working at really low nitty gritty levels.
> I've seen a few comments here about how Nix/NixOS fixes the whole python binary/library mess
Nix is a bit of a cult. Its theoretical aims are laudable, but in order to get there it forces you to do a lot of work and reason strictly in its own way. Whether all this work is worth the rewards, I think is open for debate.
Chemistry is a bit of a cult. Its theoretical aims are laudable, but in order to get there it forces you to do a lot of work and reason strictly in its own way. Whether all this work is worth the rewards, I think is open for debate.
Scala is easy to learn coming from Python, even more so if you can start with Scala 3 right away. You cannot really get more high-level than that. There are good libraries to work with databases; Quill comes to mind if you want something user-friendly. And you can always fall back to the many Java libraries in the big data ecosystem (even though you should probably avoid that if you can).
Interesting take. I use Python daily and virtualenv is really good for our use cases (running CI/CD, testing installation, and running in production in a container). I am not experiencing too many problems with pip + venv. We also maintain our own libraries and even those were relatively easy to set up. Mypy and yapf also help to maintain a style and type correctness (do not pass in None accidentally).
I do all of my server side Python development in a container. Whenever I start a new project, I create a dockerfile as a first step. This is the easiest way I have found to make my code portable. The main downside I have found is with IDE support. I use the latest tag, so I get updated containers periodically, which has not been a source of breakages. I have not figured out how to tell VSCode to use the Python in my container for statement completion and validation.
For personal projects on my desktop, I have given up and just use pip install, and then don't try to share those scripts.
Can someone make the case for distributing python packages at the distro level to me? Especially scientific software for data analysis?
Our project gets a few issues opened by distro maintainers who have trouble with some part of their build process. Is it really worth our project maintainers' time to help troubleshoot esoteric build processes when we already provide source, wheel, and conda distributions?
Users generally don't want Python packages; they want software. They don't care what it's written in, and shouldn't have to install it differently depending on what you chose.
“Just learn $language_package_manager_of_the_day, and hope it doesn't break anything when your system changes” is the equivalent of the 90s “./configure ; make ; make install”, and we should have moved past that for end-users by now. Most users are not developers, and installation procedures need to cater to both.
> If you’re distributing an application, shouldn’t you just ship an environment with our package bundled?
No-no-no-no! If you (application author & distributor) do so, you are obligated to release a new version of your application (which bundles 3rd-party libraries as its "environment") each time your dependencies are updated for security reasons. Do you know many application maintainers who want to do that?
If your application depends on system-installed (distro-provided) libraries (python modules, native libraries, whatever), it is the responsibility of distro maintainers not to keep versions with known security problems in the system-wide repo. It is much better for you (app author & maintainer) and me (distro user & admin), IMHO.
> shouldn’t you just ship an environment with our package bundled?
That's how I would prefer it, but, if the intended form of distribution is a distro package, then it should pin itself to the versions the distro provides and avoid (or vendor in) packages that aren't available.
It is possible to just place everything in the app directory and distribute it that way, but it's kind of ugly (and doesn't pick up security updates from the distro).
> Can someone make the case for distributing python packages at the distro level to me? Especially scientific software for data analysis?
As a matter of principle, I prefer to avoid software from non-curated repositories like pip and the like. Installing the debian-provided scipy and numpy is more than enough for me.
I don't understand what's so special about python that needs its own "package manager" when the distro-provided one is already good. If I need something, I install it using "apt install", regardless of what language it is written in.
What I suggested was for the package owner (I build a font that's also distributed as part of Debian) to do whatever they needed to make their lives easier. It was a long time ago, but, I remember they made a couple changes and improvements to the Makefile that streamlined their side of the operation. They maintain a fork of my repo.
I don't know what your project is specifically, but it's reasonable that a system-level user-facing application might want to depend on Numpy or Scipy.
The OP mixes together different tools used for different causes.
* setuptools/pip/poetry are package managers
* egg/wheel/? are package formats
* venv/conda are environments that isolate you from the OS
Another degree of complexity is added by the OS: its package manager and the ability to install as root vs. as a user. But at any layer, there's not much to get confused about.
Over 12 years with Python the only serious change I saw was setuptools (was it named like that?) => pip in 2011. The only serious issue I saw was when I installed with several managers (OS, setup.py, pip) and/or as root/user. That got solved in 15-30 minutes after checking the imported package __path__, and cleaning things up.
Teaching courses, I saw that the people who struggled the most never checked anything: it doesn't work? They'd try installing harder, or just re-run things. But they never checked where the packages were installed, or where they were being imported from.
Otherwise, I came to the following simple rules:
1) python & ipython are installed via an OS package
2) packages that have to be executed, like jupyter, may be installed as root or as a user; just never install both.
3) all other packages are installed as user. Computers are personal, and I almost never execute things as root, nor have other users.
Maybe being able to examine sources of errors was why I didn't feel it was so hard? I'm not a sysadmin, nor a hardcore programmer. It's just that I got used to checking paths and reading error messages attentively.
This does not mean it's easy. I wish things were simpler and there were fewer ways to do things, but the OP exaggerates, as if there were a thousand tools.
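For reference, the kind of check described above takes only a few lines; a sketch, with numpy standing in for whichever package is misbehaving:

```python
# Where is this interpreter, what will it import, and where did the package
# actually come from? Answering these resolves most installation confusion.
import sys
import numpy  # stand-in for the package you're debugging

print(sys.executable)   # which python is running
print(numpy.__path__)   # which copy of the package was imported
print(sys.path)         # the search path that made that decision
```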
The PSF won't fix it, its members are the problem. Dozens of people profit from useless churn that results in billable hours, and they actively purge anyone who opposes.
Blaming Python versus Linux distros isn't the important part. The implicit agreement that a system Python will be set up in a certain way, with a certain version, etc., is fragile. Programs and processes included with a Linux distro expecting a certain Python setup is a problem - it's making assumptions about system state. Python doesn't include tools for managing versions, dependencies, and standalone programs. Linux distros ship programs that depend on it despite this.
At the core of this is implicit cooperation between system programs, third party programs, and users. This concept underlies a common frustration with Linux: It works reliably out-of-the-box, but customization and installing programs introduces problems.
I wrote a Python dependency and version / installation Manager in Rust to help deal with this sort of thing, as well as related issues like dep conflicts between Python projects.
The flexibility the author decries is one of the strengths of the ecosystem. Oh, things don't work for data scientists on Windows? Here, use conda. A bunch of different webapps on the same server? Use venv. And so on and so forth; every niche will use what works best for them.
This is one reason Go has become so popular. If Python weren't such a mess, people could look past its poor performance, but combine that with the installation and dependency issues and it's just too much.
Virtual environments are a good way to have different packages for different applications running on different versions of Python. My usual rule is you don't mess with the system-provided Python environments for specific applications you are working with. I would even suggest dropping support for many packages at the distro level unless they are required by other non-python packages (the same way Django and Twisted are requirements for MaaS).
> don't mess with the system-provided Python environments for specific applications you are working with
Right, but then just use pip install --user instead of a virtualenv. Actually, --user is the default now when running pip install as a non-root user, so just pip install as a user will work and not mess with the system packages; just don't do sudo pip install.
> Right, but then just use pip install --user instead of a virtualenv
This seems almost as bad as system python. I suppose it's fine if you only work on one thing, but as soon as you don't, your dev environment will become chaotic and lots of confusing "works on my machine" situations will happen. E.g. this is why npm has a separate node_modules folder for each project.
> why npm has a separate node_modules folder for each project
This sacrifices time and disk space and sweeps tech debt under the carpet. I'd rather have a solution like `pip upgrade` that would upgrade all packages and fix the environment, like `pacman -Syu`, but people would have to stop pinning versions and actually maintain their codebases and the dependencies they use.
> but people would have to stop pinning versions and actually maintain their codebases and the dependencies they use
Virtual environments is the mechanism by which you do that in a non-silly way. How do you think people cope with a dependency that has a bug introduced in the most recent version?
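Concretely, the "non-silly way" above is a couple of lines per project; a sketch using only the standard library, with the package name and version as placeholders:

```python
# Create an isolated environment next to the project, then use its own pip
# to hold the one dependency whose latest release is broken at a known-good
# version, without touching any other project or the system Python.
import subprocess
import venv

venv.create(".venv", with_pip=True)
# POSIX layout; on Windows the pip executable lives in .venv/Scripts/ instead.
subprocess.run([".venv/bin/pip", "install", "somelib==1.4.2"], check=True)
```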
I've run into problems when running pip install --user. Someone installs a package in the user environment, and that works. But later you'll install a package in a shared virtualenv (because a Python program needs to run as a service or be run by other users or whatever) and pip doesn't install it because it's already in the user environment. However, other users don't have access to that environment, and they will get an import error that you won't see from your user.
Not to mention that installing every dependency in the same environment is a recipe both for disaster from version conflicts and for bloat, when you don't really know which packages belong to which applications.
Interesting problem, but I think it's more a problem in virtualenv, which doesn't use the global site packages by default; I'm surprised that it uses user site packages by default.
However, installing every dependency in the same environment is also what distro package managers do, I don't see that as a recipe for disaster; it all depends on how well said packages are maintained, in which case just upgrading them should fix whatever problem arises.
> However, installing every dependency in the same environment is also what distro package managers do, I don't see that as a recipe for disaster,
Distro packages are done in a way where it's hard to get conflicts. I haven't found an instance where python3-X and python3-Y can't be installed because one requires python3-Z v1.2 and the other python3-Z v2.1. With Python packages, that happens way more often.
Also, distro package managers keep track of everything installed, while python ones (at least pip I'm 100% sure) don't. You could install a package which upgrades another one, and that breaks another that you had installed before and pip won't say a word.
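For what it's worth, pip does ship a `pip check` subcommand that reports exactly this kind of broken requirement after the fact, and something similar can be scripted from the installed metadata. A rough sketch, assuming the `packaging` library is available:

```python
# Walk every installed distribution and flag requirements that the current
# environment no longer satisfies -- roughly what `pip check` reports.
from importlib.metadata import PackageNotFoundError, distributions, version
from packaging.requirements import Requirement

for dist in distributions():
    for req_str in dist.requires or []:
        req = Requirement(req_str)
        # Skip requirements that only apply under an extra or another environment.
        if req.marker and not req.marker.evaluate({"extra": ""}):
            continue
        try:
            installed = version(req.name)
        except PackageNotFoundError:
            print(f"{dist.metadata['Name']}: missing dependency {req}")
            continue
        if req.specifier and installed not in req.specifier:
            print(f"{dist.metadata['Name']}: needs {req}, found {installed}")
```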
> I haven't found an instance where python3-X and python3-Y can't be installed because one requires python3-Z v1.2 and the other python3-Z v2.1. With Python packages, that happens way more often.
Exactly, because the packages were created/updated at a point in time when they were compatible: a distro ships python 3.x with packages of python modules that are compatible with python 3.x.
Now if you as a user want to use a python package that works with python 3.y, you either have to wait, or install python 3.y, or use a container of python 3.y.
Again, if all python package were up to date as in "working together at this point in time" then `pip upgrade` would work.
> Also, distro package managers keep track of everything installed, while python ones (at least pip I'm 100% sure) don't.
Are you sure you checked in the `*.dist-info` directories? There should be one per package, containing: a METADATA file, with all dependency versions, a RECORD file, with every installed file and their hash, and much more! see for yourself ;)
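The same information can be read through the standard library rather than by opening the *.dist-info files by hand; a quick sketch, with requests standing in for some installed package:

```python
# version() and requires() come from METADATA; files() lists what RECORD tracks.
from importlib.metadata import files, requires, version

print(version("requests"))              # installed version
print(requires("requests"))             # declared dependency constraints
print((files("requests") or [])[:5])    # first few files recorded for the package
```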
> You could install a package which upgrades another one, and that breaks another that you had installed before and pip won't say a word.
Let's agree to disagree right there:
- in your point of view: this is a problem in pip
- in my point of view: this is a maintenance problem in the python packages themselves, caused by the general practice of pinning
> Again, if all python package were up to date as in "working together at this point in time" then `pip upgrade` would work.
That is impossible, it will never happen. Packages will have different maintenance cycles, some will get deprecated, others abandoned... You can't base your upgrade policy on an impossible situation.
> Are you sure you checked in the `*.dist-info` directories? There should be one per package, containing: a METADATA file, with all dependency versions, a RECORD file, with every installed file and their hash, and much more! see for yourself ;)
I know, but pip doesn't track all of those all the time. An example that has happened to me multiple times: install package X that depends on Y <= 1.0. Good, pip installs the proper version. Now, on another command, install package Z that depends on Y >= 2.0. Pip will install Y >= 2.0 and won't care that package X is now broken.
> - in my point of view: this is a maintenance problem in the python packages themselves, caused by the general practice of pinning
Regardless of your views on pinning, it's a problem in pip. A version conflict should be reported as an installation failure, not allow you to continue.
And again, version restrictions will always be there. The "live at master" philosophy only works for small groups of similar output capacity. It won't work for an ecosystem as wide as Python. Even without version pinning, you'll still have packages breaking because another one was updated. Sometimes it will be necessary, such as for example a package dropping support for a feature that another one needs.
> That is impossible, it will never happen. Packages will have different maintenance cycles, some will get deprecated, others abandoned... You can't base your upgrade policy on an impossible situation.
You realize that if that was "impossible" and "never happening", then absolutely no Python environment would be working ever?
> A version conflict should be reported as an installation failure, not allow you to continue.
Please don't make this mandatory.
> Pip will install Y >= 2.0 and won't care that package X is now broken.
Fine, I'll just quickly fix X and open a pull request, like I probably did a hundred times, then eventually if necessary deploy my fork with the fix meanwhile they do their maintenance release.
Should we not take responsibility for the dependencies we use and contribute back?
> The "live at master" philosophy only works for small groups of similar output capacity. It won't work for an ecosystem as wide as Python.
I don't understand why, but I'm talking about "live at latest release", not "at master".
> Sometimes it will be necessary, such as for example a package dropping support for a feature that another one needs.
Then the package can just copy the code for that feature into its own codebase, until a new lib provides it; I remember having to do that twice in 20 years (except I didn't just "paste" it, but implemented a much smaller version).
Overall, it seems my approach produces versions of my software, and of the software I depend on, that are compatible with all versions, because you can always use an earlier version if you really want to deploy on an old python or whatnot, whereas your approach leads to broken packages, tech debt, and blaming the package manager.
> You realize that if that was "impossible" and "never happening", then absolutely no Python environment would be working ever?
No, it means that you can't be fully sure that upgrading everything to the latest version is not going to break anything. And right now, you can't. You can only do that in controlled environments (say, a distro's official package repositories), not with Python packages.
> Please don't make this mandatory.
It is mandatory in quite a lot of package managers. APT, for example, will refuse to install packages with conflicting versions or breaking things. It's better to fail early with a clear message than to have a later failure where the cause is unclear.
> Fine, I'll just quickly fix X and open a pull request, like I probably did a hundred times, then eventually if necessary deploy my fork with the fix meanwhile they do their maintenance release.
That's optimistic. What about detecting which package is causing the issue? What if the fix is not quick? What if the package developers are working on compatibility but it's going to take time?
> I don't understand why, but I'm talking about "live at latest release", not "at master".
Problem is similar. You can't live at latest release with software coming from wildly different developers, with wildly different policies on compatibility, versioning, breaking changes, bugs...
> Then the package can just copy the code for that feature into its own codebase, until a new lib provides it; I remember having to do that twice in 20 years (except I didn't just "paste" it, but implemented a much smaller version).
Again, pretty optimistic. You won't be able to do that with all packages. For example, right now I have a project that needs to work on Python 3.6 (among other packages). Latest numpy versions dropped support for Python 3.6, so the project needs to restrict numpy versions to maintain compatibility. I can't just take the latest numpy and patch it for Python 3.6.
> Overall, it seems my approach produces versions of my software, and of the software I depend on, that are compatible with all versions, because you can always use an earlier version if you really want to deploy on an old python or whatnot, whereas your approach leads to broken packages, tech debt, and blaming the package manager.
You also spend time doing maintenance and debugging on new installations to debug and fix compatibility issues, when those upgrades might not bring any value to the product. In my case, I do that debugging and fixing in a controlled environment, only when I decide to upgrade the packages, and maybe revert/pin the ones that don't have quick fixes. That's the difference. Once a given package is released/packaged for distribution, I want dependencies to be fixed so that every new installation works, and doesn't fail if some developer decided to break things that day.
> you can't be fully sure that upgrading everything to the latest version is not going to break anything.
Well, it's still what you do every time you create a new project.
> It is mandatory in quite a lot of package managers.
Ok but many packages won't be installable at all anymore, because people pin different versions instead of maintaining their packages by continuously integrating upstream releases.
> You can't live at latest release with software coming from wildly different developers, with wildly different policies on compatibility, versioning, breaking changes, bugs...
Well, I didn't know I couldn't, so I did.
> I can't just take the latest numpy and patch it for Python 3.6.
Instead, you should be upgrading your code to support the newer numpy version. Seriously, try it out, you'll be spending less effort at the end of the year.
> You also spend time doing maintenance and debugging on new installations to debug and fix compatibility issues, when those upgrades might not bring any value to the product.
I spend less time and more spread over the year, bringing value to the whole ecosystem of packages which are bringing value to my product, and bringing value to myself as I develop new products with the dependencies I like and support as such.
> In my case, I do that debugging and fixing in a controlled environment
I wait for CI or users to report that a new release isn't compatible so I can fix it as soon as possible, in which case we temporarily pin versions, instead of piling up tech debt until it's so huge the whole project becomes trash.
That's what I call taking responsibility with the dependencies that you include in your product.
> Well, it's still what you do every time you create a new project.
Yep, and it is a pain in the ass when the package you want to use creates a dependency conflict. Luckily, at that stage I can still search for an alternative with little to no cost.
> Ok but many packages won't be installable at all anymore, because people pin different versions instead of maintaining their packages by continuously integrating upstream releases.
And some packages won't be installable anymore because they use deprecated APIs, or were only compatible with Python 2, or expected certain files/folders/programs to be present on the system and they aren't anymore. Packages go into abandonware all the time, removing version pinning doesn't fix that.
> Instead, you should be upgrading your code to support the newer numpy version. Seriously, try it out, you'll be spending less effort at the end of the year.
Thank you for telling me this, I will be relaying to my clients that they need to upgrade their systems to newer Python versions to get new features, surely that will go well.
> I wait for CI or users to report that a new release isn't compatible so I can fix it as soon as possible, in which case we temporarily pin versions, instead of piling up tech debt until it's so huge the whole project becomes trash.
Well, I prefer to not have my program crashing in client environments because someone decided to update a package and break things.
> piling up tech debt until it's so huge the whole project becomes trash.
Waiting to release on new dependency versions until those versions are tested is not tech debt. I mean, semantic versioning was done with this purpose precisely: to signal which upgrades are just bug fixes and you should upgrade ASAP, which ones are new features so you can upgrade without too many problems, and which upgrades break compatibility and you should test it thoroughly in case something broke.
If living at latest release works for you and dealing with dependency issues on new installations is not a problem, then go. But other systems will require that a certain release installs its dependencies in a consistent manner. Some of my dependencies are automatically upgraded and tested before release; others are pinned because they always break things and I only upgrade them manually. But once a version of a package is released, it goes out with version pinning, because I prefer to avoid the risk of a client installing the software and having to explain to them that "oh, it's just that this dependency released a new version that broke things and we didn't specify the version that our software needed".
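For the semantic-versioning signalling both sides keep referring to, the usual translation into PEP 440 specifiers looks like this; a sketch using the `packaging` library, with illustrative version numbers:

```python
# ~= accepts "compatible releases" (bug fixes, or bug fixes plus minor versions,
# depending on how many components you give it); == is a hard pin.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

assert Version("1.4.9") in SpecifierSet("~=1.4.2")       # bug-fix releases only
assert Version("1.5.0") not in SpecifierSet("~=1.4.2")
assert Version("1.9.0") in SpecifierSet("~=1.4")         # new minor versions allowed
assert Version("2.0.0") not in SpecifierSet("~=1.4")
assert Version("1.4.2") in SpecifierSet("==1.4.2")       # hard pin: upgrade only deliberately
```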
> Packages go into abandonware all the time, removing version pinning doesn't fix that.
Abandoned dependencies should be treated like dropped features: re-implement them in your own code or create another library.
> Thank you for telling me this, I will be relaying to my clients that they need to upgrade their systems to newer Python versions to get new features, surely that will go well.
Very funny, but seriously, they should have an upgrade path, or install their machine once and then never do a single upgrade; but again, I'd say this is just tech debt piling up.
> Waiting to release on new dependency versions until those versions are tested is not tech debt.
Not having your own tests is tech debt though. I wouldn't rely on tests made by others.
> I mean, semantic versioning was done with this purpose precisely: to signal which upgrades are just bug fixes and you should upgrade ASAP
Of course I agree; the problem is that maintainers tend not to upgrade as often as they should, i.e. not until they really have to (another dependency upgrades a shared dependency). Again, I see this as a maintenance problem rather than a package manager problem.
> If living at latest release works for you and dealing with dependency issues on new installations is not a problem, then go.
I'm not saying I never had a problem, of course I do; I'm saying that having these problems has a better cost/benefit ratio at the end of the year, because: 0. they are smaller, dealing with one BC break at a time, and 1. they are spread over time because I integrate upstream releases continuously instead of waiting for the day I want to upgrade everything.
> "oh, it's just that this dependency released a new version that broke things and we didn't specify the version that our software needed"
I'd rather say "oh, it's just that this dependency released a new version that we are implementing support for as we speak, here's the command you can run to fix it meanwhile: ...".
But if you don't want to have that problem, just run your CI periodically and make sure the first thing you do in the morning is check that the nightly run actually passed its tests. You're talking about paid software maintenance; I'm talking about both paid and volunteer work, which is why I include "wait for a user to report", but of course that doesn't apply to paid maintenance.
To me, we're seeing a discussion between two fundamentally opposite approaches, one defensive and the other offensive (actual continuous integration). From my experience, the offensive strategy offers a better cost/benefit ratio at the end of the year.
Apt and desktop environments need to break their Python dependencies then, or stop doing daft things like colliding different 3.x versions on the same runtime path.
Go ahead and install pip3 from Python 3.6, use that to install pip3 for 3.8, and try using apt. Back up first.
Your last sentence doesn't parse for me at all. How exactly do you get a pip installed on Python 3.6 to modify anything about another install of Python, let alone a different version?
Use pip3 in 3.6 to install the latest pip3 that requires Python 3.8, and drops that package in the dist-packages folder, which is shared between system Python runtime versions.
Behold as further attempts to do something sane with APT blow up, because now you have 3.8 packages on 3.6's runtime path.
I've lost weeks to this particular brand of distro daftness. I thought that surely no one would do that... Alas...
Ah, so that's what happened in my anecdotal story in another comment in this thread [0]. But why would you need a newer python3 anyway? The package maintainers ship 3.6 by default for a reason and they know better than you! /s
In my case? Recreating a build environment in the process of debugging an issue with a build script and trying to get the runtime environment lined up just so; with a side objective of an exploration of the madness of how Python as a language handles packaging (part of which was what kind of footguns are presented by the lang specific package manager, which funnily enough also seems to be completely oblivious to the runtime version it's executing in).
>The package maintainers ship 3.6 by default for a reason and they know better than you! /s
Maybe then! Not anymore! It really does infuriate me because it breaks every aspect of the principle of least surprise and conventional software packaging practice.
> the dist-packages folder, which is shared between system Python runtime versions.
I mean... I understand how this could in theory make some sense: a distro might naively expect to ship the same package version for different Python versions, so all files could be the same except for native extensions, which are disambiguated by implementation name. One could imagine doing this to save some space in case someone installs the same package for multiple Python versions.
But in reality? Jesus fucking christ. However, I don't see how this can be blamed on Python. This isn't something that "just happens because Python" and it is certainly far, far away from any kind of default Python configuration, where all default paths are below the prefix, which contains the Python version. This is something that can only happen because someone went extraordinarily far out of their way to shoot someone else's foot off.
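A small illustration of the disambiguation mentioned above: compiled extensions carry a per-interpreter tag in their filename, while pure-Python files and the install path itself do not, which is why sharing one directory between 3.6 and 3.8 half-works and then blows up.

```python
# Print the extension suffix and the pure-Python install path for this interpreter.
import sysconfig

print(sysconfig.get_config_var("EXT_SUFFIX"))   # e.g. '.cpython-38-x86_64-linux-gnu.so'
print(sysconfig.get_paths()["purelib"])         # where pure-Python packages land
```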
While I agree with this statement, it's also kind of the distro's fault for making it so easy to mess with system Python. If I was ever to design a distro and needed a "system Python" I would make sure that it would be completely isolated from and invisible to users. It would only be upgraded as part of an overall major OS version upgrade.
Given, and yes I've heard that before, but I counter with:
If you don't want someone to mess with it, name it as something other people will not regularly have a reason to muck with. One concerted effort to ship distros with a "sys-python" symlink instead of /usr/bin/python, have everything that currently uses the system interpreter point at the newly minted sys-python, and that stuff can be free of wayward travelers and system sculptors like me mucking with things.
No, pyenv is not the answer. Naming things is.
And trust me. I get it. It's a rite of passage learned in due time, yada-yada. Got it. If we ever want to get this stuff easier to use though, we have to be willing to help draw clean distinctions between this and that. Virtual environments don't do that until it is far too late and a beginner is already in and over their head in breakage.
As far as I can tell, they're a strict requirement, if you ever work on more than one project and want to isolate the dependencies (because you need different versions of the same package, because you want to ensure you've tracked all deps, whatever).
Is it that you don't isolate dependencies between your python projects at all, or is there some other solution you prefer?
Python is basically unusable without venvs in my experience. Not sure what your alternative suggestion is. I hope you don't think global pip install is the right way.
In an enterprise world, you can't rely on dependencies being present on the target machine, so install a .venv/ directory in your application distribution containing all the libraries (populated by pip) and a bin/venv-python wrapper script to set PYTHONPATH correctly and to call the .venv/bin/python. Then create a bin/run-app script to call bin/venv-python, with the location of your entry point module from src/python/
In summary: bin/run-app -> bin/venv-python -> .venv/bin/python -- I'm not sure how to make it simpler than this. The cost is perhaps in disk space, but I don't care about that.
Or don't do the above and deploy via docker and use pip to install to the system python in the image.
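As a rough sketch of what the bin/venv-python wrapper described above could look like (the bin/.venv/src layout and names are the parent comment's convention, not a standard):

```python
#!/usr/bin/env python3
# bin/venv-python: point PYTHONPATH at the application sources and exec the
# bundled environment's interpreter with whatever arguments were passed in.
import os
import sys
from pathlib import Path

app_root = Path(__file__).resolve().parent.parent
venv_python = app_root / ".venv" / "bin" / "python"

env = dict(os.environ, PYTHONPATH=str(app_root / "src" / "python"))
os.execve(str(venv_python), [str(venv_python), *sys.argv[1:]], env)
```

bin/run-app would then be the same idea, with the entry-point module appended to the arguments.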
Python's trend of "reinvent it yourself" engineering leads inexorably to a sprawling mess of duplicated and incompatible projects. The source of this problem is Python's lack of "lead by example" on the topic of DRY code / implementations.
With Perl modules (like Python packages), the intent is to never have to rewrite them, but to make them so generic that they are just extended by new modules (that inherit the old modules) to gain more features. Their naming convention reinforces this intention, with "Net.pm", "Net/IMAP.pm", "Net/IMAP/SSL.pm", etc. Each one of those files can be uploaded to CPAN by a completely different developer, without having to completely re-write all the base stuff from scratch. You can already do that with Python, but because everyone names their packages "fizzbaggy", "unclib2", "OtherThingHere", etc, there's no sane convention that clearly tells you what this thing is, what it does, or what it depends on.
The end result of all that is a nightmare in terms of regular users figuring out how to install and use most Python scripts, packages, and tools. This is part of why Go is so popular now: no need for an engineering degree just to run a Python program!
I do not believe there is any way for Python to re-invent itself and suddenly become less sprawling or confusing. The shift would be too huge and take too long.
It seems a bit weird to criticise lots of people for trying to solve the dependency problem. I know lots of people hold up npm as the standard. Npm came out around 10 years ago, Node around 12. Both of which had the benefit of hindsight at how the problem had developed for other languages. Python came out 30 years ago and pip came out 20 years later. Of course there is a lot of stuff left behind. What do people suggest, a breaking change, maybe? Python 4?
One thing that people do like about JavaScript is that you can import multiple versions of the same package. This solves that depA requires 1.0 and depB requires 2.0.
However, this makes auditing nearly impossible. Small projects can end up with well over a thousand dependencies, which is just unmanageable.
Linus should exert his "Linux" trademark rights by putting a representative from each of the major distros in a room and telling them to come to an agreement on a single packaging and filesystem layout for Linux, or they can no longer call their precious snowflake Linux. To me, this more than anything has hampered Linux adoption: there are too many versions of it.
If I understand correctly, this is the kind of problem Flatpak and Snap can solve. As a developer, relying on a distro's provided packages just seems too cumbersome - there are tens of distros, all with different versions of your dependency and different packaging systems. I guess the only way to distribute a Python application today that works across distros would be to build a single application that works simultaneously on many different python versions, not to mention the shared C libraries and other system software used by important Python packages. What do you do if you use a psycopg2 X that needs a libpq-dev version between 1 and 2 but the distro bundles an incompatible libpq-dev Y? You now have to support 2 versions of libpq-dev across the project, and all dependents of libpq-dev such as psycopg2?
The shared dependency model is just too complex to work with, and we have enough disk space that it's not really necessary anymore. Sandboxes seem like the way to go.
I got introduced to pip and virtualenv when I started coding in python in 2013-14.
Except for the system/3rd party python and python2/python3 confusion a couple of times, I've been able to set up many projects at multiple companies that quite a few people have worked on over the years without any problem just using pip and virtualenv.
Somehow, every time I searched for a better way of doing it, the new tools looked more confusing than the existing ones. And since the existing ones keep working very well, I haven't had any incentive to move to another tool that does even more magic under the wraps in the name of making things simpler, because when the required conditions aren't met, those tools simply tend to give up.
So I just set up the virtualenv with the appropriate python version and then use pip to install python dependencies. Projects have varied from webservers to computer vision and forecasting. And I have not faced any issues to date.
> I manage my Python packages in the only way which I think is sane: installing them from my Linux distribution’s package manager.
Actually, this is not sane.
If you have any Python dependencies, you should always develop and deploy your code using virtualenv, never by installing packages into the system Python!
1) If your distro requires Python, don't put it on the path. Refer to it another way if you need it, e.g. have a distroname-system-python package you upgrade at your own cadence.
2) That's it. Then developers can install Python how they like, and it's (probably) all fine.
The more I see package systems that don't work, the more I think Java, for all the criticism, got it right. Between always enforcing backward compatibility and being explicit about where to look for dependencies, everything just works.
Of course you still have problems with how to declare those dependencies and resolve upgrades, but code you compiled 10 years ago will probably still work today. Compare that to Python or, even worse, nodejs, and it's bonkers the amount of context you have to be aware of just to make code that worked fine 2 years ago build again on the current toolchain.
> What is it about Linux distros that makes our use-case unimportant? Have we offered no value to Python over the past 30 years? Do you just feel that it’s time to shrug off the “legacy” systems we represent and embrace the brave new world of serverless cloud-scale regulation-arbitrage move-fast-and-break-things culture of the techbro startup?
The thing is, people in development won't use the distro python (but different installs, pyenv, etc.), and people in devops won't use the distro python either...
So, what values does it offer to whom? Except maybe scripting for sysadmins?
> Every one of these package managers is designed for a reckless world in which programmers chuck packages wholesale into ~/.pip, set up virtualenvs and pin their dependencies to 10 versions and 6 vulnerabilities ago, and ship their computers directly into production in Docker containers which aim to do the minimum amount necessary to make their user’s private data as insecure as possible.
That's a pretty sobering look at the state of affairs.
The success of Python and Go shows that people don't care as much about what distributions maintainers feel is "quality packages".
The scale and quantity of software being developed is just too much for traditional Linux distributions approaches. I personally rely on Debian to know that most of the packages I use are reasonably secure. But often I need to sacrifice flexibility and bleeding-edgedness.
And it couldn't be otherwise. There are 339,267 projects on PyPi.org. I bet a good percentage of these are riddled with security flaws. But people use them anyway. How many distro maintainers would you need to handle this workload? Is it worth it? Does anyone care?
It looks like people care much more about experimentation and speed of development than stability, security or coherence and cleanliness of the solution. The author seems to feel this is wrong from an engineering (or even moral?) perspective.
I am also uncomfortable when I have to deal with Python and I have my own favourite few tools... but perhaps if Python users really wanted a single solution then it would already exist.
If you are instead convinced that there is this need and no adequate solution, then congratulations, you just found a gap in the market. Go on and do better than everyone else before you. Relevant XKCD comic is already in the article...
As an end user, I definitely care that my software works, doesn't crash, and doesn't leave my machine wide open for attacks. I don't want things to be this chaotic
I once attempted to install a program that was distributed via PyPI (pip was the only way I could install software on that server). The shebang in the script file was #!/bin/python. That's python2 on most Linux distros, and so it couldn't run.
That's weird; the console_scripts hook for python packages generates a script whose shebang points at the same python that pip was executed with to install said package.
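For context, this is what the console_scripts hook looks like on the packaging side; a minimal setup.py sketch with placeholder names. The launcher pip generates from it gets a shebang pointing at the installing interpreter, so a bare #!/bin/python suggests the package shipped its own script instead.

```python
# setup.py for a hypothetical package exposing one command-line entry point.
from setuptools import setup

setup(
    name="exampletool",
    version="0.1",
    py_modules=["exampletool"],
    entry_points={
        "console_scripts": [
            "exampletool = exampletool:main",  # command name = module:function
        ],
    },
)
```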
Nix solves a lot of Python packaging problems and I'm a happy user. But you lose some of the convenience of plain pip. For example, with Nix, it's no longer easy to mix and match versions of different packages. You just get the package versions that happen to live in the snapshot of nixpkgs you're using. If I want to use an older version of a package than the one in nixpkgs, I now have to add an override for that package and pray that changing the version/hash is enough.
If you avoid placing packages in the distribution's system Python (by using a venv) and if you aren't using some outdated distro version, your Python should be good enough. You must write backwards compatible code anyway, if your application is to last more than six months.
It is not the end of the world to be using one or even two minor versions older than the current Python version. New features must be implemented in your code and relying on bleeding edge features of your language (in production code) is outright bad design.
If the core Python developers like to experiment and break their language, you are best to avoid jumping head first into that mess.
TL;DR:
Keep your requirements conservative. You are not beholden to the Python core developers; your target is your users.
The biggest insight in software dependency management is that applications and libraries are different. Both have dependencies but they sit at different places in their dependency graphs.
A library can be used together with other libraries and so cannot pin its own dependency versions (if all libraries did so there would conflicts everywhere) but instead can only specify constraints on its dependency versions (eg >=1.0.0).
An application sits at the front of the dependency graph. Nothing depends on it. It can therefore lord it over its dependencies, pinning everything in the graph to specific versions. This only works, though, if it doesn't have to share an environment with other applications and unrelated libraries.
Systems like poetry (akin to npm or cargo) allow a "lock file" to be generated with pinned versions for all dependencies, satisfying all the version constraints. Applications must commit this to revision control, libraries can if they want (should IMO). This is great as it allows CI and other devs to use consistent versions.
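To make the library side of that distinction concrete: a library declares ranges in its packaging metadata and leaves exact pinning to the application's lock file. A sketch with illustrative names and bounds:

```python
# setup.py for a hypothetical library: constraints, not pins, so that it can
# coexist with other libraries in the same environment.
from setuptools import setup

setup(
    name="examplelib",
    version="1.0.0",
    install_requires=[
        "requests>=2.20,<3",   # a range leaves room for the application and other libraries
        "packaging>=20",
    ],
)
```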
The missing piece is that for an application, the lock file should also be used for deployment of the app. If you can ensure that an application is installed in an isolated environment with the dependency versions from the lock file (i.e. that the app was tested against) then a lot of the pain disappears.
So, the suggestion:
* Add support to standard python distribution (wheels, pypi, etc) for application packages to specify (in addition to ordinary version constraints) a pinned set of "preferred" dependency versions.
* Have tools like poetry/pipenv set these to the lock file versions.
* Allow a notion of "application packages", which are required to have this information in.
This should work very nicely with tools like pipx that deal specifically with python applications. The relevance to linux distros is that linux distros also should only be packaging applications (and their dependencies). Developers using libraries should be managing them with tools like poetry/pipenv, not the system package manager.
If the system package manager could install python applications in their own isolated environments along with their pinned dependency versions, most of the pain goes away for distro maintainers. If an application isn't working with the dependency versions it has specifically asked for, it's a clearcut problem with the application as published, and needs to be fixed upstream.
I realise this would be a significant change for package managers, but I think the same model makes sense for other languages with similar tooling and at least some of the work should only need to be done once.
Some Nix tools do this for Python, giving you separate tools for building libraries and applications. Overall, you're describing how Nix/Nixpkgs works today.
> A library can be used together with other libraries and so cannot pin its own dependency versions (if all libraries did so there would conflicts everywhere) but instead can only specify constraints on its dependency versions (eg >=1.0.0).
This is a Python defect. IIRC, for example, Node libraries can and do pin their dependencies without conflict, because Node has no problem using multiple versions of the same library in a process.
Python can't do this. It could (there have been proofs of concept that make this work), but it doesn't.
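A tiny illustration of why a single Python process can't hold two versions of one library: imports are cached in sys.modules keyed by module name alone, so any later import of "another version" just returns the object already loaded.

```python
import json
import sys

assert sys.modules["json"] is json   # the cache entry is the module object itself
import json as json_again            # re-importing does not create a second copy
assert json_again is json
```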
>> An application sits at the front of the dependency graph. Nothing depends on it.
In many cases, applications depend on other applications.
This can easily be seen in most Linux package management systems and also in many applications that act as more user-friendly front-ends or automation for command-line utilities.