> If you started only making new stuff in Python 3 in 2010, two years after Python 3 came out, how much Python 2 would you have to convert?
In many organizations there was never a time when they could start writing new code in Python 3. They needed to write code that was compatible with their existing Python 2 code, and the only way to do that was to keep writing new code in Python 2. Rinse, lather, repeat.
This is why the failure to provide a gradual transition was so bad. When I write new code in Python I use Python 3, but that assumes the Python 3 versions of the modules I need are available.
If you have infinite money this is not a problem. But I think we should be sympathetic to the people who do not have infinite money and have never been given a realistic upgrade path from 2 to 3. The 2to3 program is not a workable solution for many.
Not until 2015, with Python 3.5 around the corner.
That's when the interpreter finally got the minimum support required to make code compatible with both, and linters improved enough, and some libraries started being ported.
six 1.0 was released in 2011.[1] 50% of the top 200 packages were compatible by the end of 2012.[2] And there were features you could use as early as 2008 to make the eventual conversion easier.
> 50% of the top 200 packages were compatible by the end of 2012.
Which meant you usually could not convert, since ALL dependencies had to be converted first. The chance of doing that successfully then, with 300 libraries (including transitive dependencies), was approximately (0.5)^300, which is practically 0.
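A quick check of that back-of-envelope figure, under the (admittedly strong) assumption that each dependency was ported independently with 50% probability:

```python
# Probability that all 300 dependencies are Python 3 compatible, assuming
# each is ported independently with p = 0.5 (independence is a simplification;
# in reality the most popular packages were ported first).
p_all_ported = 0.5 ** 300
print(p_all_ported)  # on the order of 5e-91, i.e. effectively zero
```

Even with a far more generous per-package probability, the product over hundreds of dependencies stays vanishingly small, which is the point being made.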
Around 2015 is when lots of people found that all their dependencies were either compatible with Python 3 or abandoned and replaceable. How much work they had to do depended on whether they had put any effort into compatibility over the previous seven years, or even just followed the recommendations for Python 2. Writing 100% compatible code wasn't practical in 2008, but distinguishing bytes from text was.
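The bytes-vs-text discipline mentioned here is the "decode at the boundary, work in text, encode on the way out" pattern. A minimal sketch in Python 3 syntax (the function names are illustrative, not from any particular codebase):

```python
# Sketch of the bytes/text discipline that was practical even under Python 2:
# decode once at input boundaries, keep text internally, encode once at output.

def read_name(raw: bytes) -> str:
    # Decode at the boundary; everything downstream sees real text.
    return raw.decode("utf-8")

def write_name(name: str) -> bytes:
    # Encode only when the data leaves the program.
    return name.encode("utf-8")

wire = b"Ren\xc3\xa9"          # UTF-8 bytes off the wire
name = read_name(wire)
assert name == "Ren\u00e9"     # proper text internally
assert write_name(name) == wire
```

Code written this way under Python 2 (with `unicode` internally) needed far fewer changes when the `str`/`bytes` split became mandatory.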
I joined an AI company in 2018 that had been building its whole prototype in Python 2.7 since 2017. I had to spend two weeks doing the migration myself and hand a merge request to their lead engineers out of the blue; otherwise I am pretty sure they would be having a meeting tomorrow, 24/1/2021, to figure out how they are going to migrate.
What about code people wrote before 2010 that was perfectly fine? Are you going to have people rewrite research algorithms whose original authors have long since graduated?
Just because industry has a habit of rewriting the whole stack every five years as make-work job security doesn't mean foundational scientific algorithms change.
Except the Python folks produced tools that did almost all of the work for you. 2to3 has worked for an overwhelming number of use cases. And in fact, most code requires very few changes to begin with.
If this academic code isn't well understood or well tested, it's probably not as valuable as you might think.
2to3 does 95% of the job. The other 5% requires manually fixing up all the bits it missed, and until you do that, your codebase will be subtly (or not so subtly) broken. The most common problem I saw related to iterating over dictionary keys.
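That dictionary-keys problem is worth spelling out: in Python 2, `d.keys()` returned a snapshot list, so mutating the dict while looping over it was safe; in Python 3 it returns a live view, and the same pattern raises at runtime. A minimal reproduction:

```python
d = {"a": 1, "b": 2, "c": 3}

# Python 2 habit: keys() was a snapshot list, so deleting inside the loop worked.
# In Python 3, keys() is a live view, and mutating mid-iteration raises.
try:
    for k in d.keys():
        if d[k] > 1:
            del d[k]
except RuntimeError:
    pass  # "dictionary changed size during iteration"

# The fix is to snapshot explicitly; 2to3 wraps some call sites in list(),
# but it cannot find every place that depended on the old semantics.
d = {"a": 1, "b": 2, "c": 3}
for k in list(d.keys()):
    if d[k] > 1:
        del d[k]
assert d == {"a": 1}
```

The nasty part is that the broken variant passes any test that never happens to mutate during iteration, which is exactly why these bugs surface late.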
That 5% that requires manual fixing is the sticking point. You still need to audit every line of the codebase, and each line that gets missed is a guaranteed bug introduced by the conversion. This is not much of an issue for small scripts or tiny programs. It is an issue for big applications. This migration really highlights (yet again) the dangers of using interpreted languages at scale. With no compiler to pick up errors, no typechecking by default, etc., identifying all of the remaining faults is a huge task.
Like it or not, this is a huge risk to a business. There is a risk of introducing vast quantities of bugs, and there is a huge developer cost to performing the migration.
For the record, I have migrated several medium-sized codebases with 2to3 and python-modernize. Because these were internal tools with defined inputs and outputs, it was trivial to validate that the behaviour was unchanged after the conversion. But for most projects this will not be the case.
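The validation approach described here, diffing outputs of the old and new builds over recorded inputs, can be sketched as follows (the `legacy_tool`/`migrated_tool` names are hypothetical stand-ins for the real entry points):

```python
# Sketch of differential validation for a migration: run the pre- and
# post-migration implementations on the same recorded inputs and require
# identical outputs. The two functions below are placeholders for the real
# tool's entry point before and after conversion.

def legacy_tool(record):
    return {"id": record["id"], "total": sum(record["items"])}

def migrated_tool(record):
    return {"id": record["id"], "total": sum(record["items"])}

recorded_inputs = [
    {"id": 1, "items": [1, 2, 3]},
    {"id": 2, "items": []},
]

for record in recorded_inputs:
    assert legacy_tool(record) == migrated_tool(record), record
print("outputs identical for all recorded inputs")
```

This only works when the program's inputs and outputs are well defined and capturable, which is exactly the caveat the comment makes about most projects.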
The 2 to 3 conversion will be a textbook case of what not to do for decades to come. For the many billions that worldwide migration efforts will cost, the interpreter could have retained two string types and interpreted both old and new scripts. The cost would have been several orders of magnitude less.
I migrated a large codebase with 2to3 and six. It took me half a day. 2to3 did roughly 60-75% of the work and six cleaned up almost all of the rest. It's not foolproof, but it is surely better than nothing.
It's perhaps also worth noting that I did this in late 2017. Your experience likely varies depending on when you attempted it.
In the migration I was a part of (probably one of the larger ones, period), I think something like 90% of code could be migrated by automation, and of the remaining 10%, most needed only trivial human oversight.
That remaining two percent had a lot of painful things (truly, I have some stories), but "the overwhelming number of use cases" was trivial.
Not really. For example, R ships with or depends on a lot of FORTRAN libraries. I doubt they have changed much at all in decades. There is no talk of a breaking FORTRAN language change that would require rewriting this perfectly functional code with a stable interface.
If, of course, you spent an extra decade producing obvious technical debt, whose fault is it?