> If you started only making new stuff in Python 3 in 2010, two years after Python 3 came out, how much Python 2 would you have to convert?
In many organizations there was never a time when they could start writing new code in Python 3. They needed to write code that was compatible with their existing Python 2 code, and the only way to do that was to keep writing new code in Python 2. Rinse, lather, repeat.
This is why the failure to provide a gradual transition was so bad. When I write new code in Python I use Python 3, but that assumes the Python 3 versions of the modules I need are available.
If you have infinite money this is not a problem. But I think we should be sympathetic to the people who do not have infinite money and have never been given a realistic upgrade path from 2 to 3. The 2to3 program is not a workable solution for many.
Not until 2015, with Python 3.5 around the corner.
That's when the interpreter finally got the minimum support required to make code compatible with both, and linters improved enough, and some libraries started being ported.
six 1.0 was released in 2011.[1] 50% of the top 200 packages were compatible by the end of 2012.[2] And there were features you could use as early as 2008 to make the eventual conversion easier.
> 50% of the top 200 packages were compatible by the end of 2012.
Which meant you usually could not convert, since ALL dependencies had to be converted first. The chance of doing that successfully then, with 300 libraries (including transitive dependencies), was approximately (0.5)^300, which is practically 0.
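A quick check of that back-of-envelope figure, under the (admittedly strong) assumption that each dependency was ported independently with 50% probability:

```python
# Probability that all 300 dependencies are Python 3 compatible, assuming
# each is ported independently with p = 0.5 (independence is a simplification;
# in reality the most popular packages were ported first).
p_all_ported = 0.5 ** 300
print(p_all_ported)  # on the order of 5e-91, i.e. effectively zero
```

Even with a far more generous per-package probability, the product over hundreds of dependencies stays vanishingly small, which is the point being made.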
Around 2015 is when lots of people found that all their dependencies were either compatible with Python 3 or abandoned and replaceable. How much work they had to do depended on whether they had put any effort into compatibility over the previous seven years, or even just followed the recommendations for Python 2. Writing 100% compatible code wasn't practical in 2008, but distinguishing bytes from text was.
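The bytes-vs-text discipline mentioned here is the "decode at the boundary, work in text, encode on the way out" pattern. A minimal sketch in Python 3 syntax (the function names are illustrative, not from any particular codebase):

```python
# Sketch of the bytes/text discipline that was practical even under Python 2:
# decode once at input boundaries, keep text internally, encode once at output.

def read_name(raw: bytes) -> str:
    # Decode at the boundary; everything downstream sees real text.
    return raw.decode("utf-8")

def write_name(name: str) -> bytes:
    # Encode only when the data leaves the program.
    return name.encode("utf-8")

wire = b"Ren\xc3\xa9"          # UTF-8 bytes off the wire
name = read_name(wire)
assert name == "Ren\u00e9"     # proper text internally
assert write_name(name) == wire
```

Code written this way under Python 2 (with `unicode` internally) needed far fewer changes when the `str`/`bytes` split became mandatory.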
I joined an AI company in 2018 that had been building its whole prototype in Python 2.7 since 2017. I had to spend two weeks doing the migration myself and hand a merge request to their lead engineers out of the blue; otherwise I am pretty sure they would be having a meeting tomorrow, 24/1/2021, to figure out how they are going to migrate.
What about code people wrote before 2010 that was perfectly fine? Are you going to have people rewrite research algorithms whose original authors have long since graduated?
Just because industry has a habit of rewriting the whole stack every five years as make-work job security doesn't mean foundational scientific algorithms change.
Except the Python folks produced tools that did almost all of the work for you. 2to3 has worked for an overwhelming number of use cases. And in fact, most code requires very few changes to begin with.
If this academic code isn't well understood or well tested, it's probably not as valuable as you might think.
2to3 does 95% of the job. The other 5% requires manually fixing up all the bits it missed, and until you do that, your codebase will be subtly (or not so subtly) broken. The most common problem I saw related to iterating over dictionary keys.
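That dictionary-keys problem is worth spelling out: in Python 2, `d.keys()` returned a snapshot list, so mutating the dict while looping over it was safe; in Python 3 it returns a live view, and the same pattern raises at runtime. A minimal reproduction:

```python
d = {"a": 1, "b": 2, "c": 3}

# Python 2 habit: keys() was a snapshot list, so deleting inside the loop worked.
# In Python 3, keys() is a live view, and mutating mid-iteration raises.
try:
    for k in d.keys():
        if d[k] > 1:
            del d[k]
except RuntimeError:
    pass  # "dictionary changed size during iteration"

# The fix is to snapshot explicitly; 2to3 wraps some call sites in list(),
# but it cannot find every place that depended on the old semantics.
d = {"a": 1, "b": 2, "c": 3}
for k in list(d.keys()):
    if d[k] > 1:
        del d[k]
assert d == {"a": 1}
```

The nasty part is that the broken variant passes any test that never happens to mutate during iteration, which is exactly why these bugs surface late.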
That 5% that requires manual fixing is the sticking point. You still need to audit every line of the codebase, and each line that gets missed is a guaranteed bug introduced by the conversion. This is not much of an issue for small scripts or tiny programs. It is an issue for big applications. This migration really highlights (yet again) the dangers of using interpreted languages at scale. With no compiler to pick up errors, no typechecking by default, etc., identifying all of the remaining faults is a huge task.
Like it or not, this is a huge risk to a business. There is a risk of introducing vast quantities of bugs, and there is a huge developer cost to performing the migration.
For the record, I have migrated several medium-sized codebases with 2to3 and python-modernize. Because these were internal tools with defined inputs and outputs, it was trivial to validate that the behaviour was unchanged after the conversion. But for most projects this will not be the case.
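The validation approach described here, diffing outputs of the old and new builds over recorded inputs, can be sketched as follows (the `legacy_tool`/`migrated_tool` names are hypothetical stand-ins for the real entry points):

```python
# Sketch of differential validation for a migration: run the pre- and
# post-migration implementations on the same recorded inputs and require
# identical outputs. The two functions below are placeholders for the real
# tool's entry point before and after conversion.

def legacy_tool(record):
    return {"id": record["id"], "total": sum(record["items"])}

def migrated_tool(record):
    return {"id": record["id"], "total": sum(record["items"])}

recorded_inputs = [
    {"id": 1, "items": [1, 2, 3]},
    {"id": 2, "items": []},
]

for record in recorded_inputs:
    assert legacy_tool(record) == migrated_tool(record), record
print("outputs identical for all recorded inputs")
```

This only works when the program's inputs and outputs are well defined and capturable, which is exactly the caveat the comment makes about most projects.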
The 2 to 3 conversion will be a textbook case of what not to do for decades to come. For the many billions that worldwide migration efforts will cost, the interpreter could have retained two string types and interpreted both old and new scripts. The cost would have been several orders of magnitude less.
I migrated a large codebase with 2to3 and six. It took me half a day. 2to3 did roughly 60-75% of the work and six cleaned up almost all of the rest. It's not foolproof, but it is surely better than nothing.
It's perhaps also worth noting that I did this in late 2017. Your experience likely varies depending on when you attempted it.
In the migration I was a part of (probably one of the larger ones, period), I think something like 90% of code could be migrated by automation, and of the remaining 10%, most needed only trivial human oversight.
That remaining two percent had a lot of painful things (truly, I have some stories), but "the overwhelming number of use cases" was trivial.
Not really. For example, R ships with or depends on a lot of FORTRAN libraries. I doubt they have changed much at all in decades. There is no talk of a breaking FORTRAN language change that would require rewriting this perfectly functional code with a stable interface.
If, of course, you spent an extra decade producing obvious technical debt, whose fault is it?