
I find it very telling that both memory hotspots were explicit loops, and the fixes were to use implicit looping constructs (range and a list comprehension). I wonder if the underlying problem was the loop, or the constant append() calls.



That's an interesting point - there are competing factors here. The list comprehension is usually a bit more efficient because it doesn't have to look up the list and its append method on every iteration, I think. Take a look at this (albeit a bit old) article:

https://wiki.python.org/moin/PythonSpeed/PerformanceTips#Loo...

I imagine that lookup overhead would happen on every iteration, but the list add only happens when the conditional evaluates to true - so it seems a bit less likely that the append is the overhead. Maybe the cost of growing the list compensates for the infrequency, though CPython over-allocates on resize and serves most allocations from its own pooled allocator rather than a syscall, so that cost is amortized.

Generators would help with this too: instead of materializing the full list and concatenating every time (i.e. [2] + [x for x in s if x]), you could do something like itertools.chain([2], (x for x in s if x)). Then you avoid ever allocating a massive list -- provided that the rest of your code avoids that too.
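
A minimal sketch of that, assuming s stands in for the real filtered sequence (the sample values here are made up):

    import itertools

    s = [3, 0, 5, 0, 7]  # hypothetical stand-in for the real data

    # Eager version: builds the whole list in memory at once
    eager = [2] + [x for x in s if x]

    # Lazy version: yields the same values one at a time,
    # so the large intermediate list is never allocated
    lazy = itertools.chain([2], (x for x in s if x))

    for value in lazy:
        print(value)

As long as the consumer iterates over the result rather than calling list() on it, peak memory stays flat.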


Well, the fact that they're running it in PyPy makes all of this that much more challenging to debug - you'd have to look at the JIT-compiled traces, not just the bytecode, to identify the actual causes.


Localizing append helps, but that's not all of it:

Building a list of 10,000,000 items:

  List comprehension:              599 ms
  For loop:                        930 ms
  For loop with append localized:  736 ms
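
A rough sketch of a benchmark along those lines (my own variant names; exact timings will depend on machine and interpreter version):

    import timeit

    N = 10_000_000

    def comprehension():
        return [i for i in range(N)]

    def plain_loop():
        result = []
        for i in range(N):
            result.append(i)
        return result

    def localized_append():
        result = []
        append = result.append  # bind the bound method once, outside the loop
        for i in range(N):
            append(i)
        return result

    for fn in (comprehension, plain_loop, localized_append):
        print(fn.__name__, timeit.timeit(fn, number=1))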


Comprehensions have their own opcodes which directly call e.g. PyList_Append with no further checks or lookups in each iteration.
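
You can see this with the dis module; on recent CPython versions the comprehension body compiles to a single LIST_APPEND opcode per element, while the explicit loop has to load and call the append method every time (a sketch - the exact opcode names vary between Python versions):

    import dis

    def with_comprehension(s):
        return [x for x in s if x]

    def with_loop(s):
        result = []
        for x in s:
            if x:
                result.append(x)
        return result

    dis.dis(with_comprehension)  # look for LIST_APPEND in the comprehension's code
    dis.dis(with_loop)           # look for the method load and call around append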



