I'm annoyed at the reason that any/all have to be on this list. If they (and map, filter, …) were methods, you could just write `foo.` and your IDE could show you what methods are available. Postfix would make things easier to read too:
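Roughly (bar.baz(), some_filter and some_op are placeholder names, borrowed from examples further down the thread; the second line is hypothetical postfix syntax, not something Python supports):

min(map(some_op, filter(some_filter, bar.baz()))).foo()    # today
bar.baz().filter(some_filter).map(some_op).min().foo()     # postfix (hypothetical)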
To follow the flow of data/control, you start in the middle, go right, then skip left to filter, read rightwards to see which filter, skip left to map, read rightwards to see what map, go left to min, then skip all the way to the right. Just splitting it into multiple lines doesn't help, you need to introduce intermediate variables (and make sure they don't clobber any existing ones) and repeat yourself whether they clarify things or not. The same issue exists for list/dict/set comprehensions.
class WrappedList:
    _fns = [map, filter, min, max, all, any, len, list]

    def __init__(self, it):
        self.it = it

    def __getattr__(self, name):
        for fn in self._fns:
            if name == fn.__name__:
                def m(*args, **kwargs):
                    # The wrapped iterable goes last, after any positional
                    # args (e.g. the callable passed to map/filter).
                    result = fn(*args, self.it, **kwargs)
                    if hasattr(result, '__iter__'):
                        return self.__class__(result)
                    else:
                        return result
                return m
        raise AttributeError(name)

    def unwrap(self):
        return self.it
This allows you to do stuff like
WrappedList([1, 2, 3, 4]).filter(lambda x: x % 2 == 0).map(lambda x: x * 3).list().unwrap() # [6, 12]
WrappedList([1, 2, 3, 4]).map(lambda x: x >= 5).any() # False
Deciding whether or not this is something you should do, rather than just something you can do, is left as an exercise for the reader.
Python debugging is most sane when the code keeps things simple. After thousands of pdb sessions I can say most people should not be allowed to do this kind of thing in real code!
And herein we see a weakness of Python: there is no way to get rid of the lambda, lambda, lambda without actually naming things using def. Even though we are defining a pipeline of steps, we still have to put up with syntactic clutter. Compare with threading/pipeline operators in other languages.
One puzzling thing is that it uses backslash continuation in its examples. The most favoured style, IMO, is to use parentheses for line continuation; maybe the author just doesn't know about those.
The reason they are bad is that intermediate results are never named (and thus are never explained). In simple situations it's possible to infer from context what the author's intention was, but in more complicated cases, if you want to understand someone's code, especially code written the way you did, you have to "disassemble" it into simpler operations, name the variables (after investigating or guessing the purpose of each operation) and then try to piece together the full picture of what's going on.
Also, as a style suggestion: avoid using backslashes. In your situation, you could just wrap the expression in parentheses and put the dots at the end of each line; that's enough to not need the backslashes. They add noise to your code, i.e. characters that add no meaning, just a sort of "scaffolding" to hold your code together.
In my own python toolbox (specifically for the list-of-dictionaries use-case) I inject .log() calls into the pipeline as needed to show what the actual intermediate values are.
Naming intermediates is fine (and encouraged) if there are actually meaningful names to be given. But sometimes the expression itself is the shortest meaningful name for the expression.
Re backslashes, you can also just wrap the expression in parentheses.
A perhaps more appropriate name for your 'log' would be 'peek'. Log reminds me of logging, which usually does not return a value, but writes to stdout or a file or similar.
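A minimal sketch of what a peek could look like on the WrappedList wrapper from upthread (the name, and printing to stdout, are just assumptions; substitute real logging as needed):

class PeekableList(WrappedList):
    def peek(self, label=""):
        # Materialize the current values so they can be shown,
        # then keep passing them down the pipeline.
        values = list(self.it)
        print(label, values)
        return self.__class__(values)

PeekableList([1, 2, 3, 4]).filter(lambda x: x % 2 == 0).peek("after filter").map(lambda x: x * 3).list().unwrap()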
You wouldn’t really write it as you have in the second example though. The Pythonic way of writing something like this is to use list comprehensions or generator expressions, for example:
min(some_op(item) for item in bar.baz() if some_filter(item)).foo()
Or decomposed a little for clarity:
processed_items = (some_op(item) for item in bar.baz() if some_filter(item))
min(processed_items).foo()
This is pretty readable – a natural language description of the first line is “do some_op for each item in bar.baz that matches some_filter”, which corresponds 1:1 with the code.
Both examples seem pretty contrived and I think this comes down to just what language you are used to. Their other code seems very JS-y.
I work in both python & js. The python reads like natural language:
Processed items is a set of some transformation of each item in bar.baz() where something is true for that item.
then Foo the smallest in that list.
It reads like english.
JS-y stuff doesn't read like natural language, but I do think it's more concise and fits the IDE function discovery workflow better.
Both models can be made into horrid messes or elegant solutions. Both are highly readable.
Now I like the python one because I find it natural to attach contextual "whys" or "because" comments to them.
Processed items is a set of some transformation of each item in bar.baz() where something is true for that item.
then Foo the smallest item.
# because foo is a slow function and we don't want to foo every bar and baz
any, all, map, filter, min, max, for loops, zip, list, tuple, reduce, list comprehensions, cycle, repeat, islice, and so on in python work on iterables, and iterable is a protocol, not a class. it would certainly be interesting to program in a language where conforming to a protocol (perhaps one that nobody had thought up yet when you wrote your class) would give your class new methods, or where all iterables had to derive from a common base class, but it would be a very different language from python
incidentally in your example, though data does flow from top to bottom, control does not, assuming the filter and map methods are lazy as they are in python; it ping-pongs back and forth up and down the sequence in a somewhat irregular manner, sometimes reaching as far as .min() before going back up, and other times turning around at .filter(...)
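A quick way to watch that happen (the predicate and mapping are arbitrary; the prints are only there as instrumentation):

def noisy_filter(x):
    print("filter sees", x)
    return x % 2 == 0

def noisy_map(x):
    print("map sees", x)
    return x * 10

# min() is what drives the pipeline: it pulls one value at a time, so
# control bounces between min, map and filter rather than each stage
# running to completion before the next one starts.
print(min(map(noisy_map, filter(noisy_filter, [3, 2, 5, 4]))))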
i wonder if you could implement the ide functionality you want with a 'wrap' menu of popular functions that are applicable to the thing to the left of your cursor, so when you had
filter(some_filter, bar.baz())|
(with | representing your cursor) you could select `map` or `min` or whatever from the wrap dropdown and get
min(filter(some_filter, bar.baz()))|
for any given cursor position in python there are potentially multiple expressions ending there, in cases like
> it would certainly be interesting to program in a language ... where all iterables had to derive from a common base class, but it would be a very different language from python
You mean Ruby? :P
(All Ruby iterables mixin Enumerable, which is baaaaaaaasically inheritance.)
Or Rust! Everything that implements the Iterator trait gets access to all of Iterator's goodies, like map, filter, reduce, etc. Implementing Iterator just requires adding an associated Item type and a single fn next(&mut self) -> Option<Self::Item> method on your type.
Lifetimes and async are a massive pain in rust. But the trait system is a work of art.
I like Rust's struct + traits approach, because they avoid inheritance and encourage composition. I am sure people have built bad workarounds though to do inheritance anyway.
trait MyIterHelpers: Iterator {
    fn dance(&self) {
        println!("wheee");
    }
}

// And tell rust that all Iterators are also MyIterHelpers.
impl<I: Iterator> MyIterHelpers for I {}
The one caveat is that using it in a different context will need a use crate::MyIterHelpers; line, so the namespace isn't polluted.
> i wonder if you could implement the ide functionality you want with a 'wrap' menu of popular functions that are applicable to the thing to the left of your cursor
This is already implemented in IntelliJ for Java - they call it "Postfix Completion". For example you can type ".cast" after an expression to wrap what's before the cursor in a cast expression, so type "a + b.cast", then pick cast to "float", and pick how large a preceding expression you want to cast, and you can end up with "(float)(a + b)" and go from there. They have postfix completion that can extract expressions into variables, create if-statements and switch-statements from expressions, and so many more things that I wish I had when doing non-trivial Python coding in my IDE of choice (which is not by Jetbrains)...
> it would certainly be interesting to program in a language where conforming to a protocol (perhaps one that nobody had thought up yet when you wrote your class)
Not automatic, but you could use a decorator + the protocol as type annotation, I think
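Something along these lines, I'd guess: typing.Protocol plus the runtime_checkable decorator (the Quacks protocol and Duck class here are invented). It gives you structural isinstance checks, though it still doesn't hang new methods off conforming classes:

from typing import Protocol, runtime_checkable

@runtime_checkable
class Quacks(Protocol):
    def quack(self) -> str: ...

class Duck:
    # Has never heard of Quacks, but conforms structurally.
    def quack(self) -> str:
        return "quack"

print(isinstance(Duck(), Quacks))  # True, no inheritance involved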
In my mind this is a holdover from when Python was much more procedural/C-like and as a Python developer it's one of my pet peeves. (I can't count how many times I've started writing the name of a list, had to backtrack to stick a `len` in front, and then tap tap tap arrow keys to get back to the front.)
I suppose we really ought to blame Euler for introducing the f(x) notation 300 years ago... Very practical when the function is the entity you want to focus on, often less useful in (procedural) programming, where we typically start with the data and think in terms of a series of steps.
Some languages like D and Nim have "UFCS", uniform function call syntax, where all functions can be called as methods on any variable. Basically, it decouples the implicit association between method dispatch and namespacing/scoping semantics. Rust also has something they call UFCS, but it only goes one way (you can desugar methods as normal functions, but you can't ... resugar? arbitrary functions as methods). Python couldn't implement this without breaking a lot of stuff due to its semantics, but it is definitely a feature I'd like to see more of.
> In my mind this is a holdover from when Python was much more procedural/C-like
That never existed. Or if it did, it was long before any trace exists, and there's a trace from quite a way back: e.g. the first commit in which I can find the len() builtin (https://github.com/python/cpython/commit/c636014c430620325f8...) also has calls to file.read and list.append, and the first Python-level methods are created just a few commits later (https://github.com/python/cpython/commit/336f2816cd3599b0347...). Though there may be missing commits, this is only about 30 commits in, back when Python was an internal CWI thing (although nearly a year in, according to the official timelines of the early days).
Thanks for the thorough correction. I think I was making that assumption due to the semantics of the language, which suggests classes and methods being somewhat "bolted onto" a dict-based core. Unfortunately for me, it makes me all the more dissatisfied with the choice.
Thanks. I may have already read that post (or I just correctly backtracked the reasoning), as I was pretty much convinced namespacing conflict (the second bit of rationale) was a factor for the dunder-ing of methods, but I had no source so ultimately decided not to put it in.
Or just use any text editor ever and use Ctrl+arrow to jump word-wise. The most common efficiency issue in editing is editor literacy, not editor featureset.
Good programming editors are designed with the idea that as you master the program, you become more precise in telling it what to do. When editing programs, the author usually applies several navigational schemes to interpret the text of the program: by structure, by syntactical elements, and by the geography of the screen.
To expand on this: examples of navigating by structure include moving by token / expression / definition. Examples of moving by syntax would be search or "jedi" navigation (i.e. navigation where you enter a special mode that requires you to type characters that iteratively refine your search results). Finally, simply moving up / down / left / right by a certain number of characters is the "screen geography" way.
There's no way to tell which method is better, because they each apply better in different situations; however, the "screen geography" method usually ends up being the worst, because it's the most labor-intensive and requires the author to dedicate a lot of attention to achieve precision (moving exactly N spaces to the left and then exactly M spaces down is very easy to get wrong, and with larger N and M it becomes really tedious).
Navigation by word is only slightly better than navigation by character, and often falls into the "screen geography" kind of navigation. It's easy to learn, it's quite universal, and it doesn't require understanding the structure of the program or mastering better techniques (e.g. the "jedi jump"). That's not to say it should be excluded from the arsenal -- quite the opposite -- but a master programmer (in the sense of someone who writes programs masterfully) would be the one who's less reliant on this kind of navigation.
No. That's a wrong analogy. There's no way around having to navigate the text of the program back and forth, by character, by word, by statement, by definition and so on. This is bread and butter of people who write code.
If you complain about doing this, this is because you don't know how to perform the basic functions necessary to write code. Heuristically, this is because you are either using a bad editor or didn't learn how to use a decent one.
I.e. your complaint is more comparable to Amazon reviews coming from people who don't know how to use the product and then write something asinine, like that one about a loo brush that feels too rough when used in the capacity of toilet paper (though I believe that one was actually a joke inspired by similarly stupid but less funny reviews).
With python I'd decompose that one-liner into several variables for readability. That probably ends up using more memory than it would otherwise but I generally don't work on systems where that matters much.
Scala was really nice for this syntax when I used it for Spark.
Map and filter don't actually consume anything until they're used later, they produce iterables. So if you pulled them into their own lines they wouldn't consume (much) extra memory. Taking the original:
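Presumably the original was the nested expression from upthread; pulled apart into named (but still lazy) stages, it would look roughly like this, with the same placeholder names as before:

# Each of these is lazy: nothing is evaluated yet, so naming the
# stages costs essentially no extra memory.
kept = filter(some_filter, bar.baz())
transformed = map(some_op, kept)

# Only min() actually pulls values through the pipeline.
smallest = min(transformed)
smallest.foo()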
There is a niche use-case for the reverse order `(foo min map filter baz bar)`, which is, solving typed holes (you could refine the hole as like `_.foo()` although that wouldn't be interoperable with things like next token prediction).
But that's more of a math thing than an everyday coding thing, where dot chaining usually reads nicer.
Your point about ordering and readability really rang true for me. My way around this in Python is to separate the map and the reduce: do the map in one part with a list comprehension and the reduce in a second part on a new line.
I’ll wrap the whole thing in a named function as a way of describing what I’m doing and make it a closure if it’s used only once:
def f(bar):
    def smallest_baz():
        bazs = (
            some_op(b)
            for b in bar.baz()
            if some_filter(b)
        )
        return min(bazs)
    return smallest_baz().foo()
It's interesting, I completely agree with you, and it's a big reason I find Python irritating to write (compared to Groovy, Kotlin, Ruby, etc). However, there do seem to be a lot of people that dislike this method chaining style and will assert that the functional style is better in every way. But I just can't fundamentally agree that writing these as functions is as readable.
Even if you go far out of your way to format it similarly, it still forces you to do a lot of mental work to find the innermost starting point and then work out the sequence of operations backwards, e.g.:
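Reconstructing the elided example with the same placeholder names used elsewhere in the thread:

smallest = min(
    map(
        some_op,
        filter(
            some_filter,
            bar.baz(),
        ),
    )
)
smallest.foo()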
Agree to a big extent. Rust has lots of methods, because their traits work best or most habitually with methods. So I see a comparison of Rust x.min(y) vs Python min(x, y).
The Rust x.min(y) to me is so asymmetric. min(x, y) conveys the symmetry of the operation much better, x and y are both just elements. (And the latter is how it can be used in Python. In Rust, you can call Ord::min(x, y) to get the symmetry back, but it is less favoured right now for some reason.)
I would not recommend the default arguments hack. Any decent linter or IDE will flag that as an error and complain about the default argument being mutable (in fact, mutable default arguments are the target of many beginner-level interview questions). It's much easier to decorate a function with `functools.cache` to achieve the same result.
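For reference, the functools version is just this (using the usual toy fibonacci; functools.cache needs Python 3.9+, use lru_cache(maxsize=None) on older versions):

from functools import cache

@cache
def fib(n):
    # Results are memoized on the function, so the naive recursion
    # doesn't blow up into exponential time.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))  # 354224848179261915075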
Or, if you need a "static" variable for other purposes, the usual alternative is to just use a global variable, but if for some reason you can't (or you don't want to) you can use the function itself!
def f():
    if not hasattr(f, "counter"):
        f.counter = 0
    f.counter += 1
    return f.counter

print(f(), f(), f())
> 1 2 3
even 0 = True
even n = odd (n - 1)
odd 0 = False
odd n = even (n - 1)
I fed a C version of this (with unsigned n to keep the nasal daemons at bay) to clang and observed that it somehow manages to see through the mutual recursion, generating code that doesn't recurse or loop.
In Python you'd maybe think, smart, then my counter is a fast local variable. But you look up (slow) the builtin hasattr and the module global f anyway to get at it. :)
I looked at python dis output before writing this, you can look at how it specializes in 3.11. But there are also 4 occurrences of LOAD_GLOBAL f in the disassembly of this function; all four self-references to f go through module globals, which shows the kind of "slow" indirections Python code struggles with (and can still be optimized, maybe?)
You could scratch your head and wonder why, even inside itself, the reference to the function goes through globals. It's because, in the case of a decorated or otherwise monkeypatched function, it still has to refer to the same name.
I tend to dislike this method, as it's unclear what `or` returns unless you already know that `or` behaves this way. `x if x is not None else default` is cleaner in my opinion.
When you set an object as a default that object is the default for all calls to that function/method. This also holds true if you create the object, like that empty list. So in this case, every call that uses the default argument is using the same list.
I would hate to get an interview question where the very premise of it is wrong. Python does have mutable arguments, but so does Ruby.
def func(arr=[])
  # Look ma we mutated it.
  arr.append 1
  puts arr
end
Why calling this function a few times outputs [1], [1], ... instead of [1], [1, 1], ... isn't because Ruby somehow made the array immutable and hid it with copy-on-write or anything like that. It's because Ruby, unlike Python, has default expressions instead of default values. Whenever the default is needed, Ruby re-evaluates the expression in the scope of the function definition and assigns the result to the argument. If your default expression always returned the same object, you would fall into the same trap as Python.
The sibling comment is wrong too -- it is a local variable, or as much of one as Python can have, since all variables, local or not, are names.
Default arguments are evaluated and created when the function definition is evaluated, not when the function is called. This means the default value effectively lives as long as the function itself (typically the whole module), not just a single invocation of the method. This is what throws people off.
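The classic Python demonstration, for anyone who hasn't hit it yet (names are arbitrary):

def append_to(item, target=[]):   # the [] is created once, at def time
    target.append(item)
    return target

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2]  <- same list object as the first call

# The usual fix: use None as a sentinel and create the list per call.
def append_to_fixed(item, target=None):
    if target is None:
        target = []
    target.append(item)
    return target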
Using "yield" instead of "return" turns the function into a coroutine. This is useful in all sorts of cases and works very well with the itertools module of the standard library.
One of my favorite examples: a very concise snippet of code that generates all primes:
from collections import defaultdict
from itertools import count

def primes():
    ps = defaultdict(list)
    for i in count(2):
        if i not in ps:
            # i has no recorded factors, so it's prime; start marking
            # its multiples at i**2.
            yield i
            ps[i**2].append(i)
        else:
            # i is composite: push each recorded factor forward to its
            # next multiple (odd primes skip even multiples, which 2's
            # chain already covers).
            for n in ps[i]:
                ps[i + (n if n == 2 else 2*n)].append(n)
            del ps[i]
If you don't know it already, it is really worth looking into. I am a python dev with nearly a decade of experience and I knew generators, and yet this was still an eye opener.
Note that despite this being a python-specific slide deck, generators and iterators are also present in many other languages, including but not limited to Rust and JS.
The concepts matter more than the chosen language in this deck.
I learned a lot! Looks like I can apply this to a PHP trace/profile parser project, especially the pipelined parsing and the query language idea.
And don't forget "yield from" (same as yielding all the values of an iterable one by one, but it keeps the original generator hooked up: you can send data back into it if it is itself another generator!)
Anyone have good examples of how/when to actually use this? I've personally never interacted with or written a generator that expects to receive values.
Wound up writing a recursive generator (with some help from #python on IRC):
def flatten(items):
    for item in items:
        yield {k: v for k, v in item.items() if k != 'children'}
        if 'children' in item:
            yield from flatten(item['children'])
Thanks for the example, but I was more looking for something that uses "generator.send(...)". I definitely agree that yielding items out of generators is extremely useful, but not so sure on examples of generators that are sent values.
This is the basis of most older async frameworks (see: Tornado, Twisted). A while ago I put together a short talk on how to go from this feature to a very basic version of Twisted's @inlineCallbacks decorator.
Anything with feedback control. Updating a priority queue's weights, adaptive caching, adaptive request limiting, etc. Ironically it looks like HN itself rate limited me the first time I tried to reply lol
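A toy sketch of that feedback idea (everything here is invented): a running-average generator where the caller pushes samples in with send() and gets the updated average back.

def running_average():
    total = 0.0
    n = 0
    average = None
    while True:
        # Whatever the caller passes to send() shows up as the
        # value of the yield expression.
        sample = yield average
        total += sample
        n += 1
        average = total / n

avg = running_average()
next(avg)            # prime the generator up to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(3))   # 11.0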
I like using generators when querying APIs that paginate results. It's an easy way to abstract away the pagination for your caller.
import requests

def get_api_results(query):
    params = {"next_token": None}
    while True:
        response = requests.get(URL, params=params)
        json = response.json()
        yield from json["results"]
        if json["next_token"] is None:
            return
        params["next_token"] = json["next_token"]

for result in get_api_results(QUERY):
    process_result(result)  # No need to worry about pagination
Thanks! I tried to add mostly the stuff I don't encounter that often in blogs/tutorials etc. But I guess you are right. Generators, or at least the 'yield' keyword, are often misunderstood, and we can't emphasize them enough.
Just to clarify, I don't mean your article is bad or incomplete -- quite the contrary, I enjoyed it a lot. Generators are one of my favorite Python features and they're kind of underused, mostly because people simply don't know about them.
A couple more along the same lines:
- Metaclasses and type. (This is admittedly dark magic, but useful in library code, less so in application code)
Thanks a lot! Really appreciate it. Love the example! Haven't used the dunder __call__ yet (like many magic methods I guess), but that's a nice one!
I didn't have to use Metaclasses, either, though I have read about them, especially in Fluent Python. But I guess I belong to the 99% who haven't had to worry about them, yet :P
If an object is callable you can use it in places that might conventionally expect functions. The utility of that is very situational, though. I've only used it a handful of times myself over the years I've known and used Python.
It may also give you a "clearer" (in quotes because subjective) presentation for something you're trying to do.
I see it a lot in HuggingFace, and use it myself for classes that are used like a function, especially when the obvious method name is the verb form of the class name
processor = SomeProcessor.load("path/to/config")
# with __call__
processed_inputs = processor(inputs)
# less awkward than
processed_inputs = processor.process(inputs)
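For anyone who hasn't written one, the class side of that is just a __call__ method; here's a made-up SomeProcessor to illustrate (the load/prefix behaviour is invented):

class SomeProcessor:
    def __init__(self, prefix):
        self.prefix = prefix

    @classmethod
    def load(cls, path):
        # Stand-in for reading a real config file.
        return cls(prefix=">> ")

    def __call__(self, inputs):
        # Makes instances usable like functions: processor(inputs).
        return [self.prefix + item for item in inputs]

processor = SomeProcessor.load("path/to/config")
print(processor(["a", "b"]))  # ['>> a', '>> b']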
The only benefit is to the human, same as @property or even @dataclass.
Thanks for writing that up! I disagree though, I prefer the processor.process for clarity, and for not adding another way of doing things that regular methods already do.
I think I figured out that count(2) is from itertools? I'm new to python.
I think you could simplify the rest like so:
def primesHN():
    from collections import defaultdict
    from itertools import count
    yield 2
    ps = defaultdict(list)
    for i in count(3, 2):
        if i not in ps:
            yield i
            ps[i**2].append(2*i)
        else:
            for n in ps.pop(i):
                ps[i + n].append(n)
> I think I figured out that count(2) is from itertools?
It is. Itertools is a masterpiece of a module. It has a lot of functions that operate on iterators and will work both on standard iterables (lists, tuples, dicts, range(), count() etc.) and on your own generators. It forms a sort of "iterator algebra" that makes working with them very easy.
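For instance, the primes() generator above composes directly with the rest of the toolbox (islice and takewhile are both in itertools):

from itertools import islice, takewhile

print(list(islice(primes(), 10)))                    # first ten primes
print(sum(takewhile(lambda p: p < 100, primes())))   # 1060, sum of primes below 100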
> I think you could simplify the rest like so:
Sounds good, but with a caveat: you do need to call "del" at the end for memory deallocation purposes. The garbage collector isn't smart enough to know you won't be using those dictionary entries any longer. Technically the code still works, but keeping everything in memory defeats the purpose of writing a generator.
> can you explain how generators work with multiprocess
The best way to think of a generator is as an object implementing the iteration protocol. They don't really interact with concurrency, as far as multiprocess is concerned, they're just regular objects. So the answer is that it depends on how you plan to share memory between the processes.
> is ps internal variable unique for each Thread or same?
ps is local to the generator instance.
def f():
    x = 0
    while True:
        yield (x := x + 1)
>>> f()
<generator object f at 0x10412e500>
>>> x = f()
>>> y = f()
>>> next(x)
1
>>> next(x)
2
>>> next(y)
1
> is it safe to execute your primes() from different threads?
For this specific generator, you would run into the GIL. More generally, if you're talking about non-CPU-bound operations, you need to synchronize the threads yourself. It's worth looking into asyncio for those use cases.
Calling a function that contains a yield will simply return a generator object, which holds the information about the next value to produce and how to continue the function's execution. That's why you typically consume functions that yield things inside loops or with list(...).
If you run it from different threads I guess it will be the same as calling the function multiple times, it will return a new started-from-the-top generator.
In this example, calling sum() creates a generator and returns it. Say g = sum(). If you share g between threads, they will all use the same generator object! If you call sum() separately per thread, they will be different generators.
If you try to send g to a different process, you will get an error, because it doesn't serialize.
I know it's a really minor point, but in a blog post about Python (rather than just one that is using Python), it kind of bothers me to see "non-Pythonic" code style,
As someone learning Python, but having worked with other languages, I think your second example is better as it reads more like English. I think that simplicity actually ends up much more rewarding when it comes to reading code.
Agree. Using "not in" can also theoretically make certain checks faster (e.g. testing negative presence in a hash-based data structure can bail out without walking the collision chain if the initial hashed location does not have an element).
- none of these functionalities are "overlooked", this is pretty basic python
- for fibonacci you have a decorator for memoization (functools cache / lru_cache)
- you don't need to use parenthesis for a single line "if"
You are very much right that a lot of it is pretty basic knowledge. From my experience though, a lot of Python developers don't take the Python docs or tutorial as their first resource, and quite a few developers I've met did lack some of the knowledge I mention in the article.
You are right about the fibonacci one, I thought I did refer to another article where I mention lru_cache as well :) But I'll double check.
Good one about the parentheses! I'll post an update soon
At that point we're just disagreeing about 'basic' vs. 'a bit below intermediate'.. idk, we'd at least have to agree on how many levels the model has.
Fwiw I also thought it was pretty regular stuff, and then arcane library functions you've either needed or you haven't. Also, that's a generator, not a list comprehension.
One man's basic is another man's low intermediate... but I agree that none of these seem overlooked to me. They are pretty basic things once you get past the first few chapters of your first python book.
Even though most of the stuff OP writes about is hardly worth an article, I see the older features used a lot, and the newer ones less so, but still often enough...
This article reads to me as if it was written by someone learning Python, perhaps in their 3rd-4th month, when they finally decided to open the documentation / some existing project code instead of implementing calculators and animal class hierarchies...
That is definitely a neat one! If you are ok with it, I might add that one. I just updated the article already with some of the great comments and tips I received over here.
> The underscore _ can be used as a throwaway variable to discard unwanted values:
So can any other variable, using underscore is just a convention to make it obvious that you're not planning to re-use it (it doesn't get GCed more aggressively or anything).
Similarly, private methods being prefixed with an underscore is also just a convention, you can access them from anywhere.
However, double underscores are used for magic attributes and name mangling for class attributes, which are interpreted differently! (See: https://stackoverflow.com/a/1301369)
I don't think the point about splat unpacking really is a correction. Unpacking always requires that the iterable has enough values to assign to the specified variables, this has nothing to do with the use of *middle.
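To make that concrete (the list contents are arbitrary):

first, *middle, last = [1, 2, 3, 4, 5]
print(first, middle, last)   # 1 [2, 3, 4] 5

first, *middle, last = [1, 2]
print(first, middle, last)   # 1 [] 2  -- the starred name can be empty

# a, b, c = [1, 2] fails the same way a, *rest, b, c = [1, 2] does:
# there simply aren't enough values to unpack.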
Probably one of the benefits I gained from writing JavaScript before ES5 (although I have worked with many languages, I've only used a few that were dynamic - PHP, JS, and old VB). I write my functions as early as possible, having remembered hoisting rules from JavaScript (and trying to only rely on OOP with Python where it naturally makes sense).
Looking at your Julia example, this seems much more friendly and less surprising and error-prone.
> Python arguments are evaluated when the function definition is encountered. Good to remember that!
I would never try to exploit this behavior to achieve some kind of benefit (avoiding max recursion). Any tricks you try to do with this are almost definitely going to cause bugs that are very difficult to track down. So don't be too clever here.
Yeah. I was really surprised to see this as a feature to be used rather than a gotcha. I've seen it more as gotchas, as in actual bugs introduced because of this behavior, and never as a feature until now. I can see why he thinks it's useful though and, maybe within his specific context, it is. That said, even for his example, I think he would have been better off using https://docs.python.org/3/library/functools.html#functools.c...
You are right about that, perhaps it is good to mention it as a "gotcha". Or I could have used a better title. I do think, though, that it is good practice to know this stuff. About the cache decorator: I did link to another article where I discuss lru_cache and cache :)
Before I saw your comment, I had "overlooked" that these were presented as beneficial features, rather than just curiosities. As someone just learning Python, but familiar with other languages, I can only hope that if I start using Python in production with other developers they take the most obvious route (or use a comment as to why they would be relying on this type of behavior).
I chose to learn Python because it seemed to be the easiest to read, which to my mind meant working in a team would lead to easier discovery and understanding. Then I see articles like this, and wonder if I'll have a lot of footguns to watch out for where the code isn't as clear as it seems.
I fully agree on this! Like the Zen of Python says: explicit is better than implicit. You should not expect your teammates to know all the "special" behavior of the language, and if you can write it more straightforwardly, you should in that case. Guess Kyle mentions the same about JavaScript in his books. You are right about that. Perhaps that would be a nice addition to the post. I do believe, though, that it is good to be familiar with this behavior in case you ever come across such a situation.
- `repr` often outputs valid source code that evaluates to the object, including in the post's example: running `datetime.datetime(2023, 7, 20, 15, 30, 0, 123456)` would give you a `datetime.datetime` object equivalent to `today`.
- Using `_` for throwaway variables is merely a convention and not built into the language in any way (unlike in Haskell, say).
> Because the language is so easy to learn, many practitioners only scratch the surface of its full potential, neglecting to delve into the more advanced and powerful aspects of the language which makes it so truly unique and powerful
We have definitely found this to be true in hiring. Many people’s Python knowledge seems to just be surface deep.
Some of these features lie on the border of the uncanny valley where languages like Ruby and "vanilla Javascript" live, and are not compatible with the principle of least surprise or even the Zen of Python. I don't write too much Python anymore, but when I do I keep it simple and explicit.
I find a lot of Python like that. It's a simple language to get started in, but an incredibly complex language to try to grasp more than skin deep. Maybe not C++ complex, but more than I expected.
It has some wild features and crazy syntax and if you know it, it's probably awesome, but I too like to keep it mostly simple and obvious.
I agree. Someone else here also mentioned that they prefer code that is easy to read over code that uses a lot of "unfamiliar functionality," let's call it that. And I do agree; Kyle mentions the same thing if I remember correctly when it comes down to JavaScript. It is better not to expect your colleagues or other developers to know the ins and outs of the language as well. If one way is 10x easier to understand, just stick with that.
But as you said: if you know it, it's probably awesome. In my opinion, it never gets boring to discover new things in Python, and it does make you a better Python developer. Knowing what and when to apply certain knowledge is where your experience comes in.
I'd argue that your original approach is actually better than your new approach.
Using a list comprehension, such as your original approach, is pretty easily understood by anyone writing python and is easy to follow, it is also quite terse.
Your recursive unpacking zip thing is much harder to understand and read. This reminds me of the type of stuff you find in the codebase years later when the person who wrote it is long gone and you find a comment next to it that says:
# No idea why this works, but don't touch it
One of the problems I have with python is that there are a million super creative ways to do stuff, especially using less known parts of the language. People love to get super creative with it, but usually the simplest solution is actually the best one, especially when working on a team.
In your example above, you aren't even saving any real space. Both approaches can be done inline, the list comprehension is maybe a few extra characters. You're not really saving anything, just making it harder to read and maintain by others.
When I moved from a company that wrote in Python to one that wrote in Golang, I found that the restrictions that Golang offers is a huge benefit in a team. Because you don't have access to all these crazy language components that python has, the code written in Go would be almost identical regardless of who wrote it. Of course everything in Golang is far far more verbose than Python, but I actually found it 100x more maintainable.
In the python codebase it was very easy to tell who wrote different parts of a codebase without looking at the git blame, because there was almost a "voice" with the style of writing python. But in Golang it was more restrictive which meant that the entire codebase was more cohesive and easily to jump around.
The need actually comes up a lot to transpose a list of lists. That zip can do it is not hard to visualize and it's an idiom worth learning. If it still seems unclear, you can name things to help:
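Something like this, presumably (the pairs data is made up; the point is just that zip(*rows) regroups by position):

pairs = [("a", 1), ("b", 2), ("c", 3)]

# Unzip / transpose: the splat feeds each pair to zip as a separate
# argument, so zip regroups firsts with firsts and seconds with seconds.
letters, numbers = zip(*pairs)
print(letters)  # ('a', 'b', 'c')
print(numbers)  # (1, 2, 3)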
But anyway, yeah, tastes differ, it's fine if we disagree. I do agree that Python has gotten uncomfortably complex. But this is a very old feature from simpler times and does not add any syntax or metaprogramming features, it's just an already needed function.
Not saying I disagree with you, but I do want to note that the specific example of unzipping a list using `zip` has been in the official zip docs [1] as long as I can remember, and as such, should be commonly understood by Python developers.
Numpy has a lot of these shortcuts that are quite opaque. For example np.r_ and np.c_
This one can be explained as "equivalent to np.linspace(-1, 1, 5)", i.e 5 evenly spaced points between -1 and 1. Normally the step size is an integer but with a complex "step" it switches the meaning from step size to number of equidistant points.
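Presumably the snippet under discussion was something like this (requires numpy):

import numpy as np

# A complex "step" (5j) means "5 equally spaced points" instead of a step size.
print(np.r_[-1:1:5j])         # [-1.  -0.5  0.   0.5  1. ]
print(np.linspace(-1, 1, 5))  # same result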
Repr prints source code that will (often) give you an equivalent object. I would be highly surprised if it got you the same object instance: equal but not identical (== but not `is`).
import random
some_value = 9 # return a number between 0 and, including, 100
if below_ten := some_value < 10:
    print(f"{below_ten}, some_value is smaller than 10")
Random isn't used in this snippet, but more importantly, why would you assign the value to below_ten if the point is just to print it? Why not just print some_value?
Even in the next example of the walrus operator - it is extremely contrived:
if result := some_method():  # If result is not Falsy
    print(result)
I've been coding for most of my life and I can't believe some people would choose some of these tricks when python has much simpler syntax for most of these.
The first, *_, last trick for example would be particularly obnoxious to encounter. The first element is my_list[0], last is my_list[-1]. Dead simple, way easier to understand at a glance.