This oft-repeated quote is always relevant on Python metaclass posts:
“[Metaclasses] are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).”
Tim Peters, Inventor of the timsort algorithm and prolific Python contributor
I've run across people in three different companies who decided to use them for no good reason at all.
I was only around for one of them while he did it but it seemed pretty clear that he was insecure and trying to prove that he was a more experienced programmer by using the most advanced features.
Yes very much this. I've worked on two projects in the last year that used metaclasses. Neither needed them in the least and they simply added noise and complication.
In one case I think the author did believe he needed them, or was going to. He didn't, but I think he believed it was the best way to go about the inheritance he was doing.
The other case was exactly what you say. The person wanted to show how advanced their knowledge of Python was. The funny part about the second case (which included every advanced or new Python feature under the sun, including the walrus operator shoehorned in for good measure) is that the code simply didn't work at all. It ran, but it was wrong. The author was so busy demoing his advanced knowledge that he forgot to make it work, and was unable to debug it in a timely fashion due to his overuse of abstractions and features. So less "advanced" me was brought in to make it work. Which I did.
This is a cool overview, and I certainly learned new things about the Python language from it. Thanks for posting!
We do lots of Python metaprogramming at Mito [1], but generally avoid all of this fancy Python fluff to get it done. Specifically, we avoid metaclasses, invisible decorators, etc. Instead, we take a much simpler approach: Python code that literally populates a template .py file and then writes it to the correct location in our codebase.
As a concrete example: we're a spreadsheet, so we let our users transform data in a bunch of different ways - adding a column, writing a formula, deleting some rows. Anytime I want to add a new transform (say, encoding a column), I tell the metaprogramming package: python -m metaprogramming step --name "Encoding A Column". It will ask me some questions about the parameters and their types, and then write most of the 4-6 boilerplate Python and TypeScript files I need automatically! You can see it here [2].
This is still metaprogramming (it’s certainly code that writes code). But the code you end up with at the end of the day is very simple Python code that is extremely easy to understand / maintain long-term.
I’ll pass on the fancy stuff for now. Thanks though!
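A minimal sketch of that kind of template-driven generation (the names, fields and file layout here are hypothetical, not Mito's actual tooling):

from pathlib import Path
from string import Template

STEP_TEMPLATE = Template('''\
class ${class_name}Step:
    """Auto-generated step: ${display_name}."""

    def __init__(self, ${params}):
${assignments}
''')

def generate_step(display_name, params):
    class_name = display_name.title().replace(" ", "")
    assignments = "\n".join(f"        self.{p} = {p}" for p in params)
    source = STEP_TEMPLATE.substitute(
        class_name=class_name,
        display_name=display_name,
        params=", ".join(params),
        assignments=assignments,
    )
    # Write the generated module out; a real tool would pick the right
    # location in the codebase and emit the TypeScript side as well.
    Path(f"{class_name.lower()}_step.py").write_text(source)

generate_step("Encoding A Column", ["column", "encoding"])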
I personally consider code generation an anti-pattern in Python. With its dynamic nature and the two-step execution model, Python is essentially its own macro language, and any generation can be done at runtime.
Unfortunately, PEP 484 changed all that. Now you have to use static generation for basically anything that used to be done at runtime, if you want it to have any kind of compatibility with mypy, Pylance, etc.
For example: people using your library need Pylance type hints to show up for all your public methods, but those public methods are in fact generated as proxies for some other object. This was trivial as a dynamic runtime thing before; now it must be statically generated. I can share examples if you'd like.
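For instance (a sketch with made-up names), this kind of runtime-generated proxy works fine but gives mypy/Pylance nothing to go on, which is why it now tends to get statically generated instead:

class Engine:
    def start(self) -> None:
        print("started")

    def stop(self) -> None:
        print("stopped")

class Car:
    """Forwards a fixed set of Engine methods, generated at import time."""
    def __init__(self) -> None:
        self._engine = Engine()

# Perfectly legal Python, but mypy/Pylance cannot see that Car grows
# .start() and .stop() here, so users get no completion or type checking.
def _make_proxy(name):
    def proxy(self, *args, **kwargs):
        return getattr(self._engine, name)(*args, **kwargs)
    proxy.__name__ = name
    return proxy

for _name in ("start", "stop"):
    setattr(Car, _name, _make_proxy(_name))

Car().start()  # runs fine; a type checker flags it as an unknown attribute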
Like let's say you want to add a new required property to every type of something... like you want every encoding to have an accepts_tainted_value property. You don't want a default because you want to be forced to think through every case.
Can you regenerate the code? If so, are you keeping the "real" object specification in some other structure? Or do you just change a superclass to make the property required then fix the generated source based on the errors you get?
Thanks for sharing! Instead of using metaclasses for subclass registration, though, please just use __init_subclass__. It'll often be simpler unless you have a complex use case.
Yes, and also for things other than subclass registration there are alternatives like the descriptor protocol or class decorators, which are likely to be easier to use, debug and understand.
And a metaclass can be any callable, not just a class. It's usually better to use a simple function.
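For reference, a minimal sketch of subclass registration with __init_subclass__ and no metaclass (the class names here are made up):

class Transform:
    registry = {}  # maps a short name to each registered subclass

    def __init_subclass__(cls, name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs automatically for every subclass as it is defined.
        Transform.registry[name or cls.__name__.lower()] = cls

class AddColumn(Transform, name="add_column"):
    pass

class DeleteRows(Transform):
    pass

print(Transform.registry)
# {'add_column': <class '__main__.AddColumn'>, 'deleterows': <class '__main__.DeleteRows'>}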
Python is easy not because it prevents people from doing complicated things, but because you can be productive with only a few simple things.
Most Python programmers use 10% of the language, the same 10% that is described in most tutorials, because that's enough, and they don't even know the rest exists.
This makes for a very smooth but long learning curve, which allows you to enjoy Python in the early years yet keep getting a kick out of it after 15 years.
I hear a lot of people raising the same kind of concern you do in comments, but in the field I never hear it from people actually using the language.
That "smooth learning curve" applies when you are learning the language, use it by yourself or work in the ML/Data science industry.
Once you get thrown into a big project that makes heavy use of type hinting (plus its whole ecosystem: mypy et al.), object-oriented design and all those hidden things... you realize how much you really don't know.
And that 10% is way too small. To be productive in Python, you need to know at least 50%, or you will be reinventing a lot of existing things. Poorly.
Using a different language isn't going to stop the vertigo of being thrown into the deep end of an existing project without a mentor to guide you through it.
In any big project you will have to learn that many things, whatever the language. If the language is rich, you'll learn the language; if not, you'll learn the project's patterns. It's not specific to Python.
But 99.9999999% of projects won't implement metaclasses.
Most won't even implement decorators, context managers or generators. Use them, sure.
The REAL power comes from the libraries. In the Olden Days, we would call these 'subroutines' -- libraries contain the gussied-up subroutines that make it possible to do amazing things in Python with that "10% of the language" we use.
I read your "voodoo magic in a programming language" to mean "behind the scenes in the implementation of the programming language", not "for a programmer to understand while using the programming language".
Ran into this face-first, just yesterday. I'm converting a rather old, rather big API (~400 endpoints) from Python 2.7 into 3, and apparently in Python 2, _hasattr(something, someattribute)_ just returns False if attribute access throws an exception! Specifically, if you have
class A():
    @property
    def a(self):
        raise Exception('go away')

a = A()

then hasattr(a, 'a') will return False in Python 2.7 (and throw as expected in Python 3).
A true WTF moment, the tests and the API have been slightly broken for years, without anyone noticing.
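If anyone else hits this while porting, one option is to make the old behaviour explicit instead of leaning on hasattr (helper name made up; Python 3's narrower behaviour is usually what you want, so treat this as a porting shim):

def has_usable_attr(obj, name):
    """Mimic Python 2's hasattr: treat *any* exception during access as 'missing'."""
    try:
        getattr(obj, name)
        return True
    except Exception:
        return False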
This is a rather unusual problem and I would call this design flawed. Defining useless methods that only exist to raise an exception is in my opinion a waste of space, both virtual and textual.
Obviously, this is just a minimal demo (I don't name my classes A!); in the actual codebase there's a complicated calculation that throws up under some circumstances.
What an odd thing to say. This article is not for beginner Python programmers, and you don’t need to know anything in this article to use Python. I know people who refuse to use decorators because they seem difficult, and yet they are able to make very useful applications.
I’ve been writing Python professionally for over 20 years, and I’ve needed to use the things here once. It was highly encapsulated inside one module of an ORM-type project, where other code wanted to use it without having to know anything about its implementation details.
You can do these things in the extremely rare cases that you must, but other than that you shouldn’t.
I’m surprised to hear your opinion on this. I have around 12 years of python experience and I find metaprogramming to be, by far, the most important part of Python. Without the extensive ability to rewrite underlying functionality in a way that was approachable to both novice and adept users, I don’t think we’d have seen a widespread adoption of the language to begin with.
I guess I don't often touch metaclasses either, although after reading this article it gave me some ideas on how I might better implement object validation patterns. That being said, I have also recently become acquainted with pydantic, which does take care of some of that.
Pydantic’s a great example of a good use of metaclasses. I’ve seen them abused in places where there were far simpler ways to accomplish the same goal, and promptly ripped that code out.
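As a rough illustration of the idea (and only that; this is not how pydantic is actually implemented), a metaclass can collect the annotated fields and a base class can check them at construction:

class ValidatedMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Remember the declared field types from the class body's annotations.
        cls._fields = dict(namespace.get("__annotations__", {}))
        return cls

class Model(metaclass=ValidatedMeta):
    def __init__(self, **kwargs):
        for field, expected in self._fields.items():
            value = kwargs[field]
            if not isinstance(value, expected):
                raise TypeError(f"{field} must be {expected.__name__}")
            setattr(self, field, value)

class User(Model):
    name: str
    age: int

User(name="Ada", age=36)        # fine
# User(name="Ada", age="36")    # would raise TypeError: age must be int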
For a language that is actually "easy to learn and use", look at Lua[1].
It's so easy to learn & use that children manage to be productive with it (the Roblox community).
It's still powerful though. With few enough primitives to remain simple & understandable, it has just enough primitives to build anything. People have written web frameworks, window managers and video games in Lua.
It's got fewer pitfalls, less "voodoo magic" and less advanced weirdness than any other language I know. You can learn 90% of the language from its Wikipedia page. Every time I encounter some of Python's internal weirdness, I wish Lua had been the scripting language of the 2000s instead. Unfortunately, Python has a vast collection of libraries available that keeps us using it instead.
That's not uncommon: when you make something easy* to learn and use, you obfuscate certain more "advanced" things to streamline the onboarding experience, and when you want to break out of that streamlining you have to learn to circumvent those "stops", which often ends up being pretty non-ergonomic. It compounds, too, because the people developing the language have a bias towards what it currently offers, so the "unorthodox" side keeps getting set aside.
*and by easy there I mean streamlined and more intuitive on most expected "common" tasks than the alternatives
Most Python programmers never need to worry about the "voodoo magic". But for writing tools and libraries that can nicely encapsulate a desired behavior and make it simple for other programmers, Python's "voodoo magic" is great.
Python is an amazing language. Its basics are very easy to learn, and I'd like to think it doesn't feel like incomprehensible sorcery to the beginner.
But for all that, it's not a toy language, as this example makes clear.
This is why any CS curriculum worth its salt should provide a summary of / introduction to Lisp and Smalltalk at or very near the start, so that in the second and third year of study or shortly after graduation, any encounters with so-called "voodoo" will elicit an "ah yes, sounds like CLOS MOP" response or something similar, rather than wide-eyed excitement leading to uninformed evangelism.
(Mine didn't have it, and I've been ruing it, and compensating for it, ever since.)
Python is powerful and flexible enough that you don't need metaprogramming.
(I mean this quite literally: I sincerely doubt that there is any code in Python using metaclasses etc. that wouldn't be more clear and maintainable if rewritten in "plain old" Python without them.)
(With the caveat that I'm not including "art" projects, I'm talking about working production code.)
(In case it's not clear, this is one of those "Prove Me Wrong" scenarios... If you think you have a counter-example to my claim, please don't keep it to yourself, "Shout it out so the whole theatre can hear you!")
The example shows that Django models use metaclasses heavily.
So I'm assuming if you needed to extend django models, you'd have to do it via metaclasses.
Too easy: Django metaprogramming is one of the things that has bitten me IRL. A junior dev was pulling their hair out one day trying to subclass a Django HTML form object. They couldn't do it because of Django's cowboy metaclass and had to wrap it in a "factory" and modify it before returning it.
Django using metaclasses is exactly the problem I'm warning of...
You don't need metaclasses for data models nor for HTML forms. Nothing Django does requires metaclasses, and it's actively beginner-unfriendly to use them.
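To make that concrete, here's a rough sketch (hypothetical code, not Django's) of collecting declarative fields with __set_name__ and __init_subclass__ instead of a metaclass:

class Field:
    def __set_name__(self, owner, name):
        # Called automatically when the field is assigned in a class body.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

class Model:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Collect every Field declared on the subclass, roughly what a
        # declarative model metaclass does with its fields machinery.
        cls._fields = [n for n, v in vars(cls).items() if isinstance(v, Field)]

class Person(Model):
    name = Field()
    email = Field()

p = Person()
p.name = "Ada"
print(Person._fields, p.name)   # ['name', 'email'] Ada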
It's interesting how differently two similar dynlangs, Python and Ruby, decided to offer metaprogramming. Ruby decided to offer a very simple way of evaluating code at construction, but Python has you define custom metaclasses. One sees it as a routine way of extending the language, and the other treats it as a "black art".
Metaprogramming done wrong, no matter the mechanism, is very painful. So, I'm not sure which way is better.
It’s interesting that meta programming, one of the main selling points of lisps, is actually pretty common and possible in other languages as well. Lisp’s “code = data” achieves meta programming in a straightforward way, but this link shows that it’s not necessary. In fact, Python’s way might even be better because the language sorta “gets outta the way” because it’s really darn simple.
The article however, does not show anything comparable to lispy metaprogramming.
I do not see a single new keyword being defined. I do not see anything changing the order of evaluation. Only to give 2 examples.
Recently I implemented a new kind of "define" in a Scheme, which allows specifying contracts for a function. I don't think such a thing is possible using the tools shown in the article. As such, the things shown in the article are not really that meta, but rather parts of the Python language already. It is not adding anything to Python that was not there; it is just usage of concepts Python already has.

This is conceptually different from transforming source code to other source code, which then creates a new concept in the language itself. It is not like I can mold Python into whatever I want; rather, I stay in the corset of the language's facilities. Which of course is not as nice as lispy languages with a good macro system when it comes to creating DSLs and other conveniences.
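For comparison, the closest plain-Python approximation is probably a decorator, which rather proves the point: it wraps calls at runtime inside the existing syntax instead of defining a new binding form (a sketch, with made-up names):

import functools

def contract(pre=None, post=None):
    """Check a precondition on the arguments and a postcondition on the result."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ValueError(f"precondition failed for {fn.__name__}")
            result = fn(*args, **kwargs)
            if post is not None and not post(result):
                raise ValueError(f"postcondition failed for {fn.__name__}")
            return result
        return wrapper
    return decorate

@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def sqrt_floor(x):
    return int(x ** 0.5)

print(sqrt_floor(10))   # 3
# sqrt_floor(-1)        # would raise ValueError: precondition failed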
Python recently got a form of structural pattern matching.
Can that be removed from Python, and then reimplemented only in Python? I mean other than by writing a .py file to .py file text filter?
Can it be backported to a prior version of Python?
Python's release history is full of "can't use this syntax if you're not on at least x.y.z version"; it continually proves that it doesn't have metaprogramming on a level that could be used to develop the language itself.
When the purveyors of Python decide to add some new syntax and semantics, they eschew Python metaprogramming and dive straight into C.
The SaltStack module starts out empty except for a function that runs at module load time. The initialization function queries PowerShell for all AD FS-related cmdlets and creates Python wrapper functions for them. It even copies the cmdlet's help to the function's docstring.
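The pattern looks roughly like this (a sketch with made-up names, not the actual SaltStack module):

import subprocess
import sys

def _list_cmdlets():
    # The real module queries PowerShell for the AD FS-related cmdlets;
    # hard-coded here so the sketch stands alone.
    return ["Get-AdfsProperties", "Set-AdfsProperties"]

def _make_wrapper(cmdlet):
    def wrapper(*args):
        # Shell out to PowerShell and return its stdout.
        return subprocess.run(
            ["powershell", "-Command", cmdlet, *args],
            capture_output=True, text=True,
        ).stdout
    wrapper.__name__ = cmdlet.replace("-", "_").lower()
    wrapper.__doc__ = f"Auto-generated wrapper for the {cmdlet} cmdlet."
    return wrapper

def _generate_wrappers():
    # Runs once at module load: attach one wrapper function per cmdlet.
    module = sys.modules[__name__]
    for cmdlet in _list_cmdlets():
        fn = _make_wrapper(cmdlet)
        setattr(module, fn.__name__, fn)

_generate_wrappers()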
As someone who just dabbles in python for small scripts, this is fascinating to me.
You can use some metaprogramming to create very clean interface points in python!
I always wondered how Django did so much with such clean, readable interfaces for end users.
“[Metaclasses] are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).”
Tim Peters, Inventor of the timsort algorithm and prolific Python contributor
https://www.oreilly.com/library/view/fluent-python/978149194...
https://en.m.wikipedia.org/wiki/Tim_Peters_(software_enginee...
I would then also concur with the other comment that if you “know” you need metaclasses, 99% of the time you actually only need __init_subclass__.
A lot of online literature about Python metaprogramming misses out __init_subclass__, as it was only added in Python 3.6 via PEP 487.
https://peps.python.org/pep-0487/