Sometimes, even veterans can be stumped by their language of choice. And not by the latest additions, but by the core language itself. This is such a story, about an obscure corner of Python.
It starts with a legacy [1] project
Legacy as in ‘still not fully migrated from Python 2.7’ – although our topic is only valid in Python 3. There was this module which anonymized a database: overwrote real names, set passwords to random strings, deleted transaction data, etc. It was split into separate responsibilities quite well, with each aspect in a separate method (even if there was some evident copy-pasting), in a common class. Another class was used as configuration for that one, pulling in database config and exposing some methods to help select objects to clean up.
On opening that class, mypy fired up: ‘Method must have at least one argument’.
What? How? Well, surely it does have one? 🙀
The code
What follows is not the original code, but an example with the same issues.
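A minimal sketch of such a class. Only the names Config and remove_django_session_dates come from this story; the method body and its return value are hypothetical stand-ins, not the project's code.

```python
class Config:
    """Stand-in for the configuration class; the body is hypothetical."""

    def remove_django_session_dates():  # note: no `self`, no decorator
        # Hypothetical body: the real method helped select objects to clean up.
        return ["django_session.expire_date"]


# Invoked "statically", straight on the class.
print(Config.remove_django_session_dates())
```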
Surely this is invalid
After all, instance methods always take self as their first argument. It’s mandatory, unless we attach the @staticmethod decorator, which makes that method callable both on the class and on an instance. If we instead want it to be invoked on the class, not the instance, we use @classmethod – but even then it takes at least one argument (canonically named either cls or klass).
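To make the three conventions concrete, here is a small sketch. The Example class and its method names are made up for illustration; the point is only what each kind of method receives implicitly.

```python
class Example:
    def instance_method(self):
        # `self` is the instance the method was called on.
        return self

    @staticmethod
    def static_method():
        # Nothing is passed implicitly.
        return "static"

    @classmethod
    def class_method(cls):
        # `cls` is the class, even when called through an instance.
        return cls


obj = Example()
print(obj.instance_method() is obj)
print(Example.static_method(), obj.static_method())
print(Example.class_method() is Example, obj.class_method() is Example)
```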
At this point I asked my co-worker whether the code worked for him. ‘No issues’, he answered. Still confused, I went on to add the missing self arguments so that the type checker would stop drawing squiggly lines at me.
This turned out not to be a good idea, as the code is invoked statically, as in Config.remove_django_session_dates(). And it does work without @staticmethod or @classmethod. But how?
Niche, but valid
Turns out this is a consequence of how Python’s method calls work. To quote from the docs:
If you access a method (a function defined in a class namespace) through an instance, you get a special object: a bound method (also called instance method) object. When called, it will add the self argument to the argument list.
What this tells us is that self is not a syntax concept. Nor is it a mandatory argument. The name is canonical, but the first argument may actually have any name you like – even this, if you want a Java-like feeling. The argument is only supplied when calling through an instance: Config().remove_django_session_dates() indeed fails with a TypeError.
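A quick way to see this under Python 3. The method bodies and the describe helper are hypothetical, written only to exercise both call paths.

```python
class Config:
    def remove_django_session_dates():  # no `self`
        return "called without an instance"

    def describe(this):  # first argument named `this` instead of `self`
        return type(this).__name__


# Through the class: a plain function call, nothing is prepended.
print(Config.remove_django_session_dates())

# Through an instance: the instance is prepended as the first argument,
# which this function does not accept.
error = None
try:
    Config().remove_django_session_dates()
except TypeError as exc:
    error = exc
    print(type(exc).__name__)

# Any name works for the first argument.
print(Config().describe())
```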
Therefore, if we define a method like this and never call it on an instance, it’s perfectly fine! It’s now like half a @staticmethod (it works on the class, but not on an instance) and half a @classmethod (it does not receive the class itself as an argument).
But only in Python 3
Let’s look closely at how Python 2.7 and 3.7 (or later) handle this.
Unbound methods do not exist in Python 3 [2]. In Python 2, a function accessed through its class is an unbound method, which still requires an instance to be passed as the first argument, so our half-@staticmethod trick does not work. But in Python 3 they are replaced by regular functions, which do not place such constraints.
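The difference can be checked from Python 3 itself. This sketch uses the standard types module; the method body is a hypothetical stand-in, and the Python 2.7 behaviour is described in a comment only, since the snippet is meant to run on Python 3.

```python
import types


class Config:
    def remove_django_session_dates():
        return "ok"


# Accessed through the class, the attribute is an ordinary function...
print(type(Config.remove_django_session_dates) is types.FunctionType)

# ...and only access through an instance produces a bound method.
print(type(Config().remove_django_session_dates) is types.MethodType)

# Under Python 2.7 the class-level access would yield an unbound method
# instead, and calling it without a Config instance raises TypeError.
print(Config.remove_django_session_dates())
```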
It’s valid, but is it legal?
I can think of some reasons to write functions like that. The most important one would be using classes as namespaces (instead of modules or other choices), which isn’t strictly recommended, but is not invalid either. But I’d stay away from this construct: even if allowed, it looks invalid, it’s flagged by mypy (no objections from pylint, though!), and it behaves confusingly by failing on an instance. The proper way to do this is with either @staticmethod or @classmethod.
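A sketch of the decorated variant. The SESSION_TABLE attribute, the session_table helper, and the return values are assumptions made up for this example; only the class and first method name come from the story.

```python
class Config:
    SESSION_TABLE = "django_session"  # hypothetical setting

    @staticmethod
    def remove_django_session_dates():
        # Needs neither the instance nor the class.
        return "ok"

    @classmethod
    def session_table(cls):  # hypothetical helper
        # Needs the class, so it takes `cls`.
        return cls.SESSION_TABLE


# Both forms now work on the class and on an instance alike.
print(Config.remove_django_session_dates(), Config().remove_django_session_dates())
print(Config.session_table(), Config().session_table())
```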