Originally published in my blog: https://sobolevn.me/2021/06/typeclasses-in-python
Today I am going to introduce a new concept for Python developers: typeclasses.
It is a concept behind our new `dry-python` library called `classes`.
I will tell you in advance that it will look very similar to what you already know and possibly even use. Moreover, we reuse a lot of existing code from Python's standard library. So, you can call this approach "native" and "pythonic". And it is still going to be interesting: I am showing examples in 4 different languages!
But, before discussing typeclasses themselves, let's discuss which problem they solve.
Some functions must behave differently
Ok, this one is a familiar problem to all of the devs out there.
How can we write a function that will behave differently for different types?
Let's create an example. We want to `greet` different types differently (yes, "hello world" examples, here we go).
We want to `greet`:
- `str` instances as `Hello, {string_content}!`
- `MyUser` instances as `Hello again, {username}`
Note that `greet` as a simple example does not really make much "business" sense, but more complicated things like `to_json`, `from_json`, `to_sql`, `from_sql`, and `to_binary` do make a lot of sense and can be found in almost any project.
But, for the sake of implementation simplicity, I'm going to stick to our `greet` example.
The first approach that comes to mind is to use `isinstance()` checks inside the function itself.
And it can work in some cases! The only requirement is that we must know all the types we will work with in advance.
Here's how it would look:
from dataclasses import dataclass

@dataclass
class MyUser(object):
    name: str

def greet(instance: str | MyUser) -> str:
    if isinstance(instance, str):
        return 'Hello, "{0}"!'.format(instance)
    elif isinstance(instance, MyUser):
        return 'Hello again, {0}'.format(instance.name)
    raise NotImplementedError(
        'Cannot greet "{0}" type'.format(type(instance)),
    )
The main limitation is that we cannot easily extend this function to other types (we can use a wrapper function, but I consider this a redefinition).
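To make the wrapper idea concrete, here is a minimal sketch. The `Admin` type and the `greet_with_admin` name are hypothetical, they only illustrate what a caller would have to write to support a type that the original `greet` does not know about:

```python
from dataclasses import dataclass


@dataclass
class MyUser:
    name: str


def greet(instance):
    """The original function: only knows about ``str`` and ``MyUser``."""
    if isinstance(instance, str):
        return 'Hello, "{0}"!'.format(instance)
    elif isinstance(instance, MyUser):
        return 'Hello again, {0}'.format(instance.name)
    raise NotImplementedError(
        'Cannot greet "{0}" type'.format(type(instance)),
    )


@dataclass
class Admin:  # hypothetical type, unknown to the authors of ``greet``
    login: str


def greet_with_admin(instance):
    """Hypothetical wrapper: handles ``Admin`` and delegates the rest."""
    if isinstance(instance, Admin):
        return 'Welcome back, {0}'.format(instance.login)
    return greet(instance)


print(greet_with_admin(Admin(login='root')))  # Welcome back, root
print(greet_with_admin('world'))  # Hello, "world"!
```

Every caller now has to remember to use `greet_with_admin` instead of `greet`, and the next new type needs yet another wrapper on top of this one.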
But in some cases `isinstance` won't be enough, because we need extendability. We need to support other types, which are unknown in advance.
Our users might need to `greet` their custom types.
And that's the part where things begin to get interesting.
All programming languages address this problem differently.
Let's start with Python's traditional OOP approach.
OOP extendability and over-abstraction problems
So, how does Python solve this problem?
We all know that Python has magic methods for some builtin functions, like `len()` and `__len__`. They solve exactly the same problem.
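As a quick reminder of how this works, here is a tiny sketch with a hypothetical `Playlist` class: `len()` knows nothing about it, it just calls the `__len__` method that the type provides.

```python
class Playlist:  # hypothetical type, used only for illustration
    def __init__(self, *tracks: str) -> None:
        self._tracks = list(tracks)

    def __len__(self) -> int:
        # ``len(playlist)`` delegates to this method
        return len(self._tracks)


print(len(Playlist('intro', 'outro')))  # 2
print(len('abc'))  # 3, ``str`` defines its own ``__len__``
```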
Let's say we want to greet a user:
@dataclass
class MyUser(object):
    name: str

    def greet(self) -> str:
        return 'Hello again, {0}'.format(self.name)
You can use this method directly or you can create a helper with `typing.Protocol`:
from typing_extensions import Protocol

class CanGreet(Protocol):
    def greet(self) -> str:
        """
        It will match any object that has the ``greet`` method.

        Mypy will also check that ``greet`` must return ``str``.
        """

def greet(instance: CanGreet) -> str:
    return instance.greet()
And then we can use it:
print(greet(MyUser(name='example')))
# Hello again, example
So, it works? Not really.
There are several problems.
First, some classes do not want to know some details about themselves to maintain abstraction integrity.
For example:
class Person(object):
    def become_friends(self, friend: 'Person') -> None:
        ...

    def is_friend_of(self, person: 'Person') -> bool:
        ...

    def get_pets(self) -> Sequence['Pet']:
        ...
Does this `Person` (pun intended) deserve to know that some `to_json` conversion exists that can turn this poor `Person` into textual data? What about binary pickling?
Of course not. These details should not be added to a business-level abstraction; doing otherwise is called a leaky abstraction.
Moreover, I think that mixing structure and behavior into a single abstraction is bad. Why? Because you cannot tell in advance what behavior you would need from a given structure.
For abstractions on this level, it is way easier to have behavior near the structure, not inside it. Mixing these two only makes sense when we work on a higher level like services or processes.
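Here is a small sketch of what "behavior near the structure" could mean in practice. The `Person` below is a simplified stand-in with made-up fields, and `person_to_json` is a hypothetical helper; the point is only that the type itself stays unaware of JSON:

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    """Simplified stand-in: pure structure, no serialization details."""

    name: str
    age: int


def person_to_json(person: Person) -> str:
    """Hypothetical helper that lives next to ``Person``, not inside it."""
    return json.dumps({'name': person.name, 'age': person.age})


print(person_to_json(Person(name='Alice', age=30)))
# {"name": "Alice", "age": 30}
```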
Second, it only works for custom types. Existing types are hard to extend.
For example, how would you add the `greet` method to the `str` type?
You can create a `str` subtype with a `greet` method in it:
class MyStr(str):
    def greet(self) -> str:
        return 'Hello, {0}!'.format(self)
But, this would require a change in our usage:
print(greet(MyStr('world')))
# Hello, world!
print(greet('world'))
# fails with AttributeError at runtime: `str` has no `greet` method
# (mypy also rejects this call)
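The same mismatch can be observed at runtime. This is a minimal sketch, assuming Python 3.8+ where `Protocol` and `runtime_checkable` are available from `typing`; the `runtime_checkable` decorator is my addition here, the protocol above does not use it:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class CanGreet(Protocol):
    def greet(self) -> str:
        ...


class MyStr(str):
    def greet(self) -> str:
        return 'Hello, {0}!'.format(self)


# The subtype matches the protocol, the plain built-in does not:
print(isinstance(MyStr('world'), CanGreet))  # True
print(isinstance('world', CanGreet))  # False
```

So every plain string in the program would have to be wrapped into `MyStr` before it can be greeted.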
Monkey-patching
Some might suggest that we can just insert the needed methods directly into an object / type.
Some dynamically typed languages went down this path: `JavaScript` (in the 2000s and early 2010s, mostly popularized by `jQuery` plugins) and `Ruby` (still happening right now). Here's how it looks:
String.prototype.greet = function () {
  // `this` is the string that the method is called on
  return `Hello, ${this}!`
}
It is quite obvious that it is not going to work for anything complex. Why?
- Different parts of your program might use monkey-patching of methods with the same name, but with different functionality. And nothing will work
- It is hard to read because the original source does not contain the patched method and the patching location might be hidden deeply in other files
- It is hard to type; for example, `mypy` does not support it at all
- The Python community is not used to this style, it would be rather hard to persuade them to write their code like this (and that's a good thing!)
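The first point is easy to reproduce in Python itself. In this sketch `Message` and both "plugin" functions are hypothetical; the last part relies on CPython refusing attribute assignment on built-in types:

```python
class Message:  # hypothetical user-defined type
    def __init__(self, text: str) -> None:
        self.text = text


def plugin_a_greet(self) -> str:
    return 'Hello, {0}!'.format(self.text)


def plugin_b_greet(self) -> str:
    return 'Hi there, {0}'.format(self.text)


# Two unrelated "plugins" patch a method with the same name:
Message.greet = plugin_a_greet
Message.greet = plugin_b_greet  # silently overrides the first patch

print(Message('world').greet())  # Hi there, world

# Built-in types cannot be patched this way in CPython at all:
str_patch_error = None
try:
    str.greet = plugin_a_greet
except TypeError as exc:
    str_patch_error = exc
    print('Cannot patch str:', exc)
```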
I hope that it is clear: we won't fall into this trap. Let's consider another alternative.
Extra abstractions
People familiar with things like `django-rest-framework` might recommend adding special abstractions to `greet` different types:
import abc
from dataclasses import dataclass
from typing import Generic, TypeVar

_Wrapped = TypeVar('_Wrapped')

class BaseGreet(Generic[_Wrapped]):
    """Abstract class of all other """

    def __init__(self, wrapped: _Wrapped) -> None:
        self._wrapped = wrapped

    @abc.abstractmethod
    def greet(self) -> str:
        raise NotImplementedError

class StrGreet(BaseGreet[str]):
    """Wrapped instance of built-in type ``str``."""

    def greet(self) -> str:
        return 'Hello, {0}!'.format(self._wrapped)

# Our custom type:

@dataclass
class MyUser(object):
    name: str

class MyUserGreet(BaseGreet[MyUser]):
    def greet(self) -> str:
        return 'Hello again, {0}'.format(self._wrapped.name)
And we can use it like so:
print(greet(StrGreet('world')))
# Hello, world!
print(greet(MyUserGreet(MyUser(name='example'))))
# Hello again, example
But, now we have a different problem: we have a gap between real types and their wrappers. There's no easy way to wrap a type into its wrapper. How can we match them? We have to do it either by hand or use some kind of registry like `Dict[type, Type[BaseGreet]]`.
And it is still not enough, there will be runtime errors! In practice, it ends up like `<X> is not json-serializable`, as many of us might have seen with `drf`'s serializers when trying to serialize a custom unregistered type.
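A registry like that could be sketched as follows. The `_REGISTRY` name, the exact-type lookup, and the error message are all assumptions made for illustration (the wrappers are simplified and not generic), this is not how `drf` actually does it:

```python
from dataclasses import dataclass
from typing import Any, Dict, Type


class BaseGreet:
    def __init__(self, wrapped: Any) -> None:
        self._wrapped = wrapped

    def greet(self) -> str:
        raise NotImplementedError


class StrGreet(BaseGreet):
    def greet(self) -> str:
        return 'Hello, {0}!'.format(self._wrapped)


@dataclass
class MyUser:
    name: str


class MyUserGreet(BaseGreet):
    def greet(self) -> str:
        return 'Hello again, {0}'.format(self._wrapped.name)


# Hypothetical registry that maps real types to their wrappers:
_REGISTRY: Dict[type, Type[BaseGreet]] = {
    str: StrGreet,
    MyUser: MyUserGreet,
}


def greet(instance: Any) -> str:
    """Find the wrapper by exact type; unknown types fail only at runtime."""
    wrapper = _REGISTRY.get(type(instance))
    if wrapper is None:
        raise TypeError(
            '{0} is not greetable'.format(type(instance).__name__),
        )
    return wrapper(instance).greet()


print(greet('world'))  # Hello, world!
print(greet(MyUser(name='example')))  # Hello again, example
```

Calling `greet(1)` here passes any static check (the argument is `Any`) and raises `TypeError` only when the code runs.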
Typeclasses and similar concepts
Let's look at how functional languages (and `Rust`, people still argue whether it is functional or not) handle this problem.
Some common knowledge:
- All these languages don't have the `class` concept as we know it in Python and, of course, there's no subclassing
- All the languages below don't have `object`s as we do in Python, they don't mix behavior and structure (however, `Elixir` has Alan Kay's real objects)
- Instead, these languages use ad-hoc polymorphism to make functions behave differently for different types via overloading
- And, of course, you don't have to know any of the languages below to understand what is going on
Elixir
Let's start with one of my favorites.
`Elixir` has `Protocol`s to achieve what we want:
# Our custom protocol
defprotocol Greet do
  # This is an abstract function,
  # that will behave differently for each type.
  def greet(data)
end

# Enhancing built-in type
defimpl Greet, for: BitString do
  def greet(string), do: "Hello, #{string}!"
end

# Custom data type
defmodule MyUser do
  defstruct [:name]
end

# Enhancing our own type
defimpl Greet, for: MyUser do
  def greet(user), do: "Hello again, #{user.name}"
end
I am pretty sure that my readers were able to read and understand `Elixir` even if they are not familiar with this language. That's what I call beauty!
Usage of the code above:
# Using our `Greet.greet` function with both our data types:
IO.puts(Greet.greet("world"))
# Hello, world!
IO.puts(Greet.greet(%MyUser{name: "example"}))
# Hello again, example
The thing with `Elixir`'s `Protocol`s is that it is not currently possible to tell `Elixir`'s type checker that some type supports our `Greet.greet`.
But, this is not a big deal for `Elixir`, which is 100% dynamically typed.
Protocols are very widely used, they power lots of the language's features.
Here are some real-life examples:
- `Enumerable` allows working with collections: counting elements, finding members, reducing, and slicing
- `String.Chars` is something like `__str__` in Python, it converts structures to human-readable format
Rust
`Rust` has `Trait`s. The concept is pretty similar to `Protocol`s in `Elixir`:
// Our custom trait
trait Greet {
    fn greet(&self) -> String;
}

// Enhancing built-in type
impl Greet for String {
    fn greet(&self) -> String {
        return format!("Hello, {}!", &self);
    }
}

// Defining our own type
struct MyUser {
    name: String,
}

// Enhancing it
impl Greet for MyUser {
    fn greet(&self) -> String {
        return format!("Hello again, {}", self.name);
    }
}
And of course, due to `Rust`'s static typing, we can express that some function's argument supports the trait we have just defined:
// We can express that `greet` function only accepts types
// that implement `Greet` trait:
fn greet(instance: &dyn Greet) -> String {
    return instance.greet();
}

pub fn main() {
    // Using our `greet` function with both our data types:
    println!("{}", greet(&"world".to_string()));
    // Hello, world!
    println!("{}", greet(&MyUser { name: "example".to_string() }));
    // Hello again, example
}
See? The idea is so similar that it uses almost the same syntax as `Elixir`.
Notable real-life examples of how `Rust` uses its `Trait`s:
- `Copy` and `Clone` - duplicating objects
- `Debug` to show a better `repr` of an object, much like `__repr__` in Python
Basically, `Trait`s are the core of this language: they are widely used whenever you need to define any shared behavior.
Haskell
`Haskell` has typeclasses to do almost the same thing.
So, what's a typeclass?
A typeclass is a group of types, all of which satisfy some common contract.
It is also a form of ad-hoc polymorphism that is mostly used for overloading.
I am a bit sorry for the `Haskell` syntax below, it might not be very pleasant and clear to read, especially for people who are not familiar with this brilliant language, but we have what we have:
{-# LANGUAGE FlexibleInstances #-}

-- Our custom typeclass
-- (the type variable is called `a`, because `instance` is a reserved word)
class Greet a where
  greet :: a -> String

-- Enhancing built-in type with it
instance Greet String where
  greet str = "Hello, " ++ str ++ "!"

-- Defining our own type
data MyUser = MyUser { name :: String }

-- Enhancing it
instance Greet MyUser where
  greet user = "Hello again, " ++ (name user)
Basically, we do the same thing as we have already done for `Rust` and `Elixir`:
- We define a `Greet` typeclass that has a single function to implement: `greet`
- Then we define an instance implementation for the `String` type, which is a built-in (alias for `[Char]`)
- Then we define a custom `MyUser` type with a `name` field of `String` type
- Implementing the `Greet` typeclass for `MyUser` is the last thing we do
Then we can use our new `greet` function:
-- Here you can see that we can use `Greet` typeclass to annotate our types.
-- I have made this alias entirely for this annotation demo,
-- in real life we would just use `greet` directly:
greetAlias :: Greet a => a -> String
greetAlias = greet

main = do
  -- `putStrLn` prints the string as is, `print` would add quotes around it
  putStrLn $ greetAlias "world"
  -- Hello, world!
  putStrLn $ greetAlias MyUser { name="example" }
  -- Hello again, example
Some real-life examples of typeclasses:
- `Show` to convert things into user-readable representations
- `Functor`, `Applicative`, and `Monad` are all typeclasses
I would say that among our three examples, `Haskell` relies on its typeclasses the heaviest.
It is important to note that typeclasses from `Haskell` and traits from `Rust` are a bit different, but we won't go into these details to keep this article rather short.
But, what about Python?
dry-python/classes
There's an awesome function in the Python standard library called `singledispatch`.
It does exactly what we need. Do you still remember that we are looking for a way to change the function's behavior based on the input type?
Let's have a look!
from dataclasses import dataclass
from functools import singledispatch

@singledispatch
def greet(instance) -> str:
    """Default case."""
    raise NotImplementedError

@greet.register
def _greet_str(instance: str) -> str:
    return 'Hello, {0}!'.format(instance)

# Custom type

@dataclass
class MyUser(object):
    name: str

@greet.register
def _greet_myuser(instance: MyUser) -> str:
    return 'Hello again, {0}'.format(instance.name)
Looks cool! Moreover, it is in the standard library, so you don't even have to install anything!
And we can use it like a normal function:
print(greet('world'))
# Hello, world!
print(greet(MyUser(name='example')))
# Hello again, example
So, what's the point in writing a completely different library like we did with `dry-python/classes`?
We even reuse some parts of the `singledispatch` implementation, but there are several key differences.
Better typing
With `singledispatch` you cannot be sure that everything will work, because it is not supported by `mypy`.
For example, you can pass unsupported types:
greet(1) # mypy is ok with that :(
# runtime will raise `NotImplementedError`
In `dry-python/classes` we have fixed that.
You can only pass types that are supported:
from classes import typeclass

@typeclass
def greet(instance) -> str:
    ...

@greet.instance(str)
def _greet_str(instance: str) -> str:
    return 'Hello, {0}!'.format(instance)

greet(1)
# Argument 1 to "greet" has incompatible type "int"; expected "str"
Or you can break the `@singledispatch` signature contract:
@greet.register
def _greet_dict(instance: dict, key: str) -> int:
    return instance[key]  # still no mypy error
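At runtime this mismatch surfaces only when the function is called. A small stdlib-only sketch (the `{'answer': 42}` data is made up for the demo):

```python
from functools import singledispatch


@singledispatch
def greet(instance) -> str:
    """Default case: one argument in, ``str`` out."""
    raise NotImplementedError


# The extra ``key`` argument and the ``int`` return type both break
# the contract of ``greet``, but the registration succeeds anyway:
@greet.register
def _greet_dict(instance: dict, key: str) -> int:
    return instance[key]


print(greet({'answer': 42}, 'answer'))  # 42, an ``int`` instead of ``str``

try:
    greet({'answer': 42})  # the call shape that the default case promises
except TypeError as exc:
    print('Fails only at call time:', exc)
```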
But, not with `dry-python/classes`:
@greet.instance(dict)
def _greet_dict(instance: dict, key: str) -> int:
    ...

# Instance callback is incompatible
# "def (instance: builtins.dict[Any, Any], key: builtins.str) -> builtins.int";
# expected
# "def (instance: builtins.dict[Any, Any]) -> builtins.str"
`@singledispatch` also does not allow defining generic functions:
from functools import singledispatch
from typing import TypeVar

X = TypeVar('X')

@singledispatch
def copy(instance: X) -> X:
    """Default case."""
    raise NotImplementedError

@copy.register
def _copy_int(instance: int) -> int:
    return instance

# Argument 1 to "register" of "_SingleDispatchCallable"
# has incompatible type "Callable[[int], int]";
# expected "Callable[..., X]"

reveal_type(copy(1))
# Revealed type is "X`-1"
# Should be: `int`
Which is, again, possible with `dry-python/classes`, we fully support generic functions:
from typing import TypeVar
from classes import typeclass

X = TypeVar('X')

@typeclass
def copy(instance: X) -> X:
    ...

@copy.instance(int)
def _copy_int(instance: int) -> int:
    ...  # ok

reveal_type(copy(1))  # int
And you cannot restrict `@singledispatch` to work with only subtypes of specific types, even if you want to.
Protocols are unsupported
Protocols are an important part of Python. Sadly, they are not supported by `@singledispatch`:
@greet.register
def _greet_iterable(instance: Iterable) -> str:
    return 'Iterable!'

# TypeError: Invalid annotation for 'instance'.
# typing.Iterable is not a class
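Here is a stdlib-only sketch that reproduces the failure. As a comparison, it also registers the `collections.abc.Iterable` abstract base class, which `singledispatch` does accept; that second part is my own aside, and it is not a general answer for protocols, only for the ABCs that the standard library already ships:

```python
import typing
from collections import abc
from functools import singledispatch


@singledispatch
def greet(instance) -> str:
    """Default case."""
    raise NotImplementedError


typing_registration_failed = False
try:
    @greet.register
    def _greet_typing_iterable(instance: typing.Iterable) -> str:
        return 'Iterable!'
except TypeError as exc:
    # ``typing.Iterable`` is a typing alias, not a real class
    typing_registration_failed = True
    print('Registration failed:', exc)


# The abstract base class is a real class, so this one is accepted:
@greet.register
def _greet_abc_iterable(instance: abc.Iterable) -> str:
    return 'Iterable!'


print(greet([1, 2, 3]))  # Iterable!
```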
Protocols support is also solved with `dry-python/classes`:
from typing import Iterable
from classes import typeclass

@typeclass
def greet(instance) -> str:
    ...

@greet.instance(Iterable, is_protocol=True)
def _greet_str(instance: Iterable) -> str:
    return 'Iterable!'

print(greet([1, 2, 3]))
# Iterable!
No way to annotate types
Let's say you want to write a function and annotate one of its arguments to say that it must support the `greet` function. Something like:
def greet_and_print(instance: '???') -> None:
    print(greet(instance))
It is impossible with `@singledispatch`.
But, you can do it with `dry-python/classes`:
from classes import AssociatedType, Supports, typeclass

class Greet(AssociatedType):
    """Special type to represent that some instance can `greet`."""

@typeclass(Greet)
def greet(instance) -> str:
    """No implementation needed."""

@greet.instance(str)
def _greet_str(instance: str) -> str:
    return 'Hello, {0}!'.format(instance)

def greet_and_print(instance: Supports[Greet]) -> None:
    print(greet(instance))

greet_and_print('world')  # ok
greet_and_print(1)  # type error with mypy, exception at runtime
# Argument 1 to "greet_and_print" has incompatible type "int";
# expected "Supports[Greet]"
Conclusion
We have come a long way, from basic stacked `isinstance()` conditions - through OOP - to typeclasses.
I have shown that this native and pythonic idea deserves wider recognition and usage. And our extra features in `dry-python/classes` can save you from lots of mistakes and help to write more expressive and safe business logic.
As a result of using typeclasses, you will untangle your structures from behavior, which will allow you to get rid of useless and complex abstractions and write dead-simple typesafe code. You will have your behavior near the structures, not inside them. This will also solve the extendability problem of OOP.
Combine it with other `dry-python` libraries for extra effect!
Future work
What do we plan for the future?
There are several key aspects to improve:
- Our `Supports` should take any number of type arguments: `Supports[A, B, C]`. This type will represent a type that supports all three typeclasses `A`, `B`, and `C` at the same time
- We don't support concrete generics just yet. So, for example, it is impossible to define different cases for `List[int]` and `List[str]`. This might require adding a runtime type checker to `dry-python/classes`
- I am planning to make tests a part of this app as well! We will ship a hypothesis plugin to test users' typeclasses in a single line of code
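To illustrate why concrete generics need some kind of runtime check, here is a naive sketch. The `is_list_of` helper is hypothetical, it is not a part of `dry-python/classes` and not a plan for how it will be implemented:

```python
from typing import Any


def is_list_of(value: Any, item_type: type) -> bool:
    """Hypothetical helper: naive runtime check for ``List[item_type]``."""
    return isinstance(value, list) and all(
        isinstance(item, item_type) for item in value
    )


# At runtime both values are just ``list``, so dispatching on ``type()``
# alone cannot tell ``List[int]`` from ``List[str]``:
print(type([1, 2]) is type(['a', 'b']))  # True

# Telling them apart requires looking at the contents:
print(is_list_of([1, 2], int))  # True
print(is_list_of(['a', 'b'], int))  # False
```

Note that an empty list matches every `item_type` with this approach, which is one of the reasons why this is harder than it looks.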
Stay tuned!
If you like this article you can:
- Donate to future `dry-python` development on GitHub
- Star our `classes` repo
- Subscribe to my blog for more content!