Originally published in my blog: https://sobolevn.me/2020/06/how-async-should-have-been
In the last few years, the `async` keyword and semantics have made their way into many popular programming languages: JavaScript, Rust, C#, and many other languages that I don't know or don't use. Of course, Python has also had the `async` and `await` keywords since Python 3.5.
In this article, I would like to share my opinion about this feature, consider alternatives, and propose a new solution.
Colours of functions
When introducing `async` functions into a language, we actually end up with a split world. Now, some functions start to be red (or `async`), while the old ones continue to be blue (sync).
The thing about this division is that blue functions cannot call red ones.
Red ones can potentially call blue ones. In Python, for example, this is only partially true: async functions should only call sync functions that do not block. Is it possible to tell whether a function is blocking just by looking at its definition? Of course not! Python is a scripting language, don't forget that!
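To make that point concrete, here is a small illustration of my own (not from the original article): both calls below wait for one second, but only one of them plays nicely with the event loop.

```python
import asyncio
import time

async def handler() -> None:
    time.sleep(1)           # a blue, blocking call: it freezes the whole event loop
    await asyncio.sleep(1)  # a red, non-blocking call: other tasks keep running meanwhile

asyncio.run(handler())
```

Nothing in the signature of `time.sleep` tells you that it blocks; you just have to know.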
This division creates two subsets of a single language: a sync one and an async one. Five years have passed since the release of Python 3.5, but `async` support is still nowhere near what we have in the sync Python world.
Read this brilliant piece if you want to learn more about colors of functions.
Code duplication
Different colors of functions lead to a more practical problem: code duplication.
Imagine that you are writing a CLI tool to fetch the sizes of web pages, and you want to support both sync and async ways of doing it. This is very useful for library authors, who don't know how their code is going to be used. It is not limited to PyPI libraries either: it also includes in-company libraries with shared logic for different services written, for example, in Django and aiohttp, or any other sync and async code. That said, I must admit that single applications are mostly written in either a sync or an async way only.
Let's start with the sync pseudo-code:
```python
def fetch_resource_size(url: str) -> int:
    response = client_get(url)
    return len(response.content)
```
Looking pretty good! Now, let's add its async counterpart:
```python
async def fetch_resource_size(url: str) -> int:
    response = await client_get(url)
    return len(response.content)
```
It is basically the same code, just filled with `async` and `await` keywords! And I am not making this up, just compare the code samples in the httpx tutorial: they show exactly the same picture.
Abstraction and Composition
OK, we find ourselves in a situation where we have to rewrite all our sync code and sprinkle `async` and `await` keywords here and there so that our program becomes asynchronous. Two principles can help us solve this problem: abstraction and composition.
First of all, let's rewrite our imperative pseudo-code as functional pseudo-code. This will allow us to see the pattern more clearly:
```python
def fetch_resource_size(url: str) -> Abstraction[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```
What is this `.map` method? What does it do? It is a functional way of composing complex abstractions with pure functions: it allows us to create a new abstraction from an existing one, with a new state inside. Let's say that calling `client_get(url)` initially returns `Abstraction[Response]`, and calling `.map(lambda response: len(response.content))` transforms it into the `Abstraction[int]` instance we need.

Now the steps are pretty clear! Notice how easily we went from several independent statements to a single pipeline of function calls. We have also changed the return type of this function: now it returns some `Abstraction`.
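To make the idea of `.map` concrete, here is a minimal toy sketch of my own (the class and names are purely illustrative, not part of any library):

```python
from typing import Callable, Generic, TypeVar

_ValueType = TypeVar('_ValueType')
_NewValueType = TypeVar('_NewValueType')

class Abstraction(Generic[_ValueType]):
    """A toy container: it wraps a value and lets pure functions transform it."""

    def __init__(self, inner_value: _ValueType) -> None:
        self._inner_value = inner_value

    def map(
        self,
        function: Callable[[_ValueType], _NewValueType],
    ) -> 'Abstraction[_NewValueType]':
        # Apply the pure function to the wrapped value and re-wrap the result.
        return Abstraction(function(self._inner_value))

assert Abstraction(2).map(lambda number: number + 1)._inner_value == 3
```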
Now, let's rewrite our code to work with the async version:
```python
def fetch_resource_size(url: str) -> AsyncAbstraction[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```
Wow, that's mostly it! The only thing that is different is the `AsyncAbstraction` return type. Other than that, our code stays exactly the same. We don't need to use the `async` and `await` keywords anymore: we don't use `await` at all (that's the whole point of our journey!), and `async` functions do not make any sense without `await`.

The last thing we need is to decide which client we want: an async or a sync one. Let's fix that!
```python
def fetch_resource_size(
    client_get: Callable[[str], AbstractionType[Response]],
    url: str,
) -> AbstractionType[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```
Our `client_get` is now an argument: a callable that receives a single URL string as input and returns some `AbstractionType` over a `Response` object. This `AbstractionType` is either the `Abstraction` or the `AsyncAbstraction` we have already seen in the previous samples.

When we pass `Abstraction`, our code works synchronously; when `AsyncAbstraction` is passed, the same code automatically starts to work asynchronously.
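Here is a matching toy `AsyncAbstraction`, again my own sketch with illustrative names only, just to show that the very same `.map` call can hide an `await` inside:

```python
import asyncio
from typing import Awaitable, Callable, Generic, TypeVar

_ValueType = TypeVar('_ValueType')
_NewValueType = TypeVar('_NewValueType')

class AsyncAbstraction(Generic[_ValueType]):
    """A toy async container: it wraps an awaitable and maps pure functions over it."""

    def __init__(self, awaitable: Awaitable[_ValueType]) -> None:
        self._awaitable = awaitable

    def map(
        self,
        function: Callable[[_ValueType], _NewValueType],
    ) -> 'AsyncAbstraction[_NewValueType]':
        async def _mapped() -> _NewValueType:
            # The `await` lives here, hidden inside the abstraction.
            return function(await self._awaitable)
        return AsyncAbstraction(_mapped())

    def __await__(self):
        return self._awaitable.__await__()

async def fake_client_get(url: str) -> bytes:
    return b'<html>fake response</html>'  # pretend this did real async IO

async def main() -> int:
    # The business logic is written exactly like the sync version: just `.map`.
    return await AsyncAbstraction(fake_client_get('https://example.com')).map(len)

print(asyncio.run(main()))  # => 26
```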
IOResult and FutureResult
Luckily, we already have the right abstractions in dry-python/returns!

Let me introduce you to a type-safe, mypy-friendly, framework-independent, pure-Python tool that provides awesome abstractions you can use in any project!
Sync version
Before we go any further, to make this example reproducible, I need to provide the dependencies that are going to be used later:

```bash
pip install returns httpx anyio
```
Let's move on!
We can rewrite this pseudo-code as real, working Python code. Let's start with the sync version:
```python
from typing import Callable

import httpx
from returns.io import IOResultE, impure_safe

def fetch_resource_size(
    client_get: Callable[[str], IOResultE[httpx.Response]],
    url: str,
) -> IOResultE[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )

print(fetch_resource_size(
    impure_safe(httpx.get),
    'https://sobolevn.me',
))
# => <IOResult: <Success: 27972>>
```
We have changed a couple of things to make our pseudo-code real:
- We now use `IOResultE`, which is a functional way to handle sync `IO` that might fail. Remember, exceptions are not always welcome! `Result`-based types allow us to model exceptions as separate `Failure()` values, while successful values are wrapped in the `Success` type. In a traditional approach, no one cares about exceptions. But we do care ❤️ (there is a small sketch of this right after the list)
- We use `httpx`, which can work with both sync and async requests
- We use the `impure_safe` function to convert the return type of `httpx.get` into the abstraction we need: `IOResultE`
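As a side note, here is my own tiny example of what `impure_safe` buys us: exceptions become `Failure` values inside the container instead of blowing up the call stack.

```python
from returns.io import impure_safe

@impure_safe
def parse_port(raw_value: str) -> int:
    return int(raw_value)  # may raise ValueError

print(parse_port('8080'))
# => <IOResult: <Success: 8080>>
print(parse_port('not-a-port'))
# => an IOResult failure wrapping the ValueError, something like <IOResult: <Failure: ...>>
```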
Now, let's try the async version!
Async version
```python
from typing import Callable

import anyio
import httpx
from returns.future import FutureResultE, future_safe

def fetch_resource_size(
    client_get: Callable[[str], FutureResultE[httpx.Response]],
    url: str,
) -> FutureResultE[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )

page_size = fetch_resource_size(
    future_safe(httpx.AsyncClient().get),
    'https://sobolevn.me',
)
print(page_size)
print(anyio.run(page_size.awaitable))
# => <FutureResult: <coroutine object async_map at 0x10b17c320>>
# => <IOResult: <Success: 27972>>
```
Notice that we get exactly the same result, but now our code works asynchronously. And its core part didn't change at all! However, there are some important notes:
- We changed the sync `IOResultE` into the async `FutureResultE`, and `impure_safe` into `future_safe`, which does the same thing but returns a different abstraction: `FutureResultE`
- We now also use `AsyncClient` from `httpx`
- We are also required to run the resulting `FutureResult` value, because red functions cannot run themselves (see the note right after this list). To demonstrate that this approach works with any async library (`asyncio`, `trio`, `curio`), I am using the `anyio` utility
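One more note of my own: a `FutureResult` is lazy, so you can keep composing it with `.map` and nothing actually runs until you hand it to an event loop.

```python
import anyio
import httpx
from returns.future import future_safe

lazy_size = future_safe(httpx.AsyncClient().get)('https://sobolevn.me').map(
    lambda response: len(response.content),
).map(
    lambda size: size // 1024,  # keep chaining pure transformations
)

# No HTTP request has happened yet: `lazy_size` is only a description of the work.
print(anyio.run(lazy_size.awaitable))
# => <IOResult: <Success: ...>>  (the page size in KiB)
```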
Combining the two
And now I can show you how to combine these two versions into a single type-safe API. Sadly, Higher Kinded Types and proper typeclasses are still work in progress, so we have to use regular `@overload` function cases:
```python
from typing import Callable, Union, overload

import anyio
import httpx
from returns.future import FutureResultE, future_safe
from returns.io import IOResultE, impure_safe

@overload
def fetch_resource_size(
    client_get: Callable[[str], IOResultE[httpx.Response]],
    url: str,
) -> IOResultE[int]:
    """Sync case."""

@overload
def fetch_resource_size(
    client_get: Callable[[str], FutureResultE[httpx.Response]],
    url: str,
) -> FutureResultE[int]:
    """Async case."""

def fetch_resource_size(
    client_get: Union[
        Callable[[str], IOResultE[httpx.Response]],
        Callable[[str], FutureResultE[httpx.Response]],
    ],
    url: str,
) -> Union[IOResultE[int], FutureResultE[int]]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```
With `@overload` decorators we describe which combinations of inputs are allowed and what return type they will produce. You can read more about the `@overload` decorator here.
Finally, let's call our function with both the sync and the async clients:
```python
# Sync:
print(fetch_resource_size(
    impure_safe(httpx.get),
    'https://sobolevn.me',
))
# => <IOResult: <Success: 27972>>

# Async:
page_size = fetch_resource_size(
    future_safe(httpx.AsyncClient().get),
    'https://sobolevn.me',
)
print(page_size)
print(anyio.run(page_size.awaitable))
# => <FutureResult: <coroutine object async_map at 0x10b17c320>>
# => <IOResult: <Success: 27972>>
```
As you can see, `fetch_resource_size` with the sync client immediately returns an `IOResult` and can execute itself. In contrast, the async version requires an event loop to execute it, just like a regular coroutine. We use `anyio` for the demo.
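If, at the very edge of your program, you need the raw integer rather than the container, there is an escape hatch. This is my own note; `unsafe_perform_io` lives in `returns.unsafe`, and the snippet assumes the request succeeded:

```python
from returns.unsafe import unsafe_perform_io

io_result = fetch_resource_size(impure_safe(httpx.get), 'https://sobolevn.me')
print(unsafe_perform_io(io_result.unwrap()))
# => 27972  (a plain int; the exact number depends on the page)
```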
`mypy` is pretty happy with our code too:
```
» mypy async_and_sync.py
Success: no issues found in 1 source file
```
Let's try to screw something up:
```diff
-lambda response: len(response.content),
+lambda response: response.content,
```
And check that the new error is caught by `mypy`:
```
» mypy async_and_sync.py
async_and_sync.py:33: error: Argument 1 to "map" of "IOResult" has incompatible type "Callable[[Response], bytes]"; expected "Callable[[Response], int]"
async_and_sync.py:33: error: Argument 1 to "map" of "FutureResult" has incompatible type "Callable[[Response], bytes]"; expected "Callable[[Response], int]"
async_and_sync.py:33: error: Incompatible return value type (got "bytes", expected "int")
```
As you can see, there's nothing magical in the way async code can be written with the right abstractions. Inside our implementation, there's still no magic, just good old composition. The real magic is providing the same API for different types: it allows us to abstract away how, for example, HTTP requests work, synchronously or asynchronously.
I hope this quick demo shows how awesome `async` programs can be!

Feel free to try the new dry-python/returns@0.14 release, it has lots of other goodies!
Other awesome features
Speaking of goodies, I want to highlight several features I am most proud of:
- Typed `partial` and `@curry` functions:
```python
from returns.curry import curry, partial

def example(a: int, b: str) -> float:
    ...

reveal_type(partial(example, 1))
# note: Revealed type is 'def (b: builtins.str) -> builtins.float'

reveal_type(curry(example))
# note: Revealed type is 'Overload(def (a: builtins.int) -> def (b: builtins.str) -> builtins.float, def (a: builtins.int, b: builtins.str) -> builtins.float)'
```
Which means that you can use `@curry` like so:
```python
@curry
def example(a: int, b: str) -> float:
    return float(a + len(b))

assert example(1, 'abc') == 4.0
assert example(1)('abc') == 4.0
```
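For completeness, here is the same kind of runtime check for the typed `partial` (my own example):

```python
from returns.curry import partial

def example(a: int, b: str) -> float:
    return float(a + len(b))

bound = partial(example, 1)

assert bound('abc') == 4.0
```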
You can also use functional pipelines with full type inference, augmented by a custom `mypy` plugin:
```python
from returns.pipeline import flow

assert flow(
    [1, 2, 3],
    lambda collection: max(collection),
    lambda max_number: -max_number,
) == -3
```
We all know how hard it is to work with `lambda`s in typed code, because their arguments always have the `Any` type, which can break regular `mypy` inference.

Now, we always know that `lambda collection: max(collection)` has the `Callable[[List[int]], int]` type inside this pipeline, and `lambda max_number: -max_number` is just `Callable[[int], int]`. You can pass any number of steps to `flow`, and they will all work perfectly, thanks to our custom plugin!
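Conceptually (my own note, not an excerpt from the docs), `flow(value, f, g)` is just `g(f(value))` written top-to-bottom:

```python
from returns.pipeline import flow

def double(number: int) -> int:
    return number * 2

def stringify(number: int) -> str:
    return str(number)

assert flow(3, double, stringify) == stringify(double(3)) == '6'
```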
The library also provides `RequiresContextFutureResult`: an abstraction over the `FutureResult` we have already covered in this article. It can be used to explicitly pass dependencies in a functional manner in your async programs.
To be done
However, there are more things to be done before we can hit `1.0`:
- We need to implement Higher Kinded Types or their emulation, source
- We need to add proper typeclasses, so we can implement the needed abstractions, source
- We would love to try the `mypyc` compiler. It could potentially allow us to compile our type-annotated Python programs to binary, and as a result, simply dropping `dry-python/returns` into your program would make it several times faster, source
- We want to find new ways to write functional Python code, like our ongoing investigation of "do-notation"
Conclusion
Composition and abstraction can solve any problem. In this article, I have shown you how they can solve the problem of function colors and allow people to write simple, readable, and still flexible code that works. And type-checks.
Check out dry-python/returns, provide your feedback, learn new ideas, and maybe even help us sustain this project!
And as always, follow me on GitHub to keep up with new awesome Python libraries I make or help with!