expLog

AsyncIO In Depth

Python's async / await and asyncio can be magical and opaque. At the same time, they enable concurrency with a few minor adjustments to otherwise synchronous and simple code 1.

This article is a depth-first traversal of Python's event loop implementation. After several aborted attempts at writing about asyncio, I think this is the best way to understand the system.

This is not a way to quickly become productive with asyncio: that role is better filled by the numerous articles and tutorials already published. This is the article to read after becoming vaguely familiar with what asyncio is, when you want to understand the how.

Hello, world!

Let's start the DFS by looking at the classic "Hello, World":

import asyncio

async def hello_world():
    await asyncio.sleep(1)
    print("Hello, world!")

asyncio.run(hello_world())
Hello, world!
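To make the concurrency claim from the introduction concrete, here is a minimal sketch that runs two sleeping coroutines side by side. The names greet and main are hypothetical and not part of the original example. The point is only that both sleeps overlap, so the total time stays close to a single delay.

```python
import asyncio
import time

async def greet(name, delay):
    # Suspends this coroutine; the event loop is free to run others meanwhile.
    await asyncio.sleep(delay)
    return f"Hello, {name}!"

async def main():
    start = time.perf_counter()
    # gather schedules both coroutines and waits for both results, in order.
    greetings = await asyncio.gather(greet("a", 0.2), greet("b", 0.2))
    elapsed = time.perf_counter() - start
    return greetings, elapsed

greetings, elapsed = asyncio.run(main())
print(greetings, f"{elapsed:.1f}s")
```

Run sequentially, the two calls would take roughly the sum of both delays; interleaved on the event loop they take roughly one.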

async ramifications

Compiling

The Python lexer has to deal with a new keyword to parse this variant of hello_world: we can look at it using the ast module.

(To keep the output focused, I'm eliding the call to await; async has the spotlight for the moment.)

import ast
import pprint
pprint.pp(ast.dump(ast.parse("""
async def hello_world():
    print("Hello, world!")
""")))
("Module(body=[AsyncFunctionDef(name='hello_world', "
 'args=arguments(posonlyargs=[], args=[], vararg=None, kwonlyargs=[], '
 'kw_defaults=[], kwarg=None, defaults=[]), '
 "body=[Expr(value=Call(func=Name(id='print', ctx=Load()), "
 "args=[Constant(value='Hello, world!', kind=None)], keywords=[]))], "
 'decorator_list=[], returns=None, type_comment=None)], type_ignores=[])')

The function is explicitly detected as an AsyncFunctionDef to mark the addition of the async tag. 2
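For completeness, the elided await also shows up as its own node type. The sketch below parses the original hello_world (with the await put back) and checks for both node types programmatically instead of reading the dump. It only parses the source, so asyncio never needs to be imported.

```python
import ast

source = """
async def hello_world():
    await asyncio.sleep(1)
    print("Hello, world!")
"""

tree = ast.parse(source)
function = tree.body[0]

# Collect the class name of every node inside the function definition.
node_types = [type(node).__name__ for node in ast.walk(function)]

is_async = isinstance(function, ast.AsyncFunctionDef)
has_await = "Await" in node_types
print(is_async, has_await)
```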

We can also look at the disassembly of the function definition to see what happens after parsing. I've also included a plain function for comparison.

import dis

code = dis.Bytecode(compile("""
async def async_hello_world():
    print("Async Hello, world!")

def hello_world():
    print("Hello, world!")
""", filename='<string>', mode='exec'))

print(code.dis())
2           0 LOAD_CONST               0 (<code object async_hello_world at 0x7fa9ec08e2f0, file "<string>", line 2>)
            2 LOAD_CONST               1 ('async_hello_world')
            4 MAKE_FUNCTION            0
            6 STORE_NAME               0 (async_hello_world)

5           8 LOAD_CONST               2 (<code object hello_world at 0x7fa9ec08e450, file "<string>", line 5>)
           10 LOAD_CONST               3 ('hello_world')
           12 MAKE_FUNCTION            0
           14 STORE_NAME               1 (hello_world)
           16 LOAD_CONST               4 (None)
           18 RETURN_VALUE

Surprisingly enough, the generated opcodes for defining the functions are exactly the same. Looking inside the functions themselves also gives the same result.

import dis

async def async_hello_world():
    print("Hello, world!")

def hello_world():
    print("Hello, world!")

print("Async")
dis.dis(async_hello_world)

print("Function")
dis.dis(hello_world)
Async
  4           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Hello, world!')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
Function
  7           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Hello, world!')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
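Rather than comparing the two listings by eye, the opcode names can be compared directly. This is a small sketch using dis.get_instructions; opnames is a hypothetical helper. Note that the result depends on the interpreter: the listings above come from an older CPython, and newer releases may emit extra coroutine-specific instructions, so the comparison is printed rather than assumed.

```python
import dis

async def async_hello_world():
    print("Hello, world!")

def hello_world():
    print("Hello, world!")

def opnames(function):
    # Reduce the disassembly to a plain list of opcode names.
    return [instruction.opname for instruction in dis.get_instructions(function)]

async_ops = opnames(async_hello_world)
plain_ops = opnames(hello_world)

print(f"identical opcodes: {async_ops == plain_ops}")
print(f"only in async: {[op for op in async_ops if op not in plain_ops]}")
```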

The only difference is a flag set on the defined code: CO_COROUTINE, which marks a function as a coroutine. The actual compiler code is somewhere around here, but it's worth explicitly looking at the flags.

Once more into the breach:

import dis

async def async_hello_world():
    print("Hello, world!")

def hello_world():
    print("Hello, world!")

async_flags = async_hello_world.__code__.co_flags
standard_flags = hello_world.__code__.co_flags

only_async_flags = async_flags & (~standard_flags)
print(f"{only_async_flags=}, aka {dis.COMPILER_FLAG_NAMES[only_async_flags]}")

only_standard_flags = standard_flags & (~async_flags)
print(f"{only_standard_flags=}")
only_async_flags=128, aka COROUTINE
only_standard_flags=0

And that's literally the magic bit that makes the difference in how coroutines are evaluated.
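The same bit is also exposed by name in the inspect module, which avoids the mask arithmetic. The sketch below tests the flag directly and compares that with inspect.iscoroutinefunction; has_coroutine_flag is a hypothetical helper written for this illustration.

```python
import inspect

async def async_hello_world():
    print("Hello, world!")

def hello_world():
    print("Hello, world!")

def has_coroutine_flag(function):
    # Test the CO_COROUTINE bit on the function's code object.
    return bool(function.__code__.co_flags & inspect.CO_COROUTINE)

print(inspect.CO_COROUTINE)
print(has_coroutine_flag(async_hello_world), inspect.iscoroutinefunction(async_hello_world))
print(has_coroutine_flag(hello_world), inspect.iscoroutinefunction(hello_world))
```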

Execution

Executing an async function doesn't directly run the inner function anymore, and instead returns a coroutine object that wraps the computation.

print(async_hello_world())
<coroutine object async_hello_world at 0x...>
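To see that the body really is deferred, a coroutine object can be driven by hand without any event loop. This is a minimal sketch: the return value 42 is added purely for illustration, and send(None) is the low-level call that starts the coroutine. Because this body never awaits anything, it runs to completion on the first send and reports its return value through StopIteration.

```python
async def hello_world():
    print("Hello, world!")
    return 42  # illustrative return value, not in the original example

coroutine = hello_world()  # nothing is printed yet
type_name = type(coroutine).__name__
print(type_name)

try:
    # Start executing the body; it finishes immediately since it never awaits.
    coroutine.send(None)
except StopIteration as stop:
    result = stop.value

print(result)
```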

Coroutine Objects

The object maintains a link to the actual code to be executed, as well as the state of execution by keeping the frame around. This is what makes it possible to "pause" and "resume" a coroutine when out-of-band mechanisms return.
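The pause and resume behaviour can be observed directly with a hand-written awaitable. In this sketch, Pause is a hypothetical class whose __await__ yields once, standing in for whatever out-of-band mechanism an event loop would wait on. inspect.getcoroutinestate reports the state at each step, and cr_frame shows that the frame stays alive while the coroutine is suspended.

```python
import inspect

class Pause:
    """Hypothetical awaitable that suspends the awaiting coroutine once."""
    def __await__(self):
        received = yield "paused"  # control returns to whoever called send()
        return received

async def hello_world():
    value = await Pause()
    return f"Hello, {value}!"

coroutine = hello_world()
states = [inspect.getcoroutinestate(coroutine)]

signal = coroutine.send(None)  # runs until the yield inside Pause.__await__
states.append(inspect.getcoroutinestate(coroutine))
frame_alive = coroutine.cr_frame is not None  # the frame holds the paused state

try:
    coroutine.send("world")  # resume; the sent value becomes the await result
except StopIteration as stop:
    result = stop.value
states.append(inspect.getcoroutinestate(coroutine))

print(signal, states, frame_alive, result)
```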

Listing the non-dunder attributes shows what's available on the object (output from the interpreter used here; newer versions may list more).

instance = async_hello_world()
print([attribute for attribute in dir(instance) if not attribute.startswith("__")])
['close', 'cr_await', 'cr_code', 'cr_frame', 'cr_origin', 'cr_running', 'send', 'throw']

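Comparing the attributes of two instances confirms that the code is shared, while each instance gets its own frame:

instance_a = async_hello_world()
instance_b = async_hello_world()
print(instance_a.cr_code is instance_b.cr_code)
print(instance_a.cr_frame is instance_b.cr_frame)
True
False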
1: Concurrency is the ability to interleave multiple computations; parallelism is the ability to run them on multiple cores. I would recommend Parallel and Concurrent Programming in Haskell for a much better description.

2: Just for contrast:

import ast
import pprint

pprint.pprint(ast.dump(ast.parse("""
def hello_world():
    print("Hello, world!")
""")))
("Module(body=[FunctionDef(name='hello_world', args=arguments(posonlyargs=[], "
 'args=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, '
 "defaults=[]), body=[Expr(value=Call(func=Name(id='print', ctx=Load()), "
 "args=[Constant(value='Hello, world!', kind=None)], keywords=[]))], "
 'decorator_list=[], returns=None, type_comment=None)], type_ignores=[])')

Normal functions are parsed, unsurprisingly, as FunctionDef.
