The unreasonable effectiveness of sys.settrace (and sys.setprofile)

CPython allows for incredible introspection: settrace and setprofile can observe and control just about everything from within the program. The official documentation doesn't do enough justice to these functions, and so I decided to take a stab at it.

sys.settrace sets a trace function that is triggered by the VM every time it enters or exits a function, processes a line of code, runs into an exception and even when it runs a single opcode. Each of these callbacks also give full access to the current stack frame and code.

sys.setprofile sets a trace function that's triggered every time the VM enters or exits both Python and C functions.

There are equivalents available in the threading module to allow setting tracers for multi-threaded programs as well: threading.settrace and threading.setprofile.

The frame object contains the current state of computation: all locals, globals, current location, etc.; the code object has the actual opcodes and function details independent of the current stack.

Frighteningly enough, it's also possible to modify the state of the program using the frame: so it's possible to override locals, globals, or simply skip around by changing the offset into the bytecode.

Cheat Sheet

sys.settrace(fn) and sys.setprofile(fn) to set fn as the callback.
sys.settrace(None) and sys.setprofile(None) to clear it.
The trace function has the function signature (frame, event, arg).
settrace events: call, return, line, exception, opcode
setprofile events: call, c_call, return, c_return, c_exception
The return value is used in all but return to specify the new local trace function for settrace, and is unused for setprofile.
arg depends on the event
- c_call, c_return, c_exception: C function object
- return: The return value, None for exceptions
- exception: (exception, value, traceback) tuple
frame fields
- f_back: previous frame
- f_code: code object
- f_locals: as title
- f_globals: as title
- f_builtins: built-in names
- f_lasti: instruction, index into bytecode string
- f_trace: writable, set the trace function for this frame
- f_trace_lines: writable, trigger trace for each line
- f_trace_opcodes: writable, trigger trace for each opcode
- f_lineno: writable, current line number
code fields (immutable)
- co_name: function name
- co_argcount: positional arguments
- co_posonlargcount: positional-only arguments
- co_kwonlyargcount: keyword-only arguments
- co_nlocals: # of local variables used by the function, including args
- co_varnames: tuple with names of local variables
- co_cellvars: tuple with local vars referred to by nested functions
- co_freevars: tuple with free variables
- co_code: bytecode as a string
- co_consts: tuple with literals used by bytecode
  - first item is docstring or None for functions
- co_names: tuple with names used by bytecode
- co_filename: origin filename
- co_firstlineno: first line number of function
- co_lnotab: maps bytecode offsets to line numbers
- co_flags: flags for interpreter
  - 0x04: uses *arguments
  - 0x08: uses **keywords
  - 0x20: generator
  - 0x2000: from future import division

A full example of a `settrace`

A short Fibonacci loop serves as a good example to show the values inside a trace.

def fib(n):
    i, f1, f2 = 1, 1, 1
    while i < n:
        f1, f2 = f2, f1 + f2
        i += 1
    return f1

show_trace is a simple tracing function to give a live example. Note that f_trace_opcodes needs to be explicitly set to True to trace opcodes in the frame, and import opcode gets the opcode number to name mapping.

import opcode  


def show_trace(frame, event, arg):
    frame.f_trace_opcodes = True
    code = frame.f_code
    offset = frame.f_lasti

    print(f"| {event:10} | {str(arg):>4} |", end=' ')
    print(f"{frame.f_lineno:>4} | {frame.f_lasti:>6} |", end=' ')
    print(f"{opcode.opname[code.co_code[offset]]:<18} | {str(frame.f_locals):<35} |")
    return show_trace

Finally, actually triggering the trace – and pretty printing the output. Here we see the bytecode actually being executed; in contrast, dis.dis shows the disassembled bytecode representing the function.

import sys

header = f"| {'event':10} | {'arg':>4} | line | offset | {'opcode':^18} | {'locals':^35} |"
print(header)
sys.settrace(show_trace)
fib(3)
sys.settrace(None)

| event      |  arg | line | offset |       opcode       |               locals                |
| call       | None |    1 |      0 | RESUME             | {'n': 3}                            |
| line       | None |    2 |      2 | LOAD_CONST         | {'n': 3}                            |
| line       | None |    3 |     14 | LOAD_FAST          | {'n': 3, 'i': 1, 'f1': 1, 'f2': 1}  |
| line       | None |    4 |     24 | LOAD_FAST          | {'n': 3, 'i': 1, 'f1': 1, 'f2': 1}  |
| line       | None |    5 |     38 | LOAD_FAST          | {'n': 3, 'i': 1, 'f1': 1, 'f2': 2}  |
| line       | None |    3 |     48 | LOAD_FAST          | {'n': 3, 'i': 2, 'f1': 1, 'f2': 2}  |
| line       | None |    4 |     24 | LOAD_FAST          | {'n': 3, 'i': 2, 'f1': 1, 'f2': 2}  |
| line       | None |    5 |     38 | LOAD_FAST          | {'n': 3, 'i': 2, 'f1': 2, 'f2': 3}  |
| line       | None |    3 |     48 | LOAD_FAST          | {'n': 3, 'i': 3, 'f1': 2, 'f2': 3}  |
| line       | None |    6 |     60 | LOAD_FAST          | {'n': 3, 'i': 3, 'f1': 2, 'f2': 3}  |
| return     |    2 |    6 |     62 | RETURN_VALUE       | {'n': 3, 'i': 3, 'f1': 2, 'f2': 3}  |

The underlying code of a function is directly accessible with the __code__ field, and is also the value stored in the frame's co_code field.

import inspect

code = fib.__code__
print(f"{'field':<20} | value")
for name, val in inspect.getmembers(code):
    if name.startswith("co_"):
        print(f"{name:<20} | {val}")

field                | value
co_argcount          | 1
co_cellvars          | ()
co_code              | b'\x97\x00d\x01\\\x03\x00\x00}\x01}\x02}\x03|\x01|\x00k\x02\x00\x00r\x12|\x03|\x02|\x03z\x00\x00\x00}\x03}\x02|\x01d\x02z\r\x00\x00}\x01|\x01|\x00k\x02\x00\x00r\x01\x8c\x12|\x02S\x00'
co_consts            | (None, (1, 1, 1), 1)
co_exceptiontable    | b''
co_filename          | /tmp/python-Rpavws
co_firstlineno       | 1
co_flags             | 3
co_freevars          | ()
co_kwonlyargcount    | 0
co_lines             | <built-in method co_lines of code object at 0x7f2a54808a30>
co_linetable         | b'\x80\x00\xd8\x10\x17\x81I\x80A\x80r\x882\xd8\n\x0b\x88a\x8a%\xd8\x11\x13\x90R\x98"\x91W\x88B\x88\x02\xd8\x08\t\x88Q\x89\x06\x88\x01\xf0\x05\x00\x0b\x0c\x88a\x8b%\xf0\x06\x00\x0c\x0e\x80I'
co_lnotab            | b'\x02\x01\x0c\x01\n\x01\x0e\x01\n\xfe\x0c\x03'
co_name              | fib
co_names             | ()
co_nlocals           | 4
co_positions         | <built-in method co_positions of code object at 0x7f2a54808a30>
co_posonlyargcount   | 0
co_qualname          | fib
co_stacksize         | 3
co_varnames          | ('n', 'i', 'f1', 'f2')

Potentially useful patterns

A completely programmable ad-hoc debugger

I've found settrace with a simple function to be extremely valuable in understanding code.

Instead of learning even more arcane commands and manually stepping through programs with pdb, I can directly write my "pause" and "inspect" logic into the trace function and keep iterating on the program – while also filtering the output to files I care about.

The tracer module also helps with directly tracing and dumping the code, but I generally find it too unwieldy compared to directly using settrace.

Building code execution visualizations

Working on mine is what inspired this note: it's not hard to imagine beautiful ways to display code execution. My approach is to use Catapult (chrome://tracing) for rendering the execution, but it could just as easily be on the source itself inside an IDE.

Figure 1: Visualizing the broken behavior described in nullprogram

Test coverage

settrace makes it trivial to observe the lines of code executed in any specific program run, and the underling C function is how coverage.py works.

They didn't stop to think if they should

Jump to arbitrary parts of a function

I vaguely remember seeing a Hacker News post about jump implemented with settrace and then directly writing to the frame offset.

Recreating that is left as an exercise to the reader.

Rewrite arguments on the fly

For bonus points, you can include this code and have it trigger randomly.

import sys

def square_arguments(frame, event, arg):
    if event != "call":
        return

    code = frame.f_code
    if code.co_name != "add":
        return

    for var in frame.f_locals:
        frame.f_locals[var] = frame.f_locals[var] ** 2

def add(a, b):
    return a + b

print(f"add(1, 2): {add(1, 2)}")

sys.settrace(square_arguments)
print(f"add(1, 2): {add(1, 2)}")

add(1, 2): 3
add(1, 2): 5

Not quite a deus-ex machina

Tracing doesn't capture coroutine creation, which would be amazing. With sys.set_coroutine_origin_tracking it's possible to get a stack trace to where a specific coroutine was created, but I'd love for that step to also pass through settrace / setprofile.

It's also not directly possible to set this from outside an arbitrary running program that I know of.

Finally, setprofile intercepts and shares exceptions thrown from C code, but doesn't intercept Python exceptions. I'm not quite sure why.

References

sys.settrace in the standard python documentation.
The Python Type Hierarchy, with definitions of frame and code objects.
inspect, with very readable documentation that makes my cheat sheet irrelevant.

Comments?

Drop me an email or reach out on Twitter.

History

2020-06-13: Published.
(TODO Explore PyEval_SetTrace.)