A few weeks ago I examined Python code objects using the dis module. In that post, I showed several examples of running code objects, created at runtime, with the exec statement. Here we'll explore exec, the complement to compile(): how to invoke it, and some of the quirks of using it.

Note that, like my post on code objects, the information here is only relevant to Python 2.x, specifically, Python 2.7. Unlike that post, at least one technique presented here will not work in Python 3.x -- in 3.x, exec is a function, and modifications to locals will not propagate to the calling scope. (Thanks to comex for pointing this out on Hacker News.)

A Refresher: compile()

Before we dive into exec, recall that compile() converts a string of Python source code into a code object, using the same machinery as the usual Python compilation process. It takes three arguments: the source code to compile, the filename of the source code (for which it is customary to use "<string>" for source code not originating in a file), and the "mode" of the compilation (for which "exec" is the most useful mode, at least for these sorts of experiments).
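As a quick illustration of those three arguments, here is a minimal sketch that compiles a two-line snippet and looks at a couple of attributes on the resulting code object. The variable names are purely illustrative, and the snippet avoids print so that it should behave the same on 2.7 and 3.x.

```python
# Compile a small snippet; '<string>' is the customary placeholder filename.
source = "x = 1\ny = x + 1\n"
code_object = compile(source, '<string>', 'exec')

# The code object remembers where it came from and which names it touches.
filename = code_object.co_filename   # the second argument passed to compile()
names = code_object.co_names         # names referenced by the bytecode
```

Nothing has been executed at this point: neither x nor y exists yet, only a description of the code that would create them.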

Make It Go

By itself, a code object is not terribly useful. Sure, it encapsulates all the information needed to execute some Python code, but it is just a noun, with no way to run itself.

>>> code_object = compile("""
... print "Hello, world"
... """, '<string>', 'exec')
>>> code_object()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'code' object is not callable

Of course, we can use the dis module to inspect the bytecode, and the code object's attributes to find out what variables it uses, the source file from which it was created, etc., but what we really want is to run it. Enter the exec statement:

>>> exec code_object
Hello, world

That's more like it!

Playing With exec

Imagine that we have a function which accepts a code object and an argument x:

>>> def exec_code_and_return_x(code_object, x):
...     exec code_object
...     return x
...

You might expect this function to return the value of x without modification -- after all, no code in the function actually manipulates x in any way. However (depending on the code object), you can get some very unexpected outcomes:

>>> my_code_object = compile("""
... x = x + 1
... """, '<string>', 'exec')
>>> exec_code_and_return_x(my_code_object, 1)
2

But you won't always get the unexpected results you expect:

>>> my_code_object = compile("""
... del x
... """, '<string>', 'exec')
>>> exec_code_and_return_x(my_code_object, 1)
1

So what's going on here?

On exec and Scoping

Without any additional instructions, exec uses the current global and local namespaces to execute your code. This means that (as we saw above) a code object being exec'd can modify the variables that are in scope at the point where it is exec'd (as well, obviously, as any globals the code object happens to manipulate).

You can also customize (to a certain extent) the scope given to the code object by supplying dictionaries after the in keyword, which act as "arguments" to exec:

>>> code_globals = {}
>>> code_locals = {}
>>> code_object = compile("""
... x = 1
... """, '<string>', 'exec')
>>> exec code_object in code_globals, code_locals
>>> print code_locals
{'x': 1}
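If only one dictionary is supplied, it is used for both the globals and the locals. The sketch below shows this using the parenthesized form exec(code, globals), which 2.7 treats as equivalent to "exec code in globals" and which also happens to work in 3.x. The name namespace is just an illustrative choice.

```python
code_object = compile("x = 1", '<string>', 'exec')

# Only one dictionary is given, so it serves as both globals and locals.
namespace = {}
exec(code_object, namespace)

x_value = namespace['x']                     # the assignment landed here
has_builtins = '__builtins__' in namespace   # added automatically by exec
```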

Note that builtins, as their name suggests, are always available, even when execing with empty scopes:

>>> code_globals = {}
>>> code_locals = {}
>>> code_object = compile("""
... print unicode("Hello, world")
... """, '<string>', 'exec')
>>> exec code_object in code_globals, code_locals
Hello, world
>>> code_globals.keys()
['__builtins__']

Solving Our Python Mystery

Returning to our mystery example from before:

>>> my_code_object = compile("""
... del x
... """, '<string>', 'exec')
>>> exec_code_and_return_x(my_code_object, 1)
1

Why can't the code object delete the local variable x? Specifically, why is the above code different from this:

>>> def delete_local_then_return_it():
...     x = 1
...     del x
...     return x
...
>>> delete_local_then_return_it()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in delete_local_then_return_it
UnboundLocalError: local variable 'x' referenced before assignment

What we're seeing here is the result of a Python performance optimization: rather than using a dictionary (or some other mapping object) for function locals, Python stores them in a fixed-size array, and bytecode instructions reference positions within this array. As a consequence of this optimization, dynamic code (such as that returned by compile(), or invocations of exec directly on a string) cannot modify the size of the locals array, since it is fixed at compile time.
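You can see the compile-time layout of that array by looking at a function's code object. The following is a small sketch with a made-up function, add_one; co_varnames lists the local names in slot order and co_nlocals gives the number of slots.

```python
def add_one(x):
    y = x + 1
    return y

code = add_one.__code__
local_names = code.co_varnames   # local names in slot order: x first, then y
slot_count = code.co_nlocals     # number of local slots, fixed at compile time
```

Both values are decided when the function is compiled, before any call happens, which is why code run later through exec has no way to add or remove slots.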

Moreover, the implementation of exec, that is, of the EXEC_STMT opcode, arranges for a new Python stack frame to be pushed, which means that it has its own array of locals (those used by the code object being exec'd):

>>> import inspect
>>> def get_stack_depth():
...     frame = inspect.currentframe()
...     # begin depth at -1 to account for the extra
...     # frame created when calling get_stack_depth
...     depth = -1
...     while frame:
...         depth += 1
...         frame = frame.f_back
...     return depth
... 
>>> def show_stack_depths():
...     print 'stack depth in function', get_stack_depth()
...     exec compile("""
... print "stack depth in exec", get_stack_depth()
... """, '<string>', 'exec')
... 
>>> show_stack_depths()
stack depth in function 2
stack depth in exec 3

Code objects can, of course, modify the local variables on their own stack frame:

>>> def show_code_object_locals():
...     code_globals = {}
...     code_locals = {'x': 1}
...     exec compile("""del x""", '<string>', 'exec') in code_globals, code_locals
...     return code_locals
... 
>>> show_code_object_locals()
{}

Warnings and Gotchas

Using exec to run code dynamically at runtime is fun, but it is not without its costs.

First, I would be remiss if I did not strongly urge you never to use exec; Python provides no way (and, in fact, there is no general way) to ensure that the code you're attempting to execute will do no harm to your system. Think of this like Python's version of cross-site scripting attacks. If you are ever executing untrusted code submitted by end-users of your system, you're doing it wrong.
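To make the point concrete, here is a deliberately harmless sketch showing that handing exec empty dictionaries is not a sandbox. Because builtins are always available, the exec'd code can still import modules. The names untrusted_source, sandbox_globals and sandbox_locals are hypothetical, and a real attacker would of course do something less polite than ask for the working directory.

```python
# Stand-in for code submitted by someone else; it only reads the cwd.
untrusted_source = "import os\ncwd = os.getcwd()\n"

sandbox_globals = {}
sandbox_locals = {}
exec(compile(untrusted_source, '<string>', 'exec'), sandbox_globals, sandbox_locals)

# The import succeeded even though both namespaces started out empty.
escaped = sandbox_locals['cwd']
```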

Second, executing code in this fashion is significantly slower than simply calling regular Python code, as the latter code path is heavily optimized in the interpreter, whereas calling dynamic code is not.

Finally, if you do need to use exec, you may find that it is more convenient to pass a string or file object as the first argument -- however, beware that this will incur the cost of compiling the source code each time the exec statement is encountered. If you are repeatedly execing the same piece of code, you will see significant performance improvement by first compile()ing and saving the resulting code object for re-use.
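A rough way to see the difference for yourself is to time both approaches. This is only a sketch: the snippet being executed and the function names are made up, and the absolute numbers will vary by machine and interpreter.

```python
import timeit

source = "total = sum(range(10))"
code_object = compile(source, '<string>', 'exec')

def run_from_string():
    namespace = {}
    exec(source, namespace)        # the source is compiled again on every call
    return namespace['total']

def run_precompiled():
    namespace = {}
    exec(code_object, namespace)   # reuses the code object compiled above
    return namespace['total']

string_seconds = timeit.timeit(run_from_string, number=10000)
precompiled_seconds = timeit.timeit(run_precompiled, number=10000)
```

Comparing string_seconds with precompiled_seconds should give a feel for how much of the cost is compilation rather than execution.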