Commit a879805

edit exception handling
1 parent cd7725f commit a879805

1 file changed

+101
-103
lines changed

InternalDocs/exception_handling.md

Lines changed: 101 additions & 103 deletions
@@ -1,94 +1,88 @@
1-
Description of exception handling in Python 3.11
2-
------------------------------------------------
1+
Description of exception handling
2+
---------------------------------
33

4-
Python 3.11 uses what is known as "zero-cost" exception handling.
5-
Prior to 3.11, exceptions were handled by a runtime stack of "blocks".
6-
7-
In zero-cost exception handling, the cost of supporting exceptions is minimized.
8-
In the common case (where no exception is raised) the cost is reduced
9-
to zero (or close to zero).
4+
Python uses a technique known as "zero-cost" exception handling, which
5+
minimizes the cost of supporting exceptions. In the common case (where
6+
no exception is raised) the cost is reduced to zero (or close to zero).
107
The cost of raising an exception is increased, but not by much.
118

129
The following code:
1310

14-
def f():
15-
try:
16-
g(0)
17-
except:
18-
return "fail"
19-
20-
compiles as follows in 3.10:
21-
22-
2 0 SETUP_FINALLY 7 (to 16)
23-
24-
3 2 LOAD_GLOBAL 0 (g)
25-
4 LOAD_CONST 1 (0)
26-
6 CALL_NO_KW 1
27-
8 POP_TOP
28-
10 POP_BLOCK
29-
12 LOAD_CONST 0 (None)
30-
14 RETURN_VALUE
31-
32-
4 >> 16 POP_TOP
33-
18 POP_TOP
34-
20 POP_TOP
35-
36-
5 22 POP_EXCEPT
37-
24 LOAD_CONST 3 ('fail')
38-
26 RETURN_VALUE
39-
40-
Note the explicit instructions to push and pop from the "block" stack:
41-
SETUP_FINALLY and POP_BLOCK.
42-
43-
In 3.11, the SETUP_FINALLY and POP_BLOCK are eliminated, replaced with
44-
a table to determine where to jump to when an exception is raised.
45-
46-
1 0 RESUME 0
47-
48-
2 2 NOP
49-
50-
3 4 LOAD_GLOBAL 1 (g + NULL)
51-
16 LOAD_CONST 1 (0)
52-
18 PRECALL 1
53-
22 CALL 1
54-
32 POP_TOP
55-
34 LOAD_CONST 0 (None)
56-
36 RETURN_VALUE
57-
>> 38 PUSH_EXC_INFO
58-
59-
4 40 POP_TOP
60-
61-
5 42 POP_EXCEPT
62-
44 LOAD_CONST 2 ('fail')
63-
46 RETURN_VALUE
64-
>> 48 COPY 3
65-
50 POP_EXCEPT
66-
52 RERAISE 1
67-
ExceptionTable:
68-
4 to 32 -> 38 [0]
69-
38 to 40 -> 48 [1] lasti
70-
71-
(Note this code is from 3.11, later versions may have slightly different bytecode.)
72-
73-
If an instruction raises an exception then its offset is used to find the target to jump to.
74-
For example, the CALL at offset 22, falls into the range 4 to 32.
75-
So, if g() raises an exception, then control jumps to offset 38.
76-
11+
<code>
12+
try:
13+
g(0)
14+
except:
15+
res = "fail"
16+
</code>
17+
18+
compiles into pseudo-code like the following:
19+
20+
<code>
21+
`RESUME` 0
22+
23+
1 `SETUP_FINALLY` 8 (to L1)
24+
25+
2 `LOAD_NAME` 0 (g)
26+
`PUSH_NULL`
27+
`LOAD_CONST` 0 (0)
28+
`CALL` 1
29+
`POP_TOP`
30+
`POP_BLOCK`
31+
32+
-- L1: `PUSH_EXC_INFO`
33+
34+
3 `POP_TOP`
35+
36+
4 `LOAD_CONST` 1 ('fail')
37+
`STORE_NAME` 1 (res)
38+
</code>
39+
40+
The `SETUP_FINALLY` instruction specifies that henceforth, exceptions
41+
are handled by the code at label L1. The `POP_BLOCK` instruction
42+
reverses the effect of the last `SETUP_FINALLY`, so the exception
43+
handler reverts to what it was before.
44+
45+
Note that the `SETUP_FINALLY` and `POP_BLOCK` instructions have no effect
46+
when no exceptions are raised. The idea of zero-cost exception handling
47+
is to replace these instructions by metadata which is stored alongside
48+
the code, and which is inspected only when an exception occurs.
49+
This metadata is the exception table, which is stored in the code
50+
object's `co_exceptiontable` field.
51+
52+
When the pseudo-instructions are translated into bytecode, the
53+
`SETUP_FINALLY` and `POP_BLOCK` instructions are removed, and the
54+
exception table is constructed, mapping each instruction to the
55+
exception handler that covers it, if any. Instructions which
56+
are not covered by any exception handler within the same code
57+
object's bytecode, do not appear in the exception table at all.
58+
59+
For the code object in our example above, the table has a single
60+
entry specifying that all instructions between the `SETUP_FINALLY`
61+
and the `POP_BLOCK` are covered by the exception handler located
62+
at label `L1`.
63+
64+
At runtime, when an exception occurs, the interpreter looks up
65+
the offset of the current instruction in the exception table. If
66+
it finds a handler, control flow transfers to it. Otherwise, the
67+
exception bubbles up to the caller, and the caller's frame is
68+
checked for a handler covering the `CALL` instruction. This
69+
repeats until a handler is found or the topmost frame is reached,
70+
in which case the program terminates. During unwinding, the traceback
71+
is constructed.
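
The lookup described above can be sketched in a few lines. This is a simplified model rather than the interpreter's code: the table is assumed to be already decoded into `(start, end, target, depth, lasti)` tuples, a frame is modelled as a `(table, current_offset)` pair, and the names `find_handler` and `unwind` are hypothetical.

```python
def find_handler(table, offset):
    """Return the handler target covering `offset`, or None if there is none."""
    for start, end, target, depth, lasti in table:
        if start <= offset < end:  # start is inclusive, end is exclusive
            return target
    return None


def unwind(frames):
    """Search frames from innermost (last) to outermost (first).

    Returns (frame_index, target) for the first frame with a handler,
    or None if the exception escapes the topmost frame.
    """
    for index in range(len(frames) - 1, -1, -1):
        table, offset = frames[index]
        target = find_handler(table, offset)
        if target is not None:
            return index, target
    return None


# The outer frame covers offsets 4..31; the inner frame has no handlers.
outer = ([(4, 32, 38, 0, False)], 22)
inner = ([], 10)
print(unwind([outer, inner]))
```

In this model the inner frame has no matching entry, so the search moves to the outer frame, whose current offset is that of the call which invoked the inner frame.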
7772

7873
Unwinding
7974
---------
8075

81-
When an exception is raised, the current instruction offset is used to find following:
82-
target to jump to, stack depth, and 'lasti', which determines whether the instruction
83-
offset of the raising instruction should be pushed.
84-
85-
This information is stored in the exception table, described below.
76+
Along with the location of an exception handler, each entry of the
77+
exception table also contains the stack depth of the `try` instruction
78+
and a boolean `lasti` value, which indicates whether the instruction
79+
offset of the raising instruction should be pushed to the stack.
8680

87-
If there is no relevant entry, the exception bubbles up to the caller.
81+
Handling an exception, once an exception table entry is found, consists
82+
of the following steps:
8883

89-
If there is an entry, then:
9084
1. pop values from the stack until it matches the stack depth for the handler.
91-
2. if 'lasti' is true, then push the offset that the exception was raised at.
85+
2. if `lasti` is true, then push the offset that the exception was raised at.
9286
3. push the exception to the stack.
9387
4. jump to the target offset and resume execution.
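
The four steps can be modelled with a plain Python list standing in for the value stack. This is only a sketch: `handle_exception` is a hypothetical name, and the entry is assumed to be already decoded into a 5-tuple.

```python
def handle_exception(stack, exc, raise_offset, entry):
    """Apply the four handling steps to `stack` and return the jump target."""
    start, end, target, depth, lasti = entry
    del stack[depth:]               # 1. pop values down to the handler's depth
    if lasti:
        stack.append(raise_offset)  # 2. push the offset of the raising instruction
    stack.append(exc)               # 3. push the exception itself
    return target                   # 4. the caller resumes execution at `target`


stack = ["a", "b", "c", "d"]
error = ValueError("boom")
target = handle_exception(stack, error, 22, (20, 28, 100, 2, True))
print(stack, target)
```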
9488

@@ -97,51 +91,51 @@ Format of the exception table
9791
-----------------------------
9892

9993
Conceptually, the exception table consists of a sequence of 5-tuples:
100-
1. start-offset (inclusive)
101-
2. end-offset (exclusive)
102-
3. target
103-
4. stack-depth
104-
5. push-lasti (boolean)
94+
1. `start-offset` (inclusive)
95+
2. `end-offset` (exclusive)
96+
3. `target`
97+
4. `stack-depth`
98+
5. `push-lasti` (boolean)
10599

106-
All offsets and lengths are in instructions, not bytes.
100+
All offsets and lengths are in code units, not bytes.
107101

108102
We want the format to be compact, but quickly searchable.
109103
For it to be compact, it needs to have variable sized entries so that we can store common (small) offsets compactly, but handle large offsets if needed.
110104
For it to be searchable quickly, we need to support binary search, giving us O(log n) performance in all cases.
111105
Binary search typically assumes fixed size entries, but that is not necessary, as long as we can identify the start of an entry.
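
One way to binary-search variable-sized entries is to bisect over byte positions and scan backwards to the nearest entry start. The sketch below is not the interpreter's lookup code. It assumes the byte layout described in this document (start bit `128`, extend bit `64`, six data bits per byte) and entries sorted by `start` without overlap. The names `read_varint`, `entry_start` and `find_entry` are hypothetical, and `EXAMPLE_TABLE` is hand-built sample data.

```python
def read_varint(table, pos):
    """Decode one value starting at `pos`; return (value, next_pos)."""
    b = table[pos]
    val = b & 63
    while b & 64:  # extend bit: another byte follows
        pos += 1
        b = table[pos]
        val = (val << 6) | (b & 63)
    return val, pos + 1


def entry_start(table, pos):
    """Scan backwards to the byte that has the start-of-entry bit set."""
    while not table[pos] & 128:
        pos -= 1
    return pos


def find_entry(table, offset):
    """Binary search over byte positions for the entry covering `offset`."""
    lo, hi = 0, len(table)  # `lo` always points at an entry start
    while lo < hi:
        mid = entry_start(table, (lo + hi) // 2)
        start, pos = read_varint(table, mid)
        size, pos = read_varint(table, pos)
        if offset < start:
            hi = mid
            continue
        target, pos = read_varint(table, pos)
        depth_lasti, pos = read_varint(table, pos)
        if offset >= start + size:
            lo = pos  # the next entry begins right after this one
        else:
            return start, start + size, target, depth_lasti >> 1, bool(depth_lasti & 1)
    return None


# Two entries: (20, 28, 100, 3, False) and (40, 44, 120, 1, True).
EXAMPLE_TABLE = bytes([148, 8, 65, 36, 6, 168, 4, 65, 56, 3])
print(find_entry(EXAMPLE_TABLE, 22))
print(find_entry(EXAMPLE_TABLE, 30))
```

Scanning back costs at most the length of one entry, so the number of probes stays logarithmic in the number of entries.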
112106

113107
It is worth noting that the size (end-start) is always smaller than the end, so we encode the entries as:
114-
start, size, target, depth, push-lasti
108+
`start, size, target, depth, push-lasti`
115109

116-
Also, sizes are limited to 2**30 as the code length cannot exceed 2**31 and each instruction takes 2 bytes.
110+
Also, sizes are limited to 2**30 as the code length cannot exceed 2**31 and each code unit takes 2 bytes.
117111
It also happens that depth is generally quite small.
118112

119113
So, we need to encode:
120-
start (up to 30 bits)
121-
size (up to 30 bits)
122-
target (up to 30 bits)
123-
depth (up to ~8 bits)
124-
lasti (1 bit)
114+
`start` (up to 30 bits)
115+
`size` (up to 30 bits)
116+
`target` (up to 30 bits)
117+
`depth` (up to ~8 bits)
118+
`lasti` (1 bit)
125119

126120
We need a marker for the start of the entry, so the first byte of each entry will have the most significant bit set.
127121
Since the most significant bit is reserved for marking the start of an entry, we have 7 bits per byte to encode offsets.
128122
Encoding uses a standard varint encoding, but with only 7 bits per byte instead of the usual 8: one extend bit and six data bits.
129-
The 8 bits of a bit are (msb left) SXdddddd where S is the start bit. X is the extend bit meaning that the next byte is required to extend the offset.
123+
The 8 bits of a byte are (msb left) SXdddddd where S is the start bit. X is the extend bit meaning that the next byte is required to extend the offset.
130124

131-
In addition, we will combine depth and lasti into a single value, ((depth<<1)+lasti), before encoding.
125+
In addition, we combine `depth` and `lasti` into a single value, `((depth<<1)+lasti)`, before encoding.
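
A minimal encoder following the rules above might look like this. It is a sketch, not the compiler's implementation. The names `encode_varint` and `encode_entry` are hypothetical, and offsets are assumed to be given in code units already.

```python
def encode_varint(value, start_of_entry=False):
    """Encode `value` in 6-bit groups, most significant group first."""
    groups = [value & 63]
    value >>= 6
    while value:
        groups.append(value & 63)
        value >>= 6
    groups.reverse()
    out = bytearray(groups)
    for i in range(len(out) - 1):
        out[i] |= 64   # extend bit: more bytes follow
    if start_of_entry:
        out[0] |= 128  # start bit marks the first byte of an entry
    return bytes(out)


def encode_entry(start, end, target, depth, lasti):
    """Encode one entry as start, size, target, (depth<<1)+lasti."""
    size = end - start
    depth_lasti = (depth << 1) + int(lasti)
    return (
        encode_varint(start, start_of_entry=True)
        + encode_varint(size)
        + encode_varint(target)
        + encode_varint(depth_lasti)
    )


print(list(encode_entry(20, 28, 100, 3, False)))
```

Only the first byte of an entry carries the start bit, which is what lets a reader resynchronise from an arbitrary byte position in the table.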
132126

133127
For example, the exception entry:
134-
start: 20
135-
end: 28
136-
target: 100
137-
depth: 3
138-
lasti: False
128+
`start`: 20
129+
`end`: 28
130+
`target`: 100
131+
`depth`: 3
132+
`lasti`: False
139133

140134
is encoded first by converting to the more compact four value form:
141-
start: 20
142-
size: 8
143-
target: 100
144-
depth<<1+lasti: 6
135+
`start`: 20
136+
`size`: 8
137+
`target`: 100
138+
`(depth<<1)+lasti`: 6
145139

146140
which is then encoded as:
147141
148 (MSB + 20 for start)
@@ -157,6 +151,7 @@ for a total of five bytes.
157151
Script to parse the exception table
158152
-----------------------------------
159153

154+
<code>
160155
def parse_varint(iterator):
161156
b = next(iterator)
162157
val = b & 63
@@ -165,7 +160,9 @@ def parse_varint(iterator):
165160
b = next(iterator)
166161
val |= b&63
167162
return val
163+
</code>
168164

165+
<code>
169166
def parse_exception_table(code):
170167
iterator = iter(code.co_exceptiontable)
171168
try:
@@ -180,3 +177,4 @@ def parse_exception_table(code):
180177
yield start, end, target, depth, lasti
181178
except StopIteration:
182179
return
180+
</code>
