@@ -11,7 +11,7 @@ The core idea is to track more information about source assignments in order
 and preserve enough information to be able to defer decisions about whether to
 use non-memory locations (register, constant) or memory locations until after
 middle end optimisations have run. This is in opposition to using
-`llvm.dbg.declare` and `llvm.dbg.value`, which is to make the decision for most
+`#dbg_declare` and `#dbg_value`, which is to make the decision for most
 variables early on, which can result in suboptimal variable locations that may
 be either incorrect or incomplete.
 
@@ -26,19 +26,18 @@ except for development and testing.
 **Enable in Clang**: `-Xclang -fexperimental-assignment-tracking`
 
 That causes Clang to get LLVM to run the pass `declare-to-assign`. The pass
-converts conventional debug intrinsics to assignment tracking metadata and sets
+converts conventional debug records to assignment tracking metadata and sets
 the module flag `debug-info-assignment-tracking` to the value `i1 true`. To
 check whether assignment tracking is enabled for a module call
 `isAssignmentTrackingEnabled(const Module &M)` (from `llvm/IR/DebugInfo.h`).
 
 ## Design and implementation
 
-### Assignment markers: `llvm.dbg.assign`
+### Assignment markers: `#dbg_assign`
 
-`llvm.dbg.value`, a conventional debug intrinsic, marks out a position in the
+`#dbg_value`, a conventional debug record, marks out a position in the
 IR where a variable takes a particular value. Similarly, Assignment Tracking
-marks out the position of assignments with a new intrinsic called
-`llvm.dbg.assign`.
+marks out the position of assignments with a record called `#dbg_assign`.
 
 In order to know where in IR it is appropriate to use a memory location for a
 variable, each assignment marker must in some way refer to the store, if any
@@ -48,43 +47,37 @@ important benefit of referring to the store is that we can then build a two-way
 mapping of stores<->markers that can be used to find markers that need to be
 updated when stores are modified.
 
-An `llvm.dbg.assign` marker that is not linked to any instruction signals that
+A `#dbg_assign` marker that is not linked to any instruction signals that
 the store that performed the assignment has been optimised out, and therefore
 the memory location will not be valid for at least some part of the program.
 
-Here's the `llvm.dbg.assign` signature. Each parameter is wrapped in
-`MetadataAsValue`, and `Value *` type parameters are first wrapped in
-`ValueAsMetadata`:
+Here's the `#dbg_assign` signature. `Value *` type parameters are first wrapped
+in `ValueAsMetadata`:
 
 ```
-void @llvm.dbg.assign(Value *Value,
-                      DIExpression *ValueExpression,
-                      DILocalVariable *Variable,
-                      DIAssignID *ID,
-                      Value *Address,
-                      DIExpression *AddressExpression)
+#dbg_assign(Value *Value,
+            DIExpression *ValueExpression,
+            DILocalVariable *Variable,
+            DIAssignID *ID,
+            Value *Address,
+            DIExpression *AddressExpression)
 ```
 
-The first three parameters look and behave like an `llvm.dbg.value`. `ID` is a
+The first three parameters look and behave like a `#dbg_value`. `ID` is a
 reference to a store (see next section). `Address` is the destination address
 of the store and it is modified by `AddressExpression`. An empty/undef/poison
 address means the address component has been killed (the memory address is no
 longer a valid location). LLVM currently encodes variable fragment information
 in `DIExpression`s, so as an implementation quirk the `FragmentInfo` for
 `Variable` is contained within `ValueExpression` only.
 
-The formal LLVM-IR signature is:
-```
-void @llvm.dbg.assign(metadata, metadata, metadata, metadata, metadata, metadata)
-```
-
 ### Instruction link: `DIAssignID`
 
 `DIAssignID` metadata is the mechanism that is currently used to encode the
 store<->marker link. The metadata node has no operands and all instances are
 `distinct`; equality is checked for by comparing addresses.
 
-`llvm.dbg.assign` intrinsics use a `DIAssignID` metadata node instance as an
+`#dbg_assign` records use a `DIAssignID` metadata node instance as an
 operand. This way it refers to any store-like instruction that has the same
 `DIAssignID` attachment. E.g. For this test.cpp,
 
@@ -102,9 +95,9 @@ we get:
 define dso_local noundef i32 @_Z3funi(i32 noundef %a) #0 !dbg !8 {
 entry:
   %a.addr = alloca i32, align 4, !DIAssignID !13
-  call void @llvm.dbg.assign(metadata i1 undef, metadata !14, metadata !DIExpression(), metadata !13, metadata i32* %a.addr, metadata !DIExpression()), !dbg !15
+  #dbg_assign(i1 undef, !14, !DIExpression(), !13, i32* %a.addr, !DIExpression(), !15)
   store i32 %a, i32* %a.addr, align 4, !DIAssignID !16
-  call void @llvm.dbg.assign(metadata i32 %a, metadata !14, metadata !DIExpression(), metadata !16, metadata i32* %a.addr, metadata !DIExpression()), !dbg !15
+  #dbg_assign(i32 %a, !14, !DIExpression(), !16, i32* %a.addr, !DIExpression(), !15)
   %0 = load i32, i32* %a.addr, align 4, !dbg !17
   ret i32 %0, !dbg !18
 }
@@ -116,16 +109,16 @@ entry:
 !16 = distinct !DIAssignID()
 ```
 
-The first `llvm.dbg.assign` refers to the `alloca` through `!DIAssignID !13`,
+The first `#dbg_assign` refers to the `alloca` through `!DIAssignID !13`,
 and the second refers to the `store` through `!DIAssignID !16`.
 
 ### Store-like instructions
 
-In the absence of a linked `llvm.dbg.assign`, a store to an address that is
+In the absence of a linked `#dbg_assign`, a store to an address that is
 known to be the backing storage for a variable is considered to represent an
 assignment to that variable.
 
-This gives us a safe fall-back in cases where `llvm.dbg.assign` intrinsics have
+This gives us a safe fall-back in cases where `#dbg_assign` records have
 been deleted, the `DIAssignID` attachment on the store has been dropped, or the
 optimiser has made a once-indirect store (not tracked with Assignment Tracking)
 direct.
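
The store<->marker link described in the two hunks above can be pictured with a small toy model. This is only a sketch in Python, not LLVM's API: the classes `AssignID`, `Inst` and `Marker` and the helper names are hypothetical stand-ins for `DIAssignID`, a store-like instruction and a `#dbg_assign` record. It assumes, as the text states, that IDs are compared by identity and that a marker whose ID is attached to no instruction no longer has a valid memory location.

```python
from dataclasses import dataclass
from typing import List, Optional


class AssignID:
    """Hypothetical stand-in for a `distinct` DIAssignID: no operands,
    and two IDs are equal only if they are the same object."""


@dataclass(eq=False)
class Inst:
    """Toy store-like instruction (a store or an alloca)."""
    opcode: str
    dest: str
    assign_id: Optional[AssignID] = None  # models the !DIAssignID attachment


@dataclass(eq=False)
class Marker:
    """Toy #dbg_assign record: variable, ID operand, address operand."""
    variable: str
    assign_id: AssignID
    address: Optional[str]  # None models a killed (undef/poison) address


def markers_for(inst: Inst, markers: List[Marker]) -> List[Marker]:
    """Instruction -> markers: every marker that uses the instruction's ID."""
    if inst.assign_id is None:
        return []
    return [m for m in markers if m.assign_id is inst.assign_id]


def insts_for(marker: Marker, insts: List[Inst]) -> List[Inst]:
    """Marker -> instructions: every instruction carrying the marker's ID."""
    return [i for i in insts if i.assign_id is marker.assign_id]


def memory_location_valid(marker: Marker, insts: List[Inst]) -> bool:
    """A marker with a killed address, or linked to no instruction,
    does not describe a valid memory location."""
    return marker.address is not None and bool(insts_for(marker, insts))
```

In this model, deleting the store from the instruction list (as dead store elimination would) needs no marker update: the marker simply becomes unlinked, and `memory_location_valid` starts returning `False` for it.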
@@ -139,61 +132,61 @@ direct.
 instruction. In this case, the assignment is considered to take place in
 multiple positions in the program.
 
-**Moving** a non-debug instruction: nothing new to do. Instructions linked to an
-`llvm.dbg.assign` have their initial IR position marked by the position of the
-`llvm.dbg.assign`.
+**Moving** a non-debug instruction: nothing new to do. Instructions linked to a
+`#dbg_assign` have their initial IR position marked by the position of the
+`#dbg_assign`.
 
 **Deleting** a non-debug instruction: nothing new to do. Simple DSE does not
 require any change; it’s safe to delete an instruction with a `DIAssignID`
-attachment. An `llvm.dbg.assign` that uses a `DIAssignID` that is not attached
+attachment. A `#dbg_assign` that uses a `DIAssignID` that is not attached
 to any instruction indicates that the memory location isn’t valid.
 
 **Merging** stores: In many cases no change is required as `DIAssignID`
 attachments are automatically merged if `combineMetadata` is called. One way or
 another, the `DIAssignID` attachments must be merged such that new store
-becomes linked to all the `llvm.dbg.assign` intrinsics that the merged stores
+becomes linked to all the `#dbg_assign` records that the merged stores
 were linked to. This can be achieved simply by calling a helper function
 `Instruction::mergeDIAssignID`.
 
-**Inlining** stores: As stores are inlined we generate `llvm.dbg.assign`
-intrinsics and `DIAssignID` attachments as if the stores represent source
+**Inlining** stores: As stores are inlined we generate `#dbg_assign`
+records and `DIAssignID` attachments as if the stores represent source
 assignments, just like in the frontend. This isn’t perfect, as stores may have
 been moved, modified or deleted before inlining, but it does at least keep the
 information about the variable correct within the non-inlined scope.
 
-**Splitting** stores: SROA and passes that split stores treat `llvm.dbg.assign`
-intrinsics similarly to `llvm.dbg.declare` intrinsics. Clone the
-`llvm.dbg.assign` intrinsics linked to the store, update the FragmentInfo in
-the `ValueExpression`, and give the split stores (and cloned intrinsics) new
+**Splitting** stores: SROA and passes that split stores treat `#dbg_assign`
+records similarly to `#dbg_declare` records. Clone the
+`#dbg_assign` records linked to the store, update the FragmentInfo in
+the `ValueExpression`, and give the split stores (and cloned records) new
 `DIAssignID` attachments each. In other words, treat the split stores as
 separate assignments. For partial DSE (e.g. shortening a memset), we do the
-same except that `llvm.dbg.assign` for the dead fragment gets an `Undef`
+same except that `#dbg_assign` for the dead fragment gets an `Undef`
 `Address`.
 
-**Promoting** allocas and store/loads: `llvm.dbg.assign` intrinsics implicitly
+**Promoting** allocas and store/loads: `#dbg_assign` records implicitly
 describe joined values in memory locations at CFG joins, but this is not
 necessarily the case after promoting (or partially promoting) the
 variable. Passes that promote variables are responsible for inserting
-`llvm.dbg.assign` intrinsics after the resultant PHIs generated during
-promotion. `mem2reg` already has to do this (with `llvm.dbg.value`) for
-`llvm.dbg.declare`s. Where a store has no linked intrinsic, the store is
+`#dbg_assign` records after the resultant PHIs generated during
+promotion. `mem2reg` already has to do this (with `#dbg_value`) for
+`#dbg_declare`s. Where a store has no linked record, the store is
 assumed to represent an assignment for variables stored at the destination
 address.
 
-#### Debug intrinsic updates
+#### Debug record updates
 
-**Moving** a debug intrinsic: avoid moving `llvm.dbg.assign` intrinsics where
+**Moving** a debug record: avoid moving `#dbg_assign` records where
 possible, as they represent a source-level assignment, whose position in the
 program should not be affected by optimization passes.
 
-**Deleting** a debug intrinsic: Nothing new to do. Just like for conventional
-debug intrinsics, unless it is unreachable, it’s almost always incorrect to
-delete a `llvm.dbg.assign` intrinsic.
+**Deleting** a debug record: Nothing new to do. Just like for conventional
+debug records, unless it is unreachable, it’s almost always incorrect to
+delete a `#dbg_assign` record.
 
-### Lowering `llvm.dbg.assign` to MIR
+### Lowering `llvm.dbg.assign` to MIR
+### Lowering `#dbg_assign` to MIR
 
-To begin with only SelectionDAG ISel will be supported. `llvm.dbg.assign`
-intrinsics are lowered to MIR `DBG_INSTR_REF` instructions. Before this happens
+To begin with only SelectionDAG ISel will be supported. `#dbg_assign`
+records are lowered to MIR `DBG_INSTR_REF` instructions. Before this happens
 we need to decide where it is appropriate to use memory locations and where we
 must use a non-memory location (or no location) for each variable. In order to
 make those decisions we run a standard fixed-point dataflow analysis that makes
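
The **Merging** and **Splitting** rules earlier in this hunk can be illustrated with a toy model. This is a sketch under stated assumptions, written in Python rather than against LLVM's API: `AssignID`, `Store`, `Marker`, `merge_stores` and `split_store` are hypothetical names, fragments are plain `(offset_bits, size_bits)` tuples, and the merge picks the first ID as the survivor purely for illustration. It only demonstrates the invariants the text asks for: a merged store ends up linked to every marker its sources were linked to, and each split store gets a fresh ID shared with its cloned, fragment-updated markers.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


class AssignID:
    """Hypothetical stand-in for a `distinct` DIAssignID (identity only)."""


@dataclass(eq=False)
class Store:
    assign_id: Optional[AssignID] = None


@dataclass(eq=False)
class Marker:
    variable: str
    assign_id: AssignID
    fragment: Optional[Tuple[int, int]] = None  # (offset_bits, size_bits)
    address: Optional[str] = "slot"  # None models an undef Address


def merge_stores(sources: List[Store], markers: List[Marker]) -> Store:
    """Return a new store linked to all markers the sources were linked to."""
    merged = Store()
    ids = [s.assign_id for s in sources if s.assign_id is not None]
    if ids:
        merged.assign_id = ids[0]  # arbitrary survivor (an assumption)
        for m in markers:
            if any(m.assign_id is old for old in ids):
                m.assign_id = merged.assign_id  # re-point onto the survivor
    return merged


def split_store(store: Store, fragments: List[Tuple[int, int]],
                markers: List[Marker]) -> Tuple[List[Store], List[Marker]]:
    """Split `store` into one store per fragment. Linked markers are cloned
    per fragment, given that fragment, and share the new store's fresh ID."""
    linked = [m for m in markers if m.assign_id is store.assign_id]
    others = [m for m in markers if m.assign_id is not store.assign_id]
    new_stores: List[Store] = []
    clones: List[Marker] = []
    for frag in fragments:
        new_id = AssignID()
        new_stores.append(Store(new_id))
        clones += [replace(m, assign_id=new_id, fragment=frag) for m in linked]
    return new_stores, others + clones
```

For the partial DSE case the text mentions, the same model would clone the marker for the dead fragment with `address=None` and create no store for it. The model ignores variables that already carry a fragment, which a real pass has to account for.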
@@ -214,9 +207,9 @@ to tackle:
   clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp for examples.
 
 * `trackAssignments` doesn't yet work for variables that have their
-  `llvm.dbg.declare` location modified by a `DIExpression`, e.g. when the
+  `#dbg_declare` location modified by a `DIExpression`, e.g. when the
   address of the variable is itself stored in an `alloca` with the
-  `llvm.dbg.declare` using `DIExpression(DW_OP_deref)`. See `indirectReturn` in
+  `#dbg_declare` using `DIExpression(DW_OP_deref)`. See `indirectReturn` in
   llvm/test/DebugInfo/Generic/assignment-tracking/track-assignments.ll and in
   clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp for an
   example.
@@ -225,13 +218,13 @@ to tackle:
   memory location is available without using a `DIAssignID`. This is because
   the storage address is not computed by an instruction (it's an argument
   value) and therefore we have nowhere to put the metadata attachment. To solve
-  this we probably need another marker intrinsic to denote "the variable's
-  stack home is X address" - similar to `llvm.dbg.declare` except that it needs
-  to compose with `llvm.dbg.assign` intrinsics such that the stack home address
-  is only selected as a location for the variable when the `llvm.dbg.assign`
-  intrinsics agree it should be.
+  this we probably need another marker record to denote "the variable's
+  stack home is X address" - similar to `#dbg_declare` except that it needs
+  to compose with `#dbg_assign` records such that the stack home address
+  is only selected as a location for the variable when the `#dbg_assign`
+  records agree it should be.
 
-* Given the above (a special "the stack home is X" intrinsic), and the fact
+* Given the above (a special "the stack home is X" record), and the fact
   that we can only track assignments with fixed offsets and sizes, I think we
   can probably get rid of the address and address-expression part, since it
   will always be computable with the info we have.