Commit 4c9e11a

Keenuts and s-perron authored
[SPIR-V] Add proposal for global & local variable address spaces (#111)
This commit adds a proposal on how to implement local and global variables in the SPIR-V backend, given that HLSL puts them in the same address space while SPIR-V requires them to be in two distinct ones.

Signed-off-by: Nathan Gauër <[email protected]>
Co-authored-by: Steven Perron <[email protected]>
1 parent 3b17c8d commit 4c9e11a

Lines changed: 206 additions & 0 deletions
@@ -0,0 +1,206 @@
# SPIRV - Variable address space

* Proposal: [NNNN](NNNN-SPIRV-variable-address-space.md)
* Author(s): [Nathan Gauër](https://github.com/Keenuts)
* Status: **Design In Progress**

## Introduction

From the HLSL spec:

> HLSL programs manipulates data stored in four distinct memory spaces: thread, threadgroup, device and constant.

These four spaces represent the user-facing semantics, and the one this
proposal focuses on is `thread`.
Following this model, a function-local variable and a static global variable
share the same address space.

On the logical SPIR-V side, variables are attached to a storage class. This
is a different name for the same concept: an address space.
- A pointer to one storage class is incompatible with a pointer to another.

This proposal uses "address space" when speaking in HLSL/LLVM-IR terms, and
"storage class" when speaking in SPIR-V terms.
We will not discuss C/HLSL-style storage classes (`static`, `volatile`, etc.).

Two SPIR-V storage classes are relevant here:
- Function
- Private

A variable declared with the `Function` storage class must be declared in
the first basic block of a function. It is normally used to represent
function-local variables.

A variable declared with the `Private` storage class is private to the current
invocation/thread, but belongs to the global scope.
This is the equivalent of a static global variable in HLSL.

Reconciling the SPIR-V and HLSL sides could be done in two ways:
- unify the storage classes in SPIR-V.
- separate the address spaces in HLSL.

Implementing constant buffers & other resources is done by creating new
address spaces, making explicit the constraints some allocations have.
Thus, separating the address spaces for globals & locals would
allow us to stay consistent with the rest of the language.

## HLSL patterns to look for

This section will explain why some HLSL patterns are hard to lower to SPIR-V.

Note: HLSL does not implement references yet, but we have to make sure our
design would allow us to implement them. For this reason, we'll assume HLSL
has references.

### Example 1:

```hlsl
static int a = 0;

void foo() {
  int b = 0;
}
```

`a` and `b` both share the same address space. But on the SPIR-V side, `a`
must be a `Private` variable, while `b` must be a `Function` variable.
This requires the lowering pass to know the context of a variable.

### Example 2:

```hlsl
static int a = 0;

void foo() {
  int& ref = a;
  int b = ref;
}
```

`a` is still `Private`, and `b` is still `Function`. But `ref` points to `a`.
In SPIR-V, a variable cannot store a pointer pointing to another storage class.
This means `ref` cannot be stored in a variable in the `Function` class.
If `a` is `Private`, `ref` can only be declared as `Private`.

### Example 3:

```hlsl
static int global = 0;

int& foo(int& input, int select) {
  return select ? input : global;
}

void main(int select) {
  int local;
  int& res1 = foo(local, select);
  int& res2 = foo(global, select);
}
```

`global` is still `Private`.
`local` is `Function`.
In SPIR-V, function declarations contain the return and parameter types,
including the storage classes.
This means that, depending on the call site and the value of `select`, the
return value and parameter would require either the `Function` or the
`Private` storage class. When this selection depends on a runtime condition,
this cannot be lowered to SPIR-V as-is.

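The conflict in Example 3 can be modelled outside of HLSL. The sketch below is written in C++ as a stand-in (HLSL has no references or typed pointers today) and is only an illustration: `Ptr`, `FunctionSC`, `PrivateSC`, `CanSelect` and `loadPrivate` are hypothetical names, not part of Clang or of any SPIR-V tooling. It assumes a pointer type that carries its storage class, as SPIR-V pointer types do, and checks at compile time that a `select ? a : b` between two storage classes has no valid result type.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical tags modelling two SPIR-V storage classes.
struct FunctionSC {};
struct PrivateSC {};

// A pointer whose type carries its storage class, as SPIR-V pointer types do.
template <typename T, typename StorageClass>
struct Ptr {
  T *raw;
};

// Detects whether `cond ? a : b` is well-formed for operands of type A and B.
template <typename A, typename B, typename = void>
struct CanSelect : std::false_type {};

template <typename A, typename B>
struct CanSelect<A, B,
                 std::void_t<decltype(true ? std::declval<A>()
                                           : std::declval<B>())>>
    : std::true_type {};

// Selecting between two pointers of the same storage class is fine.
static_assert(CanSelect<Ptr<int, PrivateSC>, Ptr<int, PrivateSC>>::value);

// Selecting between a Function and a Private pointer has no result type:
// this is the situation `foo` creates in Example 3.
static_assert(!CanSelect<Ptr<int, FunctionSC>, Ptr<int, PrivateSC>>::value);

// A signature has to commit to exactly one storage class per pointer.
inline int loadPrivate(Ptr<int, PrivateSC> p) { return *p.raw; }
```

In this model, `loadPrivate` cannot be called with a `Ptr<int, FunctionSC>`, which mirrors why a single SPIR-V function signature cannot serve both call sites of `foo`.
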
## Proposed solution: using 2 HLSL address spaces

Thread-local, global variables will be put in the `hlsl_private` address
space. Thread-local, function-local variables will be put in the `default`
address space.

## Implementing the solution

A new address space will be added to Clang: `hlsl_private`.
This address space will be mapped to `Spirv::Private` on the SPIR-V backend,
`PRIVATE_ADDRESS` for AMDGPU, and the address space `0` in DXIL.

Clang codegen will add the new address space annotations, separating
the `private` address space from the default one.

For the time being, the `private` address space will be marked as a subset of
the `default` address space, allowing overload resolution for class methods:
- an object in the `private` address space will be allowed to use a method
  declared with a `this` in the default address space.

Clang will emit an `addrspacecast` that we will have to handle, but that is a
known issue in address-space overload resolution, and not new to this proposal.

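The subset relation described above can be written down as a small predicate. This is a minimal C++ sketch of the rule as stated in the text, not Clang's implementation: `AddrSpace` and `isImplicitlyConvertible` are hypothetical names introduced for illustration only.

```cpp
// Hypothetical model of the two thread-local address spaces in this proposal.
enum class AddrSpace { Default, HlslPrivate };

// True when a pointer in `from` may be implicitly converted to one in `to`.
// Assumption taken from the text: `hlsl_private` is a subset of the default
// address space, and not the other way round.
constexpr bool isImplicitlyConvertible(AddrSpace from, AddrSpace to) {
  return from == to ||
         (from == AddrSpace::HlslPrivate && to == AddrSpace::Default);
}

// An object in `hlsl_private` can bind to a `this` in the default space...
static_assert(
    isImplicitlyConvertible(AddrSpace::HlslPrivate, AddrSpace::Default));
// ...but a default-space pointer does not silently become `hlsl_private`.
static_assert(
    !isImplicitlyConvertible(AddrSpace::Default, AddrSpace::HlslPrivate));
```

The one-directional conversion is what produces the `addrspacecast` mentioned above when a method is called on a static global object.
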
## Alternative design considered

### Force optimizations, and force inlining

One solution is to inline all function calls, even those marked as
`noinline`, then replicate every instruction that uses pointers so that each
pointer operand has a single known address space. If those transformations
were applied, we could avoid address-space mismatches for pointers, and all
we'd have left are direct loads/stores to global/local variables.
Functions returning incompatible references wouldn't exist, allowing us to
generate valid SPIR-V.

Note that those transformations can get arbitrarily complex. The number of
copies is exponential in the number of pointer operands.

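To make the growth concrete: if each pointer operand can independently be in one of two storage classes (`Function` or `Private`), covering every combination takes up to 2^n copies for n pointer operands. The helper below is a hypothetical C++ sketch of that count, under the assumption that operands are independent; `maxCopies` is not an existing API.

```cpp
#include <cstdint>

// Upper bound on the number of copies needed so that every pointer operand
// has a single known storage class. Assumes each operand independently takes
// one of `storageClasses` values (2 here: Function and Private).
constexpr std::uint64_t maxCopies(unsigned pointerOperands,
                                  std::uint64_t storageClasses = 2) {
  std::uint64_t copies = 1;
  for (unsigned i = 0; i < pointerOperands; ++i)
    copies *= storageClasses;
  return copies;
}

static_assert(maxCopies(0) == 1);     // no pointer operand: nothing to copy
static_assert(maxCopies(1) == 2);     // one operand: Function or Private
static_assert(maxCopies(3) == 8);     // three operands: 2 * 2 * 2
static_assert(maxCopies(10) == 1024); // grows quickly with operand count
```
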
Additionally:
- HLSL allows using `noinline`: we would have to ignore it.
- HLSL allows exporting functions to compile to a library: if we need to
  inline to generate functions, we cannot emit libraries exposing such
  functions.
- Runtime conditions causing address-space conflicts would require code
  duplication.
- It makes reading the generated assembly harder.

### Move all variables to the function scope

HLSL static globals have a known initialization value at compile-time,
meaning we could move the global variables to the first basic block of the
entry point, as local variables.
If the SPIR-V module has no global variables, all pointers use the `Function`
storage class.
This would require passing references to the other functions that use those
globals, or inlining those functions, but it would be possible.

But the blocker remains the same: building a library.
If an exported function references a global variable, we cannot change
the signature of the function.

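The rewrite described in this section can be sketched as a before/after pair. The code is C++ standing in for HLSL (references are not available in HLSL yet), and `counter`, `bump` and `entry` are hypothetical names chosen for the illustration.

```cpp
namespace before {
// Original shape: a static global read and written by a helper.
static int counter = 0;

inline void bump() { counter += 1; }

inline int entry() {
  bump();
  bump();
  return counter;
}
} // namespace before

namespace after {
// Rewritten shape: the global becomes a local of the entry point, and the
// helper's signature grows a reference parameter to reach it.
inline void bump(int &counter) { counter += 1; }

inline int entry() {
  int counter = 0; // known compile-time initial value, set in the entry block
  bump(counter);
  bump(counter);
  return counter;
}
} // namespace after
```

The change to the signature of `bump` is exactly what becomes impossible once `bump` is an exported library function.
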
### Move all variables to the global scope

By moving all local variables to the global scope, we now have a single
storage class, `Private`, and won't have conflict issues.
This also allows us to compile non-optimized code, and to keep functions if
required.

HLSL & SPIR-V disallow static recursion, meaning we know at compile-time
that each function requires one instance of each local variable.
This would also work with exported functions: static recursion is still not
allowed, so recursion across compile units is not an issue.

The main issues with this solution are:
- drivers may have a harder time figuring out variable lifetimes.
- SPIR-V has a hard limit of 65,535 global variables (vs 524,287 local variables).

I believe those 2 are not hard blockers, but something we need to be aware of.

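As an illustration of this transformation, the sketch below hoists the locals of one function to file scope. It is C++ standing in for HLSL, and `sumTo`, `sumTo_acc` and `sumTo_i` are hypothetical names. The rewrite is only valid because the function is never re-entered, which the absence of recursion guarantees.

```cpp
namespace before {
// Original shape: `acc` and `i` are function-local variables.
inline int sumTo(int n) {
  int acc = 0;
  for (int i = 1; i <= n; ++i)
    acc += i;
  return acc;
}
} // namespace before

namespace after {
// Rewritten shape: one global instance per former local. Without recursion,
// at most one activation of sumTo exists per thread, so one instance suffices.
static int sumTo_acc;
static int sumTo_i;

inline int sumTo(int n) {
  sumTo_acc = 0; // re-initialised on every call, like the original local
  for (sumTo_i = 1; sumTo_i <= n; ++sumTo_i)
    sumTo_acc += sumTo_i;
  return sumTo_acc;
}
} // namespace after
```
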
### Selectively move variables to the global scope

If a variable is only loaded from or stored to, and its address remains in the
function scope, there should be no pointer incompatibility.
This means we could implement solution 4 (moving all variables to the global
scope), but only targeting variables whose addresses cross their function
scope boundaries.

This would require additional IR analysis, as we would need to determine
which addresses are used in another scope, and recreate the matching
variables as globals.

The motivations for such a solution would be:
- if drivers have a hard time optimizing the global variables.
- if the global variable count limit becomes an issue.

Implementing this solution is more complex, and could be more error-prone,
so until we have a real need, I would recommend against it, and moving forward
with solution 4. If the need comes, moving from solution 4 to solution 5
(this one) would be possible, as it's just an optimization on top.
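
The selection step could look like the following C++ sketch. It assumes an earlier analysis has already computed, for each local variable, whether its address leaves the function. `LocalVariable`, `addressEscapes` and `selectForPromotion` are hypothetical names, and the analysis itself is not shown.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of one function-local variable.
struct LocalVariable {
  std::string name;
  bool addressEscapes; // assumed to be computed by a prior IR analysis
};

// Returns the names of the locals that would be recreated as globals.
// Locals that are only loaded from or stored to stay in the function scope.
inline std::vector<std::string>
selectForPromotion(const std::vector<LocalVariable> &locals) {
  std::vector<std::string> promoted;
  for (const LocalVariable &local : locals)
    if (local.addressEscapes)
      promoted.push_back(local.name);
  return promoted;
}
```
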