| 1 | +# SPIRV - Variable address space |
| 2 | + |
| 3 | + * Proposal: [NNNN](NNNN-SPIRV-variable-address-space.md) |
| 4 | + * Author(s): [Nathan Gauër](https://github.com/Keenuts) |
| 5 | + * Status: **Design In Progress** |
| 6 | + |
| 7 | +## Introduction |
| 8 | + |
| 9 | +From the HLSL spec: |
| 10 | + |
| 11 | +> HLSL programs manipulate data stored in four distinct memory spaces: thread, threadgroup, device and constant. |
| 12 | + |
| 13 | +These four spaces represent the user-facing semantics, and the one this |
| 14 | +proposal focuses on is `thread`. |
| 15 | +Following this model, a function local variable and a static global variable |
| 16 | +share the same address space. |
| 17 | + |
| 18 | +On the logical SPIR-V side, variables are attached to a storage class. This |
| 19 | +is a different name for the same concept: an address space. |
| 20 | +- A pointer to one storage class is incompatible with a pointer to another. |
| 21 | + |
| 22 | +This proposal will use address space when speaking in HLSL/LLVM-IR terms, and |
| 23 | +storage class when speaking in SPIR-V terms. |
| 24 | +We will not mention C/HLSL style storage classes (static, volatile, etc). |
| 25 | + |
| 26 | +Two SPIR-V storage classes are relevant here: |
| 27 | + - Function |
| 28 | + - Private |
| 29 | +A variable declared with the `Function` storage class must be declared in |
| 30 | +the first basic block of a function. It is normally used to represent |
| 31 | +function-local variables. |
| 32 | + |
| 33 | +A variable declared with the `Private` storage class is private to the current |
| 34 | +invocation/thread, but belongs to the global scope. |
| 35 | +This would be the equivalent of a static global variable in HLSL. |
| 36 | + |
| 37 | +Reconciling the SPIR-V and HLSL sides could be done in two ways: |
| 38 | + - unify the storage classes in SPIR-V. |
| 39 | + - separate the address spaces in HLSL. |
| 40 | + |
| 41 | +Implementing constant buffers & other resources is done by creating new |
| 42 | +address spaces, making explicit the constraints some allocations have. |
| 43 | +Thus, it seems separating the address spaces for globals & locals would |
| 44 | +allow us to stay consistent with the rest of the language. |
| 45 | + |
| 46 | +## HLSL patterns to look for |
| 47 | + |
| 48 | +This section will explain why some HLSL patterns are hard to lower to SPIR-V. |
| 49 | + |
| 50 | +Note: HLSL does not implement references yet, but we have to make sure our |
| 51 | +design would allow us to implement them. For this reason, we'll assume HLSL |
| 52 | +has references. |
| 53 | + |
| 54 | +### Example 1: |
| 55 | + |
| 56 | +```hlsl |
| 57 | +static int a = 0; |
| 58 | + |
| 59 | +void foo() { |
| 60 | + int b = 0; |
| 61 | +} |
| 62 | +``` |
| 63 | + |
| 64 | +`a` and `b` both share the same address space. But on the SPIR-V side, `a` |
| 65 | +must be a `Private` variable, while `b` must be a `Function` variable. |
| 66 | +This requires the lowering pass to know the context of a variable. |
| 67 | + |
| 68 | +### Example 2: |
| 69 | + |
| 70 | +```hlsl |
| 71 | +static int a = 0; |
| 72 | + |
| 73 | +void foo() { |
| 74 | + int& ref = a; |
| 75 | + int b = ref; |
| 76 | +} |
| 77 | +``` |
| 78 | + |
| 79 | +`a` is still `Private`, `b` still `Function`. But `ref` points to `a`. |
| 80 | +In SPIR-V, a variable cannot store a pointer pointing to another storage class. |
| 81 | +This means `ref` cannot be stored in a variable in the `Function` class. |
| 82 | +If `a` is `Private`, `ref` could only be declared as `Private`. |
| 83 | + |
| 84 | +### Example 3: |
| 85 | + |
| 86 | +```hlsl |
| 87 | +static int global = 0; |
| 88 | + |
| 89 | +int& foo(int& input, int select) { |
| 90 | + return select ? input : global; |
| 91 | +} |
| 92 | + |
| 93 | +void main(int select) { |
| 94 | + int local; |
| 95 | + int& res1 = foo(local, select); |
| 96 | + int& res2 = foo(global, select); |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +`global` is still `Private`. |
| 101 | +`local` is `Function`. |
| 102 | +In SPIR-V, function declarations contain the return and parameter types, |
| 103 | +including the storage classes. |
| 104 | +This means, depending on the call-site, and the value of `select`, the |
| 105 | +return value and parameter would require either the `Function` or the |
| 106 | +`Private` storage class. When this selection depends on a runtime condition, |
| 107 | +this cannot be lowered to SPIR-V as-is. |
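
The constraint can be modelled in plain C++ by making the storage class part of the pointer type. This is only a sketch: `StorageClass`, `Ptr`, `foo_private`, and `callable_with` are hypothetical names introduced for illustration, not part of Clang, LLVM, or SPIR-V tooling.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical model: the storage class is part of the pointer type,
// as it is for SPIR-V pointer types.
enum class StorageClass { Function, Private };

template <StorageClass SC, typename T>
struct Ptr {
  T* raw;
};

using FunctionIntPtr = Ptr<StorageClass::Function, int>;
using PrivateIntPtr = Ptr<StorageClass::Private, int>;

// Pointers to different storage classes are unrelated types.
static_assert(!std::is_same_v<FunctionIntPtr, PrivateIntPtr>);
static_assert(!std::is_convertible_v<FunctionIntPtr, PrivateIntPtr>);

int g_global = 0;  // stands in for `static int global` (Private)

// The signature has to commit to one storage class. This variant matches
// the `foo(global, select)` call site only.
PrivateIntPtr foo_private(PrivateIntPtr input, int select) {
  return select ? input : PrivateIntPtr{&g_global};
}

// True when a call with the given argument pointer type type-checks.
template <typename Arg>
constexpr bool callable_with =
    std::is_invocable_v<decltype(&foo_private), Arg, int>;

static_assert(callable_with<PrivateIntPtr>);
// The `foo(local, select)` call site would pass a Function pointer,
// which this signature rejects.
static_assert(!callable_with<FunctionIntPtr>);
```

In this model, supporting both call sites needs a second copy of the function for `Function` pointers, and that copy could not return a pointer to `g_global`, which mirrors the runtime-selection problem described above.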
| 108 | + |
| 109 | +## Proposed solution: using 2 HLSL address spaces |
| 110 | + |
| 111 | +Thread-local, global variables will be put in the `hlsl_private` address |
| 112 | +space. Thread-local function-local variables will be put in the `default` |
| 113 | +address space. |
| 114 | + |
| 115 | +## Implementing the solution |
| 116 | + |
| 117 | +A new address space will be added to Clang: `hlsl_private`. |
| 118 | +This address space will be mapped to `Spirv::Private` on the SPIR-V backend, |
| 119 | +`PRIVATE_ADDRESS` for AMDGPU, and the address space `0` in DXIL. |
| 120 | + |
| 121 | +Clang codegen will add the new address space annotations, separating |
| 122 | +the `private` from the default. |
| 123 | + |
| 124 | +For the time being, the `private` address space will be marked as a subset of |
| 125 | +the `default` address space, allowing overload resolution for class methods: |
| 126 | +- an object in the `private` address space will be allowed to use a method |
| 127 | + declared with a `this` in the default address space. |
| 128 | + |
| 129 | +Clang will emit an `addrspacecast` we will have to handle, but that's a known |
| 130 | +issue in address-space overload resolution, and not new to this proposal. |
| 131 | + |
| 132 | +## Alternative designs considered |
| 133 | + |
| 134 | +### Force optimizations, and force inlining |
| 135 | + |
| 136 | +One solution is to inline all function calls, even those marked as |
| 137 | +`noinline`, then replicate every instruction that uses pointers so that each |
| 138 | +pointer operand has a single known address space. If those transformations |
| 139 | +were applied, we could avoid address-space mismatches for pointers, and all |
| 140 | +we'd have are direct loads/stores to global/local variables. |
| 141 | +Functions returning incompatible references wouldn't exist, allowing us to |
| 142 | +generate valid SPIR-V. |
| 143 | + |
| 144 | +Note that those transformations can get arbitrarily complex. The number of |
| 145 | +copies is exponential in the number of pointer operands. |
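
As a rough illustration of that growth, the helper below counts the copies needed when every pointer operand can independently land in any storage class. `specialized_copies` is a hypothetical name, and the default of two storage classes (`Function` and `Private`) is an assumption taken from the examples in this document.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of specialized copies of an instruction or
// function when each pointer operand may independently belong to any of
// `storage_classes` storage classes.
constexpr std::uint64_t specialized_copies(unsigned pointer_operands,
                                           unsigned storage_classes = 2) {
  std::uint64_t copies = 1;
  for (unsigned i = 0; i < pointer_operands; ++i) {
    copies *= storage_classes;  // one more independent choice per operand
  }
  return copies;
}

static_assert(specialized_copies(0) == 1);
static_assert(specialized_copies(3) == 8);
```

Under these assumptions, ten pointer operands already require 1024 copies.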
| 146 | + |
| 147 | +Additionally: |
| 148 | +- HLSL allows using `noinline`: we would have to ignore it. |
| 149 | +- HLSL allows exporting functions to compile to a library: if we need to |
| 150 | + inline to generate functions, we cannot emit libraries exposing such |
| 151 | + functions. |
| 152 | +- Runtime conditions causing address-space conflict would require code |
| 153 | + duplication. |
| 154 | +- It makes reading the generated assembly harder. |
| 155 | + |
| 156 | +### Move all variables to the function scope |
| 157 | + |
| 158 | +HLSL static globals have a known initialization value at compile-time, |
| 159 | +meaning we could move the global variables to the entry point's first basic |
| 160 | +block, as local variables. |
| 161 | +If the SPIR-V module has no global variables, all pointers use `Function`. |
| 162 | +This would require passing references to other functions referencing those |
| 163 | +globals, or inlining them, but it would be possible. |
| 164 | + |
| 165 | +But the blocker remains the same: building to a library function. |
| 166 | +If an exported function references a global variable, we cannot change |
| 167 | +the signature of the function. |
| 168 | + |
| 169 | +### Move all variables to the global scope |
| 170 | + |
| 171 | +By moving all local variables to the global scope, we now have a single |
| 172 | +storage class `Private`, and won't have conflict issues. |
| 173 | +This also allows us to compile non-optimized code, and to keep functions if |
| 174 | +required. |
| 175 | + |
| 176 | +HLSL & SPIR-V disallow static recursion, meaning we know at compile-time |
| 177 | +that each function requires one instance of each local variable. |
| 178 | +This would also work with exported functions: static recursion is still not |
| 179 | +allowed, so cross compile-units recursion is not an issue. |
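
Applied by hand to Example 3, the transformation could look like the C++ sketch below. The names `g_global`, `g_main_local`, and `run` are hypothetical, and plain C++ references stand in for pointers that would all use a single storage class.

```cpp
#include <cassert>

// Was `static int global = 0;`: already at global scope.
static int g_global = 0;
// Was `int local;` inside main(): hoisted to global scope. A single
// instance is enough because recursion is disallowed.
static int g_main_local = 0;

// Every reference now targets a global, so one signature covers both
// call sites.
int& foo(int& input, int select) {
  return select ? input : g_global;
}

// Stands in for the original main(). State is reset first so repeated
// calls are independent in this sketch.
int run(int select) {
  g_global = 0;
  g_main_local = 0;
  int& res1 = foo(g_main_local, select);
  int& res2 = foo(g_global, select);
  res1 = 1;
  res2 = 2;
  // Encode both variables in one value for easy inspection.
  return g_main_local * 10 + g_global;
}
```

With `select != 0`, `res1` aliases the hoisted local and `res2` the original global. With `select == 0`, both alias the global.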
| 180 | + |
| 181 | +The main issues with this solution are: |
| 182 | +- drivers may have a harder time figuring out variable lifetimes. |
| 183 | +- SPIR-V has a hard limit of 65,535 global variables (vs 524,287 local variables). |
| 184 | + |
| 185 | +I believe those 2 are not hard blockers, but something we need to be aware of. |
| 186 | + |
| 187 | +### Selectively move variables to the global scope |
| 188 | + |
| 189 | +If a variable is only loaded from or stored to, and its address remains in |
| 190 | +the function scope, there should be no pointer incompatibility. |
| 191 | +This means we could potentially move variables to the global scope as |
| 192 | +described above, but only for variables whose addresses cross their |
| 193 | +function scope boundaries. |
| 194 | + |
| 195 | +This would require additional IR analysis, as we would need to determine |
| 196 | +which address is used in another scope to recreate a global variable. |
| 197 | + |
| 198 | +The motivations we could have for such a solution are: |
| 199 | +- if drivers have a hard time optimizing the global variables. |
| 200 | +- if the global variable count limit becomes an issue. |
| 201 | + |
| 202 | +Implementing this solution is more complex, and could be more error prone, |
| 203 | +so until we have a real need, I would recommend against it, and moving |
| 204 | +forward with moving all variables to the global scope. If the need comes, |
| 205 | +switching to the selective approach would be possible, as it's just an |
| 205 | +optimization on top. |
| 206 | + |