@@ -183,53 +183,54 @@ def XeGPU_LayoutAttr : XeGPUAttr<"Layout", "layout"> {
1-dimensional layout. The first dimension in the order list is the fastest-changing dimension. If it
is not present, the default value is [1, 0].

- ### Examples:
- 1. Subgroup level layout:
- ```mlir
- #xegpu.layout<lane_layout = [2, 8], lane_data = [1, 1]>
- ```
- In this example, there are 16 work-items per subgroup, and is organized as
- [[0, 1, 2, .., 7],[8, 9, .., 15]]. The distribution unit is 1x1.
-
- 2. Subgroup level layout with order:
- ```mlir
- #xegpu.layout<lane_layout = [2, 8], lane_data = [1, 1], order = [0, 1]>
- ```
- In this example, there are 16 work-items per subgroup, and is organized as
- [[0, 2, 4, ..., 14], [1, 3, 5, ..., 15]]. The distribution unit is 1x1.
-
- 3. Subgroup level layout with inst_data
- ```mlir
- #xegpu.layout<inst_data = [8, 16], lane_layout = [2, 8], lane_data = [2, 2]>
- ```
- In this example, the original problem size is partitioned into smaller subproblems of dimensions [8, 16],
- which are then distributed among 16 work-items arranged as [[0, 1, 2, ..., 7], [8, 9, ..., 15]]. Each
- work-item is assigned four 2x2 blocks in a round-robin manner.
-
- 4. Workgroup level layout:
- ```mlir
- #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], lane_layout = [2, 8], lane_data = [1, 1]>
- ```
- In this example, the layout represents a workgroup distribution. A workgroup consists of 8 subgroups
- arranged as [[0, 1, 2, 3], [4, 5, 6, 7]]. Each subgroup accesses a 16x16 block per instruction, which
- is further distributed to 16 work items which is organized as [[0, 1, 2, .., 7],[8, 9, .., 15]].
-
- 5. Workgroup level layout with order:
- ```mlir
- #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], lane_layout = [2, 8], lane_data = [1, 1], order = [0, 1]>
- ```
- In this example, the layout represents a workgroup distribution. A workgroup consists of 8 subgroups
- arranged as [[0, 2, 4, 6], [1, 3, 5, 7]]. Each subgroup accesses a 16x16 block per instruction, which
- is further distributed to 16 work items which is organized as [[0, 2, 4, ..., 14], [1, 3, 5, ..., 15]].
-
- 6. Workgroup level layout with inst_data:
- ```mlir
- #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], inst_data = [8, 16], lane_layout = [2, 8], lane_data = [1, 1]>
- ```
- This example is similar to the previous ones, but the `inst_data` parameter divides `sg_data` into two instructions,
- each processing an 8x16 block. These blocks are further distributed across 16 work-items with a distribution unit of 1x1.
- Unlike the 2x2 distribution unit in example 3, which results in accessing contiguous 2x2 blocks, the 1x1 distribution
- unit may result in non-contiguous access.
+ Examples:
+
+ 1. Subgroup level layout:
+ ```mlir
+ #xegpu.layout<lane_layout = [2, 8], lane_data = [1, 1]>
+ ```
+ In this example, the subgroup has 16 work-items, organized as
+ [[0, 1, 2, ..., 7], [8, 9, ..., 15]]. The distribution unit is 1x1.
+
+ 2. Subgroup level layout with order:
+ ```mlir
+ #xegpu.layout<lane_layout = [2, 8], lane_data = [1, 1], order = [0, 1]>
+ ```
+ In this example, the subgroup has 16 work-items, organized as
+ [[0, 2, 4, ..., 14], [1, 3, 5, ..., 15]]. The distribution unit is 1x1.
+
+ 3. Subgroup level layout with inst_data:
+ ```mlir
+ #xegpu.layout<inst_data = [8, 16], lane_layout = [2, 8], lane_data = [2, 2]>
+ ```
+ In this example, the original problem size is partitioned into smaller subproblems of dimensions [8, 16],
+ which are then distributed among 16 work-items arranged as [[0, 1, 2, ..., 7], [8, 9, ..., 15]]. Each
+ work-item is assigned four 2x2 blocks in a round-robin manner.
+
+ 4. Workgroup level layout:
+ ```mlir
+ #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], lane_layout = [2, 8], lane_data = [1, 1]>
+ ```
+ In this example, the layout represents a workgroup distribution. A workgroup consists of 8 subgroups
+ arranged as [[0, 1, 2, 3], [4, 5, 6, 7]]. Each subgroup accesses a 16x16 block per instruction, which
+ is further distributed to 16 work-items organized as [[0, 1, 2, ..., 7], [8, 9, ..., 15]].
+
+ 5. Workgroup level layout with order:
+ ```mlir
+ #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], lane_layout = [2, 8], lane_data = [1, 1], order = [0, 1]>
+ ```
+ In this example, the layout represents a workgroup distribution. A workgroup consists of 8 subgroups
+ arranged as [[0, 2, 4, 6], [1, 3, 5, 7]]. Each subgroup accesses a 16x16 block per instruction, which
+ is further distributed to 16 work-items organized as [[0, 2, 4, ..., 14], [1, 3, 5, ..., 15]].
+
+ 6. Workgroup level layout with inst_data:
+ ```mlir
+ #xegpu.layout<sg_layout = [2, 4], sg_data = [16, 16], inst_data = [8, 16], lane_layout = [2, 8], lane_data = [1, 1]>
+ ```
+ This example is similar to example 4, but the `inst_data` parameter splits each 16x16 `sg_data` block into two instructions,
+ each processing an 8x16 block. These blocks are further distributed across 16 work-items with a distribution unit of 1x1.
+ Unlike the 2x2 distribution unit in example 3, which results in accessing contiguous 2x2 blocks, the 1x1 distribution
+ unit may result in non-contiguous access.
}];

let parameters = (ins
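For readers checking the ID arrangements quoted in the examples, here is a small Python sketch (not part of the patch) of how they appear to follow from `lane_layout` / `sg_layout` and `order`. The helper names `arrange_ids` and `instructions_per_subgroup` are hypothetical, and the sketch assumes a 2-D layout in which `order[0]` names the fastest-changing dimension and the default order is `[1, 0]`, as the description states. It models only the numbering, not the actual XeGPU lowering.

```python
from math import prod


def arrange_ids(layout, order=None):
    """Return a 2-D grid whose entry [r][c] is the linear ID at that coordinate.

    Assumption: order[0] is the fastest-changing dimension; the default
    order for a 2-D layout is [1, 0].
    """
    if len(layout) != 2:
        raise ValueError("this sketch only handles 2-D layouts")
    if order is None:
        order = [1, 0]
    # The fastest-changing dimension gets stride 1; the next one gets the
    # extent of the previous dimension as its stride.
    strides = [0, 0]
    stride = 1
    for dim in order:
        strides[dim] = stride
        stride *= layout[dim]
    rows, cols = layout
    return [[r * strides[0] + c * strides[1] for c in range(cols)]
            for r in range(rows)]


def instructions_per_subgroup(sg_data, inst_data):
    """Number of inst_data tiles needed to cover one sg_data block."""
    if any(s % i for s, i in zip(sg_data, inst_data)):
        raise ValueError("inst_data must evenly divide sg_data")
    return prod(s // i for s, i in zip(sg_data, inst_data))


if __name__ == "__main__":
    print(arrange_ids([2, 8]))                 # examples 1 and 4 (lanes)
    print(arrange_ids([2, 8], order=[0, 1]))   # examples 2 and 5 (lanes)
    print(arrange_ids([2, 4], order=[0, 1]))   # example 5 (subgroups)
    print(instructions_per_subgroup([16, 16], [8, 16]))  # example 6
```

With `order = [0, 1]` the row index becomes the fastest-changing one, which is why consecutive IDs run down each column instead of along each row.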