Commit 3f9f64a

arsenm authored and pravinjagtap committed
AMDGPU: Add a baseline, non-comprehensive test for scaled mfma hazards (llvm#117055)
Add some tests which will demonstrate that we treat the number of cycles differently depending on whether the first matrix uses an f8 format.
1 parent 787b1f8 commit 3f9f64a

2 files changed, +275 -1 lines changed

llvm/test/CodeGen/AMDGPU/mai-hazards-gfx940.mir

Lines changed: 1 addition & 1 deletion
@@ -2199,7 +2199,7 @@ name: xdl_mfma_4pass_write_vgpr_sgemm_mfma_read_overlap_srcb
 body: |
   bb.0:
     $vgpr0_vgpr1_vgpr2_vgpr3 = V_MFMA_F32_16X16X16F16_vgprcd_e64 $vgpr4_vgpr5, $vgpr6_vgpr7, $vgpr0_vgpr1_vgpr2_vgpr3, 1, 2, 3, implicit $mode, implicit $exec
-    $vgpr0_vgpr1_vgpr2_vgpr3 = V_MFMA_F32_4X4X1F32_vgprcd_e64 $vgpr8, $vgpr1, $vgpr6_vgpr7_vgpr8_vgpr9, 0, 0, 0, implicit $mode, implicit $exec
+    $vgpr0_vgpr1_vgpr2_vgpr3 = V_MFMA_F32_4X4X1F32_vgprcd_e64 $vgpr8, $vgpr1, $vgpr2_vgpr3_vgpr4_vgpr5, 0, 0, 0, implicit $mode, implicit $exec

 ...
