[AMDGPU][MC] Add dpp for V_PK_FMAC_F16 for GFX10 #79598
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefixes=GCN | ||
|
||
# GCN-LABEL: name: v_pk_fmac_f16 | ||
# GCN: %4:vgpr_32 = IMPLICIT_DEF | ||
# GCN: %3:vgpr_32 = V_PK_FMAC_F16_dpp %4, 0, %1, 0, %1, 1, 15, 15, 1, implicit $mode, implicit $exec | ||
name: v_pk_fmac_f16 | ||
tracksRegLiveness: true | ||
body: | | ||
bb.0: | ||
liveins: $vgpr0, $vgpr1 | ||
|
||
%0:vgpr_32 = COPY $vgpr0 | ||
%1:vgpr_32 = COPY $vgpr1 | ||
|
||
%2:vgpr_32 = V_MOV_B32_dpp %0, %1, 1, 15, 15, 1, implicit $exec | ||
%3:vgpr_32 = V_PK_FMAC_F16_e32 %2, %1, implicit $mode, implicit $exec | ||
... |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -224,3 +224,6 @@ v_sub_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0] | |
|
||
v_subrev_f16_dpp v5, v1, v255 dpp8:[7,6,5,4,3,2,1,0] | ||
// GFX12: :[[@LINE-1]]:{{[0-9]+}}: error: operands are not valid for this GPU or mode | ||
|
||
Can you please move this test to a different file? The instruction is packed, so it is not a true16 instruction. There isn't a gfx12_asm_vop2_err.s, so perhaps you can create it. |
||
v_pk_fmac_f16_dpp v5, v1, v2 quad_perm:[0,1,2,3] row_mask:0x0 bank_mask:0x3 | ||
// GFX12: :[[@LINE-1]]:{{[0-9]+}}: error: dpp variant of this instruction is not supported |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -291,4 +291,4 @@ v_pk_add_u16 v5, v1, 123456.0 | |
// FIXME: v_pk_fmac_f16 cannot be promoted to VOP3 so '_e32' suffix is not valid | ||
v_pk_fmac_f16 v5, 0x12345678, v2 | ||
// NOGFX9: :[[@LINE-1]]:{{[0-9]+}}: error: instruction not supported on this GPU | ||
// GFX10: v_pk_fmac_f16 v5, 0x12345678, v2 ; encoding: [0xff,0x04,0x0a,0x78,0x78,0x56,0x34,0x12] | ||
// GFX10: v_pk_fmac_f16_e32 v5, 0x12345678, v2 ; encoding: [0xff,0x04,0x0a,0x78,0x78,0x56,0x34,0x12] | ||
If I have the convention right, as stated in patch 0f5ebbc, instructions without a VOP3 form should not have the _e32 suffix. It looks like removing IsSingle caused _e32 to be added to the mnemonic. Please change it back. |
Please add GFX10 error tests checking that the DPP16 NEG and ABS modifiers are not supported by this instruction.
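A rough sketch of what such GFX10 error tests might look like, written in the same style as the GFX12 check in this PR. The RUN line, the `GFX10` check prefix, the register choices, and the DPP controls are all assumptions for illustration. The diagnostic text is deliberately left unpinned after `error:` because the exact message the assembler emits for these operands is not shown in this thread.

```asm
// Hypothetical sketch only: RUN line, prefix, and operands are assumed, not taken from the patch.
// RUN: not llvm-mc -triple=amdgcn -mcpu=gfx1010 %s 2>&1 | FileCheck --check-prefix=GFX10 --implicit-check-not=error: %s

// NEG modifier on src0 with DPP16 controls; expected to be rejected.
v_pk_fmac_f16_dpp v5, -v1, v2 quad_perm:[0,1,2,3] row_mask:0x0 bank_mask:0x3
// GFX10: :[[@LINE-1]]:{{[0-9]+}}: error:

// ABS modifier on src1 with DPP16 controls; expected to be rejected.
v_pk_fmac_f16_dpp v5, v1, |v2| quad_perm:[0,1,2,3] row_mask:0x0 bank_mask:0x3
// GFX10: :[[@LINE-1]]:{{[0-9]+}}: error:
```

Once the actual diagnostics are known, each `error:` check can be tightened to the full message so the test fails if the assembler starts rejecting these lines for a different reason.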