-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[GISEL][RISCV] IRTranslator for scalable vector load #80006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
23d4d23
to
e372967
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-globalisel Author: Jiahan Xie (jiahanxie353) ChangesHi @michaelmaitland @topperc , this patch aims at translating LLVM IR with scalable vector load into Full diff: https://github.com/llvm/llvm-project/pull/80006.diff 4 Files Affected:
diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index dd38317c26bf..72a8b0b001d0 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -1363,9 +1363,8 @@ static bool isSwiftError(const Value *V) {
bool IRTranslator::translateLoad(const User &U, MachineIRBuilder &MIRBuilder) {
const LoadInst &LI = cast<LoadInst>(U);
-
- unsigned StoreSize = DL->getTypeStoreSize(LI.getType());
- if (StoreSize == 0)
+ TypeSize StoreSize = DL->getTypeStoreSize(LI.getType());
+ if (StoreSize.isZero())
return true;
ArrayRef<Register> Regs = getOrCreateVRegs(LI);
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index b182000a3d70..6150cdf7d47f 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -1192,7 +1192,7 @@ void MachineVerifier::verifyPreISelGenericInstruction(const MachineInstr *MI) {
if (MMO.getSizeInBits() >= ValTy.getSizeInBits())
report("Generic extload must have a narrower memory type", MI);
} else if (MI->getOpcode() == TargetOpcode::G_LOAD) {
- if (MMO.getSize() > ValTy.getSizeInBytes())
+ if (TypeSize::isKnownGT(MMO.getMemoryType().getSizeInBytes(), ValTy.getSizeInBytes()))
report("load memory size cannot exceed result size", MI);
} else if (MI->getOpcode() == TargetOpcode::G_STORE) {
if (ValTy.getSizeInBytes() < MMO.getSize())
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 82836346d883..ab9958111491 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -20532,11 +20532,12 @@ unsigned RISCVTargetLowering::getCustomCtpopCost(EVT VT,
bool RISCVTargetLowering::fallBackToDAGISel(const Instruction &Inst) const {
- // GISel support is in progress or complete for G_ADD, G_SUB, G_AND, G_OR, and
- // G_XOR.
+ // GISel support is in progress or complete for G_ADD, G_SUB, G_AND, G_OR,
+ // G_XOR, and G_LOAD
unsigned Op = Inst.getOpcode();
if (Op == Instruction::Add || Op == Instruction::Sub ||
- Op == Instruction::And || Op == Instruction::Or || Op == Instruction::Xor)
+ Op == Instruction::And || Op == Instruction::Or ||
+ Op == Instruction::Xor || Op == Instruction::Load)
return false;
if (Inst.getType()->isScalableTy())
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-ld.ll b/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-ld.ll
new file mode 100644
index 000000000000..5f98c6a7066c
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-ld.ll
@@ -0,0 +1,7 @@
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+; RUN: llc -mtriple=riscv32 -mattr=+v -global-isel -stop-after=irtranslator -verify-machineinstrs < %s | FileCheck -check-prefixes=RV32I %s
+
+define void @vload_vint8m1(ptr %pa) {
+ %va = load <vscale x 8 x i8>, ptr %pa
+ ret void
+}
|
; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py | ||
; RUN: llc -mtriple=riscv32 -mattr=+v -global-isel -stop-after=irtranslator -verify-machineinstrs < %s | FileCheck -check-prefixes=RV32I %s | ||
|
||
define void @vload_vint8m1(ptr %pa) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is missing test checks. Also, I think we need to test a more complete set of scalable vector types. Lastly, I think we need test checks for rv64 as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll add more tests with different types and to include rv64 if this simplest test case makes sense.
Right now, I'm wondering, when I use llvm/utils/update_mir_test_checks.py
, the vec-ll.s
file auto-generate align 8
at the end of %va = load <vscale x 8 x i8>, ptr %pa
, so I appended it. Does it make sense to add align 8
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have tests that include the align argument and tests that do not include the align argument.
We probably need to add support for both aligned and unaligned load/store. It is probably okay to start off with only worrying about aligned loads/stores. I would have thought that align 8
and align 4
is aligned on rv32 and align 8
is aligned on rv64, but llvm/test/CodeGen/RISCV/rvv/unaligned-loads-stores.ll::aligned_load_nxv1i32_a4
seems to imply that align 4
is also aligned on rv64 -- @topperc are you able eto clarify this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alignment is relative to EEW not XLen.
If an element accessed by a vector memory instruction is not naturally aligned to the size of the element, either the element is transferred successfully or an address misaligned exception is raised on that element.
e372967
to
10072ed
Compare
llvm/lib/CodeGen/MachineOperand.cpp
Outdated
@@ -1240,7 +1240,7 @@ void MachineMemOperand::print(raw_ostream &OS, ModuleSlotTracker &MST, | |||
<< "unknown-address"; | |||
} | |||
MachineOperand::printOperandOffset(OS, getOffset()); | |||
if (getSize() > 0 && getAlign() != getSize()) | |||
if (getMemoryType().getSizeInBytes().getKnownMinValue() > 0 && getAlign() != getMemoryType().getSizeInBytes().getKnownMinValue()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very confident with this approach. I'm a bit worried if it would this breaks scalar cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this check is to emit an align
operand in the case that the alignment is not the same size as the element size.
That means if the memory type is nxvMxsN
that we want this guard to be true when N != Align * 8
. N is the size in bytes of the element type. I think you are checking M
here.
10072ed
to
d8f1211
Compare
llvm/lib/CodeGen/MachineOperand.cpp
Outdated
@@ -1240,7 +1240,7 @@ void MachineMemOperand::print(raw_ostream &OS, ModuleSlotTracker &MST, | |||
<< "unknown-address"; | |||
} | |||
MachineOperand::printOperandOffset(OS, getOffset()); | |||
if (getSize() > 0 && getAlign() != getSize()) | |||
if (getType().getElementCount().getKnownMinValue() > 0 && getAlign() != getType().getElementCount().getKnownMinValue()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (getType().getElementCount().getKnownMinValue() > 0 && getAlign() != getType().getElementCount().getKnownMinValue()) | |
unsigned MinSize = getType().getSizeInBytes().getKnownMinValue(); | |
if (MinSize > 0 && getAlign() != MinSize) |
I think this should solve a large chunk of the test failures. The old code is saying if (the size of the memory reference is not equal to the alignment)
.
The code you have here is saying if (the minimum number of elements is not equal to the alignment)
.
However, we probably need to account for the size of the element as well. The tricky part here is that we only know the minimum size of the memory reference. In the case that the MinSize is not equal to the alignment, it can't hurt to emit an explicit align value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSize()
checks isValid()
so it probably makes sense to change my suggestion to
if (getType().getElementCount().getKnownMinValue() > 0 && getAlign() != getType().getElementCount().getKnownMinValue()) | |
uint64_t MinSize = | |
MemoryType.isValid() ? getType().getSizeInBytes().getKnownMinValue() : ~UINT64_C(0); | |
if (MinSize > 0 && getAlign() != MinSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And maybe to leave a TODO for getSize()
to return a TypeSize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSize()
checksisValid()
so it probably makes sense to change my suggestion to
Yes totally! Actuall I was just trying to push up to use GitHub actions to check because for some reason when I tried to test the test_g_load.mir file locally, I got errors like:
*** Bad machine code: atomic load cannot use release ordering ***
- function: test_load
- basic block: %bb.0 (0x560485132100)
- instruction: %5:_(s32) = G_LOAD %2:_(p0) :: (load acq_rel (s32))
LLVM ERROR: Found 5 machine code errors.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
But GitHub action is no longer reporting that so I'm a bit confused.
The command I used in the terminal is to first change my build to: -DLLVM_TARGETS_TO_BUILD="RISCV;AArch64;ARM"
; and then I used ./build-llc/bin/llc -mtriple=arm64 -run-pass=none -verify-machineinstrs llvm/test/MachineVerifier/test_g_load.mir
(which is just a copy-and-paste from the test file itself) to test this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it fail when you run ninja check-llvm
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the CHECK statements, this test is expecting bad machine code failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be useful to do ./build/bin/llvm-lit -v llvm/test/MachineVerifier/test_g_load.mir
to see whether a specific test is failing/passing and why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be useful to do ./build/bin/llvm-lit -v llvm/test/MachineVerifier/test_g_load.mir to see whether a specific test is failing/passing and why.
this command is passing!
Does it fail when you run ninja check-llvm?
I tried to run it but every time I use check-llvm
it puts a lot of burden and my machine just got stuck there or even freezed.
If you look at the CHECK statements, this test is expecting bad machine code failures.
right, and I thought there would be a try-catch mechanism to catch the expected failures to report "there are expected failures" instead of actually raising them.
But anyhow, llvm-lit
is working, thanks!
d8f1211
to
f588854
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add some test cases where the llvm IR load instruction has an align
argument? Align argument could be smaller/equal/larger than element size or smaller/equal/larger than minimum total vector size (element size * min elts).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we test load of vector of pointer type?
ret <vscale x 8 x i64> %va | ||
} | ||
|
||
define <vscale x 1 x i8> @vload_nx1i8_align2(ptr %pa) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we have the intended coverage on align.
Align argument could be smaller/equal/larger than element size or smaller/equal/larger than minimum total vector size
For i8 vector types we have test for align 2, align 8, align 32. Recall that align is in bytes but the type i8 is in bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I'll fix it!
I think now the test cases satisfy "align argument could be smaller/equal/larger than element size or smaller/equal/larger than minimum total vector size". But I wasn't sure if I'm correctly loading a vector of pointers. |
@@ -880,3 +880,25 @@ define <vscale x 2 x i64> @vload_nx2i64_align32(ptr %pa) { | |||
ret <vscale x 2 x i64> %va | |||
} | |||
|
|||
define <vscale x 8 x i64*> @vload_nx8i64_ptrs(ptr %pa) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vscale x 8 x ptr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with minor comment.
llvm/lib/CodeGen/MachineVerifier.cpp
Outdated
@@ -1198,7 +1198,8 @@ void MachineVerifier::verifyPreISelGenericInstruction(const MachineInstr *MI) { | |||
if (MMO.getSizeInBits() >= ValTy.getSizeInBits()) | |||
report("Generic extload must have a narrower memory type", MI); | |||
} else if (MI->getOpcode() == TargetOpcode::G_LOAD) { | |||
if (MMO.getSize() > ValTy.getSizeInBytes()) | |||
if (TypeSize::isKnownGT(MMO.getType().getSizeInBytes(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verifier test for this would be good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it true that verifier tests for this aims to test run-time attributes? I tried to write some tests, such as load
ing <vscale x 1 x i16>
into <vscale x 1 x i8>
but they were all caught as compile-time errors like type mismatch. So it seems like this test requires a clever way to get around, do you have any suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This diff worked for me:
diff --git a/llvm/lib/CodeGen/MIRParser/MIParser.cpp b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
index d41fc97cb806..41dbbd1b1413 100644
--- a/llvm/lib/CodeGen/MIRParser/MIParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
@@ -3412,7 +3412,7 @@ bool MIParser::parseMachineMemoryOperand(MachineMemOperand *&Dest) {
if (expectAndConsume(MIToken::rparen))
return true;
- Size = MemoryType.getSizeInBytes();
+ Size = MemoryType.getSizeInBytes().getKnownMinValue();
}
MachinePointerInfo Ptr = MachinePointerInfo();
diff --git a/llvm/test/MachineVerifier/test_g_load.mir b/llvm/test/MachineVerifier/test_g_load.mir
index 07c3c0a6b5a2..0ffdf8e8efed 100644
--- a/llvm/test/MachineVerifier/test_g_load.mir
+++ b/llvm/test/MachineVerifier/test_g_load.mir
@@ -26,4 +26,9 @@ body: |
; CHECK: Bad machine code: atomic load cannot use release ordering
%5:_(s32) = G_LOAD %2 :: (load acq_rel (s32))
+ ; CHECK: Bad machine code: load memory size cannot exceed result size
+ %6:_(<vscale x 2 x s8>) = G_LOAD %2 :: (load (<vscale x 1 x s32>))
+
+ ; CHECK: Bad machine code: load memory size cannot exceed result size
+ %7:_(<vscale x 1 x s8>) = G_LOAD %2 :: (load (<vscale x 2 x s8>))
...
Maybe @harviniriawan and @davemgreen can weigh in on the proposed change in parseMachineMemoryOperand
because this PR seems like it may be relevant.
Can you please update the PR description? |
Sure, done! Can we merge it? Seems like there is a failing build but it's totally irrelevant to this patch and I didn't touch that at all. |
Can we wait for an opinion from the people who are working on or giving review to the scalable MMO patch I linked in my comment about the MachineVerifier tests? I think that we could probably add the changes I suggested in that comment thread. It would be nice to have those tests changes in this commit. |
…aller/equal/larger than the total vector size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Add IRTranslator for scalable vector load instruction and include corresponding tests with alignment argument included, which can be smaller/equal/larger than element size or smaller/equal/larger than the minimum total vector size.
Add IRTranslator for scalable vector load instruction and include corresponding tests with alignment argument included, which can be smaller/equal/larger than element size or smaller/equal/larger than the minimum total vector size.