clastb representation in existing IR, and AArch64 codegen #112738
Changes from all commits
```diff
@@ -3379,6 +3379,20 @@ let Predicates = [HasSVE_or_SME] in {
   def : Pat<(i64 (vector_extract nxv2i64:$vec, VectorIndexD:$index)),
             (UMOVvi64 (v2i64 (EXTRACT_SUBREG ZPR:$vec, zsub)), VectorIndexD:$index)>;
 
+  // Find index of last active lane. This is a fallback in case we miss the
+  // opportunity to fold into a lastb or clastb directly.
+  def : Pat<(i64 (find_last_active nxv16i1:$P1)),
+            (INSERT_SUBREG (IMPLICIT_DEF), (LASTB_RPZ_B $P1, (INDEX_II_B 0, 1)),
+                           sub_32)>;
+  def : Pat<(i64 (find_last_active nxv8i1:$P1)),
+            (INSERT_SUBREG (IMPLICIT_DEF), (LASTB_RPZ_H $P1, (INDEX_II_H 0, 1)),
+                           sub_32)>;
+  def : Pat<(i64 (find_last_active nxv4i1:$P1)),
+            (INSERT_SUBREG (IMPLICIT_DEF), (LASTB_RPZ_S $P1, (INDEX_II_S 0, 1)),
+                           sub_32)>;
+  def : Pat<(i64 (find_last_active nxv2i1:$P1)),
+            (LASTB_RPZ_D $P1, (INDEX_II_D 0, 1))>;
+
   // Move element from the bottom 128-bits of a scalable vector to a single-element vector.
   // Alternative case where insertelement is just scalar_to_vector rather than vector_insert.
   def : Pat<(v1f64 (scalar_to_vector
```

Review thread on lines +3382 to +3383:

> **Reviewer:** Are these fallback patterns tested in the final patch?

> **Second reviewer:** I agree it would be good to have some tests for these.

> **Author:** Sadly it's pretty difficult to do this once the combines have been added. I don't see a global switch to disable combining, just the target-independent combines. We do check the optimization level in a few places in AArch64ISelLowering, but mostly for TLI methods for IR-level decisions. Deliberately turning off (c)lastb pattern matching at O0 feels odd, and adding a new switch just for this feature also feels excessive. I could potentially add a globalisel-based test, though I'm not sure how much code that requires. We've added a few new ISD nodes recently, and none have added support in globalisel. I guess this is mostly due to it being hard to create a SelectionDAG without IR and run selection over it.
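For readers unfamiliar with these instructions, the fallback works because `INDEX #0, #1` materializes the vector `<0, 1, 2, ...>`, and `LASTB` extracts the element at the last active predicate lane, which is therefore the lane index itself. A rough Python model of the semantics (a sketch, not part of the patch; the no-active-lane behavior reflects my reading of the SVE `LASTB`/`CLASTB` definitions):

```python
def lastb(pg, vec):
    """Model of SVE LASTB: extract the last active element of vec.
    With no active lane, LASTB extracts the highest-numbered element."""
    result = vec[-1]
    for active, val in zip(pg, vec):
        if active:
            result = val
    return result

def clastb(pg, fallback, vec):
    """Model of SVE CLASTB: like LASTB, but with no active lane the
    scalar fallback operand is returned unchanged."""
    result = fallback
    for active, val in zip(pg, vec):
        if active:
            result = val
    return result

def find_last_active(pg):
    """The fallback pattern above: LASTB over INDEX #0, #1 (the vector
    <0, 1, 2, ...>) yields the index of the last active lane."""
    return lastb(pg, list(range(len(pg))))
```

For example, `find_last_active([True, False, True, False])` returns `2`, and `clastb` with an all-false predicate simply passes the scalar fallback through, which is what makes it useful for loop-carried "last value" idioms.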
Review comment on the scalar_to_vector pattern:

> What about f16 and bf16?