-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[clang] [ARM] Explicitly enable NEON for Windows/Darwin targets #122095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -659,13 +659,21 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D, | |
CPUArgFPUKind != llvm::ARM::FK_INVALID ? CPUArgFPUKind : ArchArgFPUKind; | ||
(void)llvm::ARM::getFPUFeatures(FPUKind, Features); | ||
} else { | ||
bool Generic = true; | ||
if (!ForAS) { | ||
std::string CPU = arm::getARMTargetCPU(CPUName, ArchName, Triple); | ||
if (CPU != "generic") | ||
Generic = false; | ||
llvm::ARM::ArchKind ArchKind = | ||
arm::getLLVMArchKindForARM(CPU, ArchName, Triple); | ||
FPUKind = llvm::ARM::getDefaultFPU(CPU, ArchKind); | ||
(void)llvm::ARM::getFPUFeatures(FPUKind, Features); | ||
} | ||
if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) && | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So the logic for the "generic" check is that Only if you're doing something strange with a CPU that's never had iOS or Windows run on it, only then does this code not apply. But given that it's a strange use case, it's best to leave it alone and just take the user's intent literally. Correct? Also I note that Android does not do this generic CPU check. I don't think making it do that should be part of this PR, and so, while I think this code is spaghetti, I wouldn't try to merge them here. Probably wouldn't break anything, but probably isn't zero chance. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Almost; So only for the cases where Yes, the fact that Android has its own check further above, means that it doesn't hit the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @mstorsjo, this breaks Darwin when compiling assembly with armv7s which is neon-vfp4. vfp4 instructions (e.g. vfma) are no longer recognized because you effectively downgraded neon-vfp4 to neon. When There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @anemet, sorry about this! While I did test some combinatons with I tried poking around for different ways of fixing this, but they all more or less amount to the fix as is currently being done in #130623, to remove the What do you (and @davemgreen, @DavidSpickett an @Stylie777) think here, what's the best course of action for the current 20.x release branch? Should we split out the change to remove the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't see folks being very happy about backporting anything that would add a potentially breaking change, even if it does fix another issue in the process. If you were to split out the ForAS change, would that be targeted enough to fix @anemet 's issue, or would that backport be just as impactful as #134366 ? (we can more likely justify a "worse" fix on 20.x given that main will get a comprehensive fix) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Did you try this as There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, sorry, that was a typo as I recreated the diff I used for testing (later I did find the one I actually did test with); yes I did use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No worries, thought that might be the case 😄 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So if the minimal fix in #134366 isn't really backportable, I'm not sure if I see many other reasonable ways to fix this breakage, so then we probably should revert this change for the 20.x branch. If we can land #130623 before 21.x (or at least the split out #134366) then both this and the armv7s/vfpv4 issue should be fixed for 21.x. And this PR in itself only attempted to fix things for downstream-patched Clang, so it's not strictly essential anyway. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Right, my mistake. I agree with the idea of reverting this on 20.x. |
||
getARMSubArchVersionNumber(Triple) >= 7) { | ||
FPUKind = llvm::ARM::parseFPU("neon"); | ||
(void)llvm::ARM::getFPUFeatures(FPUKind, Features); | ||
} | ||
} | ||
|
||
// Now we've finished accumulating features from arch, cpu and fpu, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, I do feel that this logic where we explicitly look for
"generic"
is a bit kludgey. It would be slightly neater to just stick this into the existing Android case above (at lines 650-655) - but then the default-neon case overrides arch/os default CPUs fromgetLLVMArchKindForARM
, which shows up as different output for%clang -target x86_64-apple-macosx10.10 -arch armv7s
inarm-target-features.c
below (where the default cpu forarmv7s
includes fp16 support).I'm open to opinions on what's the least ugly compromise here. :-)