Add support for flag output operand "=@cc" for SystemZ. #125970

Status: Open. 11 commits to be merged into base: main. Showing changes from 7 commits.
5 changes: 5 additions & 0 deletions clang/include/clang/Basic/TargetInfo.h
@@ -1228,6 +1228,11 @@ class TargetInfo : public TransferrableTargetInfo,
std::string &/*SuggestedModifier*/) const {
return true;
}

Member

This shouldn't be here.

// CC is binary on most targets. SystemZ overrides it as CC interval is
// [0, 4).
virtual unsigned getFlagOutputCCUpperBound() const { return 2; }
Contributor

Can this go into ConstraintInfo instead of a new hook?


virtual bool
validateAsmConstraint(const char *&Name,
TargetInfo::ConstraintInfo &info) const = 0;
11 changes: 11 additions & 0 deletions clang/lib/Basic/Targets/SystemZ.cpp
@@ -98,6 +98,14 @@ bool SystemZTargetInfo::validateAsmConstraint(
case 'T': // Likewise, plus an index
Info.setAllowsMemory();
return true;
case '@':
// CC condition changes.
if (!StringRef("@cc").compare(Name)) {
Name += 2;
Info.setAllowsRegister();
return true;
}
return false;
}
}

@@ -160,6 +168,9 @@ unsigned SystemZTargetInfo::getMinGlobalAlign(uint64_t Size,

void SystemZTargetInfo::getTargetDefines(const LangOptions &Opts,
MacroBuilder &Builder) const {
// Inline assembly supports SystemZ flag outputs.
Builder.defineMacro("__GCC_ASM_FLAG_OUTPUTS__");

Builder.defineMacro("__s390__");
Builder.defineMacro("__s390x__");
Builder.defineMacro("__zarch__");
9 changes: 9 additions & 0 deletions clang/lib/Basic/Targets/SystemZ.h
@@ -115,10 +115,19 @@ class LLVM_LIBRARY_VISIBILITY SystemZTargetInfo : public TargetInfo {
return RegName == "r15";
}

// CC has interval [0, 4).
unsigned getFlagOutputCCUpperBound() const override { return 4; }
bool validateAsmConstraint(const char *&Name,
TargetInfo::ConstraintInfo &info) const override;

std::string convertConstraint(const char *&Constraint) const override {
if (llvm::StringRef(Constraint).starts_with("@cc")) {
Member

Why are we trying to support different constraint strings here? On our platform, it should be enough to check for exact match against "@cc", right?

auto Len = llvm::StringRef("@cc").size();
Member

This is a compile-time constant. Again, in SystemZ.cpp that is hard-coded; why do we need this expression here?

std::string Converted =
std::string("{") + std::string(Constraint, Len) + std::string("}");
Constraint += Len - 1;
return Converted;
}
Member

I think we also should have a case '@': in the switch statement below and move this check there.

switch (Constraint[0]) {
case 'p': // Keep 'p' constraint.
return std::string("p");
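The review comments on this hunk ask for an exact match against "@cc" and a hard-coded length. A minimal standalone sketch of that shape follows. `isFlagOutputConstraint` and `convertFlagOutput` are hypothetical helpers, not the Clang `TargetInfo` API, and the sketch assumes the caller advances the pointer by one more character after the hook returns, which is what the `Len - 1` adjustment in the hunk suggests.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Clang): true only when the remaining
// constraint text is exactly "@cc", with no trailing characters.
inline bool isFlagOutputConstraint(std::string_view Rest) {
  return Rest == "@cc";
}

// Hypothetical stand-in for a convertConstraint-style hook.
// On an exact "@cc" match it returns "{@cc}" and leaves Constraint on the
// last consumed character (assumption: the caller then steps one further).
// Any other constraint is passed through as its first character.
inline std::string convertFlagOutput(const char *&Constraint) {
  assert(Constraint && *Constraint && "expected a non-empty constraint");
  if (!isFlagOutputConstraint(Constraint))
    return std::string(1, *Constraint);
  Constraint += 2; // strlen("@cc") - 1, hard-coded as the review suggests
  return "{@cc}";
}
```

With this shape, a string such as "@ccz" is not treated as a flag output, whereas a prefix check would accept it.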
10 changes: 8 additions & 2 deletions clang/lib/CodeGen/CGStmt.cpp
@@ -2621,9 +2621,15 @@ EmitAsmStores(CodeGenFunction &CGF, const AsmStmt &S,
if ((i < ResultRegIsFlagReg.size()) && ResultRegIsFlagReg[i]) {
// Target must guarantee the Value `Tmp` here is lowered to a boolean
// value.
llvm::Constant *Two = llvm::ConstantInt::get(Tmp->getType(), 2);
// Lowering 'Tmp' as - 'icmp ult %Tmp , CCUpperBound'. On some targets
// CCUpperBound is not binary. CCUpperBound is 4 for SystemZ,
// interval [0, 4). With this range known, llvm.assume intrinsic guides
// optimizer to generate more optimized IR. Verified it for SystemZ.
unsigned CCUpperBound = CGF.getTarget().getFlagOutputCCUpperBound();
llvm::Constant *CCUpperBoundConst =
llvm::ConstantInt::get(Tmp->getType(), CCUpperBound);
llvm::Value *IsBooleanValue =
Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, Two);
Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, CCUpperBoundConst);
llvm::Function *FnAssume = CGM.getIntrinsic(llvm::Intrinsic::assume);
Builder.CreateCall(FnAssume, IsBooleanValue);
}
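To illustrate why a known range helps the optimizer, here is a small sketch that is independent of the LLVM API. It assumes only what the comment in the hunk states, namely that CC lies in [0, 4) on SystemZ. Both function names are hypothetical; the `assert` plays the role that `llvm.assume` plays in the emitted IR.

```cpp
#include <cassert>

// Hypothetical predicate: a three-way test on a flag output value.
inline bool isCC012(unsigned CC) { return CC == 0 || CC == 1 || CC == 2; }

// The same predicate once CC is known to be in [0, 4): a single comparison.
// The assert models the range assumption; outside that range the two
// functions disagree (for example at CC == 4).
inline bool simplifiedCC012(unsigned CC) {
  assert(CC < 4 && "only valid under the [0, 4) range assumption");
  return CC != 3;
}
```

Without the upper bound, a compiler has to keep all three comparisons, because values of 4 and above must still yield false.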
33 changes: 33 additions & 0 deletions clang/test/CodeGen/inline-asm-systemz-flag-output.c
@@ -0,0 +1,33 @@
// RUN: %clang_cc1 -O2 -triple s390x-linux -emit-llvm -o - %s | FileCheck %s

int foo_012(int x) {
// CHECK-LABEL: @foo_012
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 0 || cc == 1 || cc == 2 ? 42 : 0;
}

int foo_013(int x) {
// CHECK-LABEL: @foo_013
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 0 || cc == 1 || cc == 3 ? 42 : 0;
}

int foo_023(int x) {
// CHECK-LABEL: @foo_023
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 0 || cc == 2 || cc == 3 ? 42 : 0;
}

int foo_123(int x) {
// CHECK-LABEL: @foo_123
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 1 || cc == 2 || cc == 3 ? 42 : 0;
}
4 changes: 4 additions & 0 deletions clang/test/Preprocessor/systemz_asm_flag_outut.c
@@ -0,0 +1,4 @@
// RUN: %clang -target systemz-unknown-unknown -x c -E -dM -o - %s | FileCheck -match-full-lines %s
// RUN: %clang -target s390x-unknown-unknown -x c -E -dM -o - %s | FileCheck -match-full-lines %s

// CHECK: #define __GCC_ASM_FLAG_OUTPUTS__ 1
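User code would typically test this macro before using a flag output operand. The sketch below is an assumption about such usage, not code from the PR. `addImm42WithCC` is a hypothetical helper, the inline asm is copied from the PR's CodeGen test, and the portable fallback models the usual signed-add condition codes (0 zero, 1 negative, 2 positive, 3 overflow).

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: adds 42 to X and returns the resulting condition code.
inline int addImm42WithCC(int &X) {
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && defined(__s390x__)
  // Flag output path, same statement as in the PR's CodeGen test.
  int CC;
  asm("ahi %[x],42\n" : [x] "+d"(X), "=@cc"(CC));
  return CC;
#else
  // Portable fallback (assumed CC meaning: 0 zero, 1 negative,
  // 2 positive, 3 overflow). Adding a positive constant can only
  // overflow upwards, so one bound check is enough.
  long long Wide = static_cast<long long>(X) + 42;
  if (Wide > INT_MAX) {
    // Wrap like two's-complement hardware would.
    X = static_cast<int>(static_cast<unsigned>(X) + 42u);
    return 3;
  }
  X = static_cast<int>(Wide);
  return Wide == 0 ? 0 : (Wide < 0 ? 1 : 2);
#endif
}
```

The extra `__s390x__` check matters because other targets may also define `__GCC_ASM_FLAG_OUTPUTS__` with different constraint names and value ranges.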
3 changes: 3 additions & 0 deletions llvm/include/llvm/CodeGen/TargetLowering.h
@@ -5107,6 +5107,9 @@ class TargetLowering : public TargetLoweringBase {
std::vector<SDValue> &Ops,
SelectionDAG &DAG) const;

// Lower switch statement for flag output operand with SRL/IPM Sequence.
virtual bool canLowerSRL_IPM_Switch(SDValue Cond) const;

// Lower custom output constraints. If invalid, return SDValue().
virtual SDValue LowerAsmOutputForConstraint(SDValue &Chain, SDValue &Glue,
const SDLoc &DL,
70 changes: 61 additions & 9 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2837,8 +2837,37 @@ void SelectionDAGBuilder::visitBr(const BranchInst &I) {
Opcode = Instruction::And;
else if (match(BOp, m_LogicalOr(m_Value(BOp0), m_Value(BOp1))))
Opcode = Instruction::Or;

if (Opcode &&
auto &TLI = DAG.getTargetLoweringInfo();
bool BrSrlIPM = FuncInfo.MF->getTarget().getTargetTriple().getArch() ==
Triple::ArchType::systemz;
Member

We really shouldn't have triple checks here in common SelectionDAG code. If absolutely needed, this should be abstracted behind appropriate target callbacks.

However, I'm wondering if this is indeed needed at all at this point. Can't we just let common code canonicalize the sequence of if statements into a switch statement, and then recognize the particular form of switch (with input coming from an IPM, and only two different switch targets) in platform-specific DAGCombine and directly translate it into a ccmask branch?
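As a rough illustration of the "ccmask branch" idea mentioned in this comment, the sketch below folds a set of CC values into a single 4-bit mask. It is not LLVM code: the helper names are hypothetical, and it assumes the convention that the most significant of the four mask bits selects CC 0 and the least significant selects CC 3.

```cpp
#include <cassert>
#include <initializer_list>

// Assumed encoding: 4-bit mask, bit value 8 selects CC 0, 4 selects CC 1,
// 2 selects CC 2, 1 selects CC 3.
inline unsigned ccMaskBit(unsigned CC) {
  assert(CC < 4 && "CC is in [0, 4)");
  return 1u << (3 - CC);
}

// Hypothetical helper: combines all CC values that lead to the same
// branch target into one mask.
inline unsigned ccMaskFor(std::initializer_list<unsigned> CCValues) {
  unsigned Mask = 0;
  for (unsigned CC : CCValues)
    Mask |= ccMaskBit(CC);
  return Mask;
}

// A single masked test replaces the chain of equality comparisons.
inline bool branchTaken(unsigned Mask, unsigned CC) {
  return (Mask & ccMaskBit(CC)) != 0;
}
```

For example, `cc == 0 || cc == 1 || cc == 2` corresponds to `ccMaskFor({0, 1, 2})`, which is 14 under this encoding, so a switch with only two distinct targets collapses to one conditional branch.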

// For Flag output operands SRL/IPM sequence, we want to avoid
// creating switch case, as it creates Basic Block and inhibits
// optimization in DAGCombiner for flag output operands.
const auto checkSRLIPM = [&TLI](const SDValue &Op) {
if (!Op.getNumOperands())
return false;
SDValue OpVal = Op.getOperand(0);
SDNode *N = OpVal.getNode();
if (N && N->getOpcode() == ISD::SRL)
return TLI.canLowerSRL_IPM_Switch(OpVal);
else if (N && OpVal.getNumOperands() &&
(N->getOpcode() == ISD::AND || N->getOpcode() == ISD::OR)) {
SDValue OpVal1 = OpVal.getOperand(0);
SDNode *N1 = OpVal1.getNode();
if (N1 && N1->getOpcode() == ISD::SRL)
return TLI.canLowerSRL_IPM_Switch(OpVal1);
}
return false;
};
if (BrSrlIPM) {
if (NodeMap.count(BOp0) && NodeMap[BOp0].getNode()) {
BrSrlIPM &= checkSRLIPM(getValue(BOp0));
if (NodeMap.count(BOp1) && NodeMap[BOp1].getNode())
BrSrlIPM &= checkSRLIPM(getValue(BOp1));
} else
BrSrlIPM = false;
}
Member

This is already better than a target check, but there's still a whole lot of implicitly target-specific code here. There really shouldn't be a generic callback canLowerSRL_IPM_Switch - that even explicitly refers to SystemZ instruction names! If there's target-specific behavior needed here, this should be better abstracted.

Note that I see there's already a target hook to guide whether or not this transformation should be performed: the getJumpConditionMergingParams callback that provides input to the shouldKeepJumpConditionsTogether. I think you should investigate whether we can create a SystemZ-specific implementation of that callback that has the desired effect of inhibiting this transformation in the cases we care about. That should then work without any common-code change to this function.

if (Opcode && !BrSrlIPM &&
!(match(BOp0, m_ExtractElt(m_Value(Vec), m_Value())) &&
match(BOp1, m_ExtractElt(m_Specific(Vec), m_Value()))) &&
!shouldKeepJumpConditionsTogether(
@@ -12112,18 +12141,41 @@ void SelectionDAGBuilder::lowerWorkItem(SwitchWorkListItem W, Value *Cond,
const APInt &SmallValue = Small.Low->getValue();
const APInt &BigValue = Big.Low->getValue();

// Creating switch cases optimizing transformation inhibits DAGCombiner
// for SystemZ for flag output operands. DAGCombiner computes cumulative
// CCMask for flag output operands SRL/IPM sequence, we want to avoid
// creating switch case, as it creates Basic Block and inhibits
// optimization in DAGCombiner for flag output operands.
// cases like (CC == 0) || (CC == 2) || (CC == 3), or
// (CC == 0) || (CC == 1) ^ (CC == 3), there could potentially be
// more cases like this.
const TargetLowering &TLI = DAG.getTargetLoweringInfo();
bool IsSrlIPM = false;
if (NodeMap.count(Cond) && NodeMap[Cond].getNode())
IsSrlIPM = CurMF->getTarget().getTargetTriple().getArch() ==
Triple::ArchType::systemz &&
TLI.canLowerSRL_IPM_Switch(getValue(Cond));
// Check that there is only one bit different.
APInt CommonBit = BigValue ^ SmallValue;
if (CommonBit.isPowerOf2()) {
if (CommonBit.isPowerOf2() || IsSrlIPM) {
SDValue CondLHS = getValue(Cond);
EVT VT = CondLHS.getValueType();
SDLoc DL = getCurSDLoc();

SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
DAG.getConstant(CommonBit, DL, VT));
SDValue Cond = DAG.getSetCC(
DL, MVT::i1, Or, DAG.getConstant(BigValue | SmallValue, DL, VT),
ISD::SETEQ);
SDValue Cond;

if (CommonBit.isPowerOf2()) {
SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
DAG.getConstant(CommonBit, DL, VT));
Cond = DAG.getSetCC(DL, MVT::i1, Or,
DAG.getConstant(BigValue | SmallValue, DL, VT),
ISD::SETEQ);
} else if (IsSrlIPM && BigValue == 3 && SmallValue == 0) {
SDValue SetCC =
DAG.getSetCC(DL, MVT::i32, CondLHS,
DAG.getConstant(SmallValue, DL, VT), ISD::SETEQ);
Cond = DAG.getSetCC(DL, MVT::i32, SetCC,
DAG.getConstant(BigValue, DL, VT), ISD::SETEQ);
}
Member

Again, this very SystemZ-specific optimization shouldn't really be here. Doesn't this just revert the decision to introduce a switch statement that was made above? Could this not be handled either by the visitBr above via the getJumpConditionMergingParams callback, or else fully in SystemZ DAGCombine code?
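For context, the existing "only one bit different" path that this hunk extends can be sketched in isolation. The helpers below are hypothetical and use plain integers instead of `APInt` and DAG nodes. They show why the OR-and-compare form is valid when the two case values differ in exactly one bit, and why a pair such as 0 and 3 does not qualify.

```cpp
#include <cassert>
#include <cstdint>

// True when Small and Big differ in exactly one bit, i.e. their XOR is a
// power of two.
inline bool differInOneBit(uint32_t Small, uint32_t Big) {
  uint32_t Diff = Small ^ Big;
  return Diff != 0 && (Diff & (Diff - 1)) == 0;
}

// Single-comparison form of (X == Small || X == Big).
// Forcing the differing bit to 1 maps both case values onto Small | Big and
// maps nothing else there, but only under the one-bit precondition.
inline bool matchesEither(uint32_t X, uint32_t Small, uint32_t Big) {
  assert(differInOneBit(Small, Big) && "case values must differ in one bit");
  uint32_t CommonBit = Small ^ Big;
  return (X | CommonBit) == (Small | Big);
}
```

For 0 and 3 the XOR is 3, which is not a power of two; applying the same formula would also accept 1 and 2, which is why that pair needs separate handling.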


// Update successor info.
// Both Small and Big will jump to Small.BB, so we sum up the
4 changes: 4 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -5576,6 +5576,10 @@ const char *TargetLowering::LowerXConstraint(EVT ConstraintVT) const {
return nullptr;
}

bool TargetLowering::canLowerSRL_IPM_Switch(SDValue Cond) const {
return false;
}

SDValue TargetLowering::LowerAsmOutputForConstraint(
SDValue &Chain, SDValue &Glue, const SDLoc &DL,
const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {