Add support for flag output operand "=@cc" for SystemZ. #125970

Open: wants to merge 11 commits into base: main (showing changes from 8 commits)

13 changes: 12 additions & 1 deletion clang/include/clang/Basic/TargetInfo.h
@@ -1114,10 +1114,12 @@ class TargetInfo : public TransferrableTargetInfo,

std::string ConstraintStr; // constraint: "=rm"
std::string Name; // Operand name: [foo] with no []'s.
unsigned FlagOutputCCUpperBound;
Member

I am wondering if we can re-use the existing range fields ImmRange here. Those are currently only used for input operands and require that only immediates within this range are used as input. For an output operand, this isn't useful - but instead we could take it to mean that the output is logically constrained to fall within that range, so we can add appropriate assertions.
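
One way to read this suggestion, as a minimal standalone sketch. The struct below is a hypothetical stand-in for TargetInfo::ConstraintInfo, not Clang's actual class; only the ImmRange field names (Min, Max, isConstrained) come from the diff. The helpers setOutputRange and valueInRange are invented for illustration, and the range is treated as inclusive, which is an assumption of this sketch.

```cpp
// Hypothetical stand-in for TargetInfo::ConstraintInfo. Only the ImmRange
// field names are taken from the diff; everything else is illustrative.
struct ConstraintInfoSketch {
  struct {
    int Min;
    int Max;
    bool isConstrained;
  } ImmRange;

  ConstraintInfoSketch() {
    ImmRange.Min = ImmRange.Max = 0;
    ImmRange.isConstrained = false;
  }

  // Hypothetical helper: for an output operand, record the inclusive range
  // [Min, Max] that the produced value is assumed to fall within.
  void setOutputRange(int Min, int Max) {
    ImmRange.Min = Min;
    ImmRange.Max = Max;
    ImmRange.isConstrained = true;
  }

  // Returns true when V is consistent with the recorded range.
  // An unconstrained operand accepts every value.
  bool valueInRange(int V) const {
    return !ImmRange.isConstrained ||
           (V >= ImmRange.Min && V <= ImmRange.Max);
  }
};
```

Under this reading, a target whose flag output can take four values would record the range 0 to 3 instead of storing a separate upper-bound field.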


public:
ConstraintInfo(StringRef ConstraintStr, StringRef Name)
: Flags(0), TiedOperand(-1), ConstraintStr(ConstraintStr.str()),
- Name(Name.str()) {
+ Name(Name.str()), FlagOutputCCUpperBound(2) {
ImmRange.Min = ImmRange.Max = 0;
ImmRange.isConstrained = false;
}
@@ -1188,6 +1190,14 @@ class TargetInfo : public TransferrableTargetInfo,
TiedOperand = N;
// Don't copy Name or constraint string.
}

// CC range can be set by target. SystemZ sets it to 4. It is 2 by default.
Member

Comment is wrong now (2 is no longer default). Also, it's probably not necessary to specifically call out SystemZ here.

void setFlagOutputCCUpperBound(unsigned CCBound) {
FlagOutputCCUpperBound = CCBound;
}
unsigned getFlagOutputCCUpperBound() const {
return FlagOutputCCUpperBound;
}
};

/// Validate register name used for global register variables.
@@ -1228,6 +1238,7 @@ class TargetInfo : public TransferrableTargetInfo,
std::string &/*SuggestedModifier*/) const {
return true;
}

Member

This shouldn't be here.

virtual bool
validateAsmConstraint(const char *&Name,
TargetInfo::ConstraintInfo &info) const = 0;
12 changes: 12 additions & 0 deletions clang/lib/Basic/Targets/SystemZ.cpp
@@ -98,6 +98,15 @@ bool SystemZTargetInfo::validateAsmConstraint(
case 'T': // Likewise, plus an index
Info.setAllowsMemory();
return true;
case '@':
// CC condition changes.
if (!StringRef("@cc").compare(Name)) {
Name += 2;
Info.setAllowsRegister();
Info.setFlagOutputCCUpperBound(4);
Member

Here we should have the platform-specific comment explaining why this is 4.
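
For context on where the 4 comes from: the SystemZ condition code is a two-bit field, so it can only take the values 0 to 3. The sketch below models the IPM plus SRL sequence that this PR refers to elsewhere. It assumes, as I understand the architecture, that IPM places the condition code in bits 28-29 of the low 32-bit word with the two bits above it cleared and the program mask in bits 24-27. The helper names are hypothetical and nothing here is taken from the LLVM sources.

```cpp
#include <cstdint>

// Builds a 32-bit word laid out the way this sketch assumes IPM leaves it:
// bits 30-31 zero, bits 28-29 condition code, bits 24-27 program mask,
// bits 0-23 unchanged register contents.
std::uint32_t makeIpmWord(unsigned CC, unsigned ProgramMask,
                          std::uint32_t LowBits) {
  return ((CC & 0x3u) << 28) | ((ProgramMask & 0xFu) << 24) |
         (LowBits & 0xFFFFFFu);
}

// Models a logical shift right by 28: because the top two bits are zero,
// the result is the condition code itself and therefore always below 4.
unsigned extractCC(std::uint32_t IpmWord) { return IpmWord >> 28; }
```

Whatever the program mask and the remaining register bits hold, the extracted value stays in the interval [0, 4), which is the property the upper bound is meant to express.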

return true;
}
return false;
}
}

@@ -160,6 +169,9 @@ unsigned SystemZTargetInfo::getMinGlobalAlign(uint64_t Size,

void SystemZTargetInfo::getTargetDefines(const LangOptions &Opts,
MacroBuilder &Builder) const {
// Inline assembly supports SystemZ flag outputs.
Builder.defineMacro("__GCC_ASM_FLAG_OUTPUTS__");

Builder.defineMacro("__s390__");
Builder.defineMacro("__s390x__");
Builder.defineMacro("__zarch__");
6 changes: 6 additions & 0 deletions clang/lib/Basic/Targets/SystemZ.h
@@ -119,6 +119,12 @@ class LLVM_LIBRARY_VISIBILITY SystemZTargetInfo : public TargetInfo {
TargetInfo::ConstraintInfo &info) const override;

std::string convertConstraint(const char *&Constraint) const override {
if (llvm::StringRef(Constraint) == "@cc") {
auto Len = llvm::StringRef("@cc").size();
Member

This is a compile-time constant. Again, in SystemZ.cpp that is hard-coded; why do we need this expression here?

std::string Converted = std::string("{@cc}");
Constraint += Len - 1;
return Converted;
}
Member

I think we also should have a case '@': in the switch statement below and move this check there.
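
A rough sketch of what the two comments on this hunk seem to ask for: the check moved into the switch under case '@', with the length hard-coded. This is a free-function stand-in, not the actual SystemZTargetInfo member. The name convertConstraintSketch is hypothetical, the prefix match via strncmp replaces the exact string comparison in the diff (an assumption), and the final fallback simplifies whatever the real function does for other constraints.

```cpp
#include <cstring>
#include <string>

// Hypothetical free-function version of convertConstraint.
std::string convertConstraintSketch(const char *&Constraint) {
  switch (Constraint[0]) {
  case 'p': // Keep 'p' constraint.
    return std::string("p");
  case '@':
    if (std::strncmp(Constraint, "@cc", 3) == 0) {
      // Leave the pointer on the final 'c'; the caller is assumed to step
      // past the last consumed character, as in the diff (Len - 1 == 2).
      Constraint += 2;
      return std::string("{@cc}");
    }
    break;
  default:
    break;
  }
  // Simplified fallback: return the single constraint character unchanged.
  return std::string(1, *Constraint);
}
```

Called with a pointer to "@cc", the sketch returns "{@cc}" and leaves the pointer two characters further along. Other constraints pass through as a single character.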

switch (Constraint[0]) {
case 'p': // Keep 'p' constraint.
return std::string("p");
20 changes: 18 additions & 2 deletions clang/lib/CodeGen/CGStmt.cpp
@@ -2621,9 +2621,25 @@ EmitAsmStores(CodeGenFunction &CGF, const AsmStmt &S,
if ((i < ResultRegIsFlagReg.size()) && ResultRegIsFlagReg[i]) {
// Target must guarantee the Value `Tmp` here is lowered to a boolean
// value.
- llvm::Constant *Two = llvm::ConstantInt::get(Tmp->getType(), 2);
// Lowering 'Tmp' as - 'icmp ult %Tmp , CCUpperBound'. On some targets
// CCUpperBound is not binary. CCUpperBound is 4 for SystemZ,
// interval [0, 4). With this range known, llvm.assume intrinsic guides
// optimizer to generate more optimized IR in most of the cases as
// observed for select_cc on SystemZ unit tests for flag output operands.
// For some br_cc cases the generated IR was weird, e.g. a switch table
// for simple comparison terms.
Member

You don't need to explain that wrong code will result from an incorrect assertion.

StringRef Name;
if (const GCCAsmStmt *GAS = dyn_cast<GCCAsmStmt>(&S))
Name = GAS->getOutputName(i);
TargetInfo::ConstraintInfo Info(S.getOutputConstraint(i), Name);
bool IsValid = CGF.getTarget().validateOutputConstraint(Info);
(void)IsValid;
assert(IsValid && "Failed to parse flag output operand constraint");
Member

All this parsing was done in the caller of this routine (EmitAsmStmt) already - we shouldn't do that again here. I think instead of passing the ResultRegIsFlagReg array down into this routine, the caller should already compute the appropriate bounds and pass an array of those bounds into this function.

This might even allow us to remove the hard-coded llvm::StringRef(OutputConstraint).starts_with("{@cc") test in EmitAsmStmt and defer to the target the decision which output operands may be assumed to fall within a certain range of values.
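
To make the proposed split of responsibilities concrete, here is a standalone model that does not use any Clang or LLVM types. All names (computeResultBounds, describeAssumes, systemZBoundSketch) are hypothetical, a bound of 0 is used to mean "no range assumption", and the store routine is reduced to producing a textual description of the assumptions it would emit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical target callback type: returns the exclusive upper bound for
// an output constraint, or 0 when no range may be assumed.
typedef unsigned (*BoundFn)(const std::string &);

// Caller side (standing in for EmitAsmStmt): compute one bound per output
// by asking the target, instead of recording a "is flag register" bit.
std::vector<unsigned>
computeResultBounds(const std::vector<std::string> &OutputConstraints,
                    BoundFn TargetBound) {
  std::vector<unsigned> Bounds;
  Bounds.reserve(OutputConstraints.size());
  for (std::size_t I = 0; I < OutputConstraints.size(); ++I)
    Bounds.push_back(TargetBound(OutputConstraints[I]));
  return Bounds;
}

// Callee side (standing in for EmitAsmStores): no constraint parsing, only
// the precomputed bounds are consulted.
std::vector<std::string> describeAssumes(const std::vector<unsigned> &Bounds) {
  std::vector<std::string> Out;
  for (std::size_t I = 0; I < Bounds.size(); ++I)
    if (Bounds[I] != 0)
      Out.push_back("assume(result" + std::to_string(I) + " u< " +
                    std::to_string(Bounds[I]) + ")");
  return Out;
}

// Hypothetical SystemZ-like callback: only "{@cc..." outputs are bounded.
unsigned systemZBoundSketch(const std::string &Constraint) {
  return Constraint.rfind("{@cc", 0) == 0 ? 4u : 0u;
}
```

With outputs "d" and "{@cc}", only the second yields an assumption, and the decision about which operands are bounded sits entirely in the callback.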

unsigned CCUpperBound = Info.getFlagOutputCCUpperBound();
llvm::Constant *CCUpperBoundConst =
llvm::ConstantInt::get(Tmp->getType(), CCUpperBound);
llvm::Value *IsBooleanValue =
- Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, Two);
+ Builder.CreateCmp(llvm::CmpInst::ICMP_ULT, Tmp, CCUpperBoundConst);
llvm::Function *FnAssume = CGM.getIntrinsic(llvm::Intrinsic::assume);
Builder.CreateCall(FnAssume, IsBooleanValue);
}
33 changes: 33 additions & 0 deletions clang/test/CodeGen/inline-asm-systemz-flag-output.c
@@ -0,0 +1,33 @@
// RUN: %clang_cc1 -O2 -triple s390x-linux -emit-llvm -o - %s | FileCheck %s

int foo_012(int x) {
// CHECK-LABEL: @foo_012
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 0 || cc == 1 || cc == 2 ? 42 : 0;
}

int foo_013(int x) {
// CHECK-LABEL: @foo_013
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 0 || cc == 1 || cc == 3 ? 42 : 0;
}

int foo_023(int x) {
// CHECK-LABEL: @foo_023
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 0 || cc == 2 || cc == 3 ? 42 : 0;
}

int foo_123(int x) {
// CHECK-LABEL: @foo_123
// CHECK: = tail call { i32, i32 } asm "ahi $0,42\0A", "=d,={@cc},0"(i32 %x)
int cc;
asm ("ahi %[x],42\n" : [x] "+d"(x), "=@cc" (cc));
return cc == 1 || cc == 2 || cc == 3 ? 42 : 0;
}
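
Each of the four test functions accepts three of the four possible condition-code values. As a small illustration of why the range [0, 4) matters here: once the value is known to lie in that range, each three-way disjunction is equivalent to a single inequality against the excluded value, which is presumably the kind of simplification the range assumption is meant to permit. The helper names below are hypothetical and only mirror the return expressions of the tests.

```cpp
// Each predicate mirrors the condition in one of the test functions above.
bool accepts012(int CC) { return CC == 0 || CC == 1 || CC == 2; }
bool accepts013(int CC) { return CC == 0 || CC == 1 || CC == 3; }
bool accepts023(int CC) { return CC == 0 || CC == 2 || CC == 3; }
bool accepts123(int CC) { return CC == 1 || CC == 2 || CC == 3; }

// Returns true when, for every CC in [0, 4), each disjunction agrees with a
// single "not equal" test against the one excluded value.
bool disjunctionsCollapseOnRange() {
  for (int CC = 0; CC < 4; ++CC) {
    if (accepts012(CC) != (CC != 3)) return false;
    if (accepts013(CC) != (CC != 2)) return false;
    if (accepts023(CC) != (CC != 1)) return false;
    if (accepts123(CC) != (CC != 0)) return false;
  }
  return true;
}
```

Outside the range the equivalence breaks down: for CC == 4 the first disjunction is false while CC != 3 is true, so the rewrite is only valid under the range assumption.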
4 changes: 4 additions & 0 deletions clang/test/Preprocessor/systemz_asm_flag_outut.c
@@ -0,0 +1,4 @@
// RUN: %clang -target systemz-unknown-unknown -x c -E -dM -o - %s | FileCheck -match-full-lines %s
// RUN: %clang -target s390x-unknown-unknown -x c -E -dM -o - %s | FileCheck -match-full-lines %s

// CHECK: #define __GCC_ASM_FLAG_OUTPUTS__ 1
3 changes: 3 additions & 0 deletions llvm/include/llvm/CodeGen/TargetLowering.h
@@ -5107,6 +5107,9 @@ class TargetLowering : public TargetLoweringBase {
std::vector<SDValue> &Ops,
SelectionDAG &DAG) const;

// Lower switch statement for flag output operand with SRL/IPM Sequence.
virtual bool canLowerSRL_IPM_Switch(SDValue Cond) const;

// Lower custom output constraints. If invalid, return SDValue().
virtual SDValue LowerAsmOutputForConstraint(SDValue &Chain, SDValue &Glue,
const SDLoc &DL,
62 changes: 53 additions & 9 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -2837,8 +2837,34 @@ void SelectionDAGBuilder::visitBr(const BranchInst &I) {
Opcode = Instruction::And;
else if (match(BOp, m_LogicalOr(m_Value(BOp0), m_Value(BOp1))))
Opcode = Instruction::Or;

- if (Opcode &&
auto &TLI = DAG.getTargetLoweringInfo();
const auto checkSRLIPM = [&TLI](const SDValue &Op) {
if (!Op.getNumOperands())
return false;
SDValue OpVal = Op.getOperand(0);
SDNode *N = OpVal.getNode();
if (N && N->getOpcode() == ISD::SRL)
return TLI.canLowerSRL_IPM_Switch(OpVal);
else if (N && OpVal.getNumOperands() &&
(N->getOpcode() == ISD::AND || N->getOpcode() == ISD::OR)) {
SDValue OpVal1 = OpVal.getOperand(0);
SDNode *N1 = OpVal1.getNode();
if (N1 && N1->getOpcode() == ISD::SRL)
return TLI.canLowerSRL_IPM_Switch(OpVal1);
}
return false;
};
// Incoming IR here is straight line code, FindMergedConditions splits
// condition code sequence across Basic Block. DAGCombiner can't combine
// across Basic Block. Identify SRL/IPM/CC sequence for SystemZ and avoid
// transformation in FindMergedConditions.
bool BrSrlIPM = false;
if (NodeMap.count(BOp0) && NodeMap[BOp0].getNode()) {
BrSrlIPM |= checkSRLIPM(getValue(BOp0));
if (NodeMap.count(BOp1) && NodeMap[BOp1].getNode())
BrSrlIPM &= checkSRLIPM(getValue(BOp1));
}
Member

This is already better than a target check, but there's still a whole lot of implicitly target-specific code here. There really shouldn't be a generic callback canLowerSRL_IPM_Switch - that even explicitly refers to SystemZ instruction names! If there's target-specific behavior needed here, this should be better abstracted.

Note that there is already a target hook to guide whether or not this transformation should be performed: the getJumpConditionMergingParams callback, which provides input to shouldKeepJumpConditionsTogether. I think you should investigate whether we can create a SystemZ-specific implementation of that callback that has the desired effect of inhibiting this transformation in the cases we care about. That should then work without any common-code change to this function.

if (Opcode && !BrSrlIPM &&
!(match(BOp0, m_ExtractElt(m_Value(Vec), m_Value())) &&
match(BOp1, m_ExtractElt(m_Specific(Vec), m_Value()))) &&
!shouldKeepJumpConditionsTogether(
@@ -12112,18 +12138,36 @@ void SelectionDAGBuilder::lowerWorkItem(SwitchWorkListItem W, Value *Cond,
const APInt &SmallValue = Small.Low->getValue();
const APInt &BigValue = Big.Low->getValue();

// Incoming IR is a switch table. Identify SRL/IPM/CC sequence for SystemZ
// and we want to avoid splitting condition code sequence across basic
// block for cases like (CC == 0) || (CC == 2) || (CC == 3), or
// (CC == 0) || (CC == 1) ^ (CC == 3), there could potentially be
// more cases like this.
const TargetLowering &TLI = DAG.getTargetLoweringInfo();
bool IsSrlIPM = false;
if (NodeMap.count(Cond) && NodeMap[Cond].getNode())
IsSrlIPM = TLI.canLowerSRL_IPM_Switch(getValue(Cond));
// Check that there is only one bit different.
APInt CommonBit = BigValue ^ SmallValue;
- if (CommonBit.isPowerOf2()) {
+ if (CommonBit.isPowerOf2() || IsSrlIPM) {
SDValue CondLHS = getValue(Cond);
EVT VT = CondLHS.getValueType();
SDLoc DL = getCurSDLoc();

- SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
- DAG.getConstant(CommonBit, DL, VT));
- SDValue Cond = DAG.getSetCC(
- DL, MVT::i1, Or, DAG.getConstant(BigValue | SmallValue, DL, VT),
- ISD::SETEQ);
SDValue Cond;

if (CommonBit.isPowerOf2()) {
SDValue Or = DAG.getNode(ISD::OR, DL, VT, CondLHS,
DAG.getConstant(CommonBit, DL, VT));
Cond = DAG.getSetCC(DL, MVT::i1, Or,
DAG.getConstant(BigValue | SmallValue, DL, VT),
ISD::SETEQ);
} else if (IsSrlIPM && BigValue == 3 && SmallValue == 0) {
SDValue SetCC =
DAG.getSetCC(DL, MVT::i32, CondLHS,
DAG.getConstant(SmallValue, DL, VT), ISD::SETEQ);
Cond = DAG.getSetCC(DL, MVT::i32, SetCC,
DAG.getConstant(BigValue, DL, VT), ISD::SETEQ);
}
Member

Again, this very SystemZ-specific optimization shouldn't really be here. Doesn't this just revert the decision to introduce a switch statement that was made above? Could this not be handled either by visitBr above via the getJumpConditionMergingParams callback, or else fully in SystemZ DAGCombine code?
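
For context on the existing code path that this hunk extends: when two case values differ in exactly one bit, the test "X is Small or X is Big" can be folded into a single comparison, (X | CommonBit) == (Big | Small). The sketch below restates that on plain integers instead of APInt and SDValue. The function names are hypothetical.

```cpp
#include <cstdint>

// True when Small and Big differ in exactly one bit, i.e. their XOR is a
// power of two (the isPowerOf2 check in the hunk above).
bool differInOneBit(std::uint32_t Small, std::uint32_t Big) {
  std::uint32_t CommonBit = Big ^ Small;
  return CommonBit != 0 && (CommonBit & (CommonBit - 1)) == 0;
}

// The merged comparison built by the existing code:
// (X | CommonBit) == (Big | Small).
bool mergedCompare(std::uint32_t X, std::uint32_t Small, std::uint32_t Big) {
  std::uint32_t CommonBit = Big ^ Small;
  return (X | CommonBit) == (Big | Small);
}
```

For Small = 0 and Big = 2 the merged comparison accepts exactly 0 and 2. For Small = 0 and Big = 3 the values differ in two bits, and the same formula would also accept 1 and 2, which is why that pair does not qualify for the existing path.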


// Update successor info.
// Both Small and Big will jump to Small.BB, so we sum up the
4 changes: 4 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -5576,6 +5576,10 @@ const char *TargetLowering::LowerXConstraint(EVT ConstraintVT) const {
return nullptr;
}

bool TargetLowering::canLowerSRL_IPM_Switch(SDValue Cond) const {
return false;
}

SDValue TargetLowering::LowerAsmOutputForConstraint(
SDValue &Chain, SDValue &Glue, const SDLoc &DL,
const AsmOperandInfo &OpInfo, SelectionDAG &DAG) const {