-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[InstCombine] Try optimizing with knownbits which determined from Cond #91762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
09bc4ef
deeef25
eed9965
34ae5fe
285aeb7
bd20058
413ab0a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1809,6 +1809,198 @@ static Instruction *foldSelectICmpEq(SelectInst &SI, ICmpInst *ICI, | |
return nullptr; | ||
} | ||
|
||
// ICmpInst of SelectInst is not included in the calculation of KnownBits | ||
// so we are missing the opportunity to optimize the Value of the True or | ||
// False Condition via ICmpInst with KnownBits. | ||
// | ||
// Consider: | ||
// %or = or i32 %x, %y | ||
// %or0 = icmp eq i32 %or, 0 | ||
// %and = and i32 %x, %y | ||
// %cond = select i1 %or0, i32 %and, i32 %or | ||
// ret i32 %cond | ||
// | ||
// Expect: | ||
// %or = or i32 %x, %y | ||
// ret i32 %or | ||
// | ||
// We could know what bit was enabled for %x, %y by ICmpInst in SelectInst. | ||
static Instruction *foldSelectICmpBinOp(SelectInst &SI, ICmpInst *ICI, | ||
Value *CmpLHS, Value *CmpRHS, | ||
Value *TVal, Value *FVal, | ||
InstCombinerImpl &IC) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would think you could want to call this twice w/ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. sorry, I didn't understood what is your point. would you explain more detail for newbie? as I know, swap of true and false need to swap compare also. so, we test "equal" compare changed to "not equal" for cases of each. and we also test exchange %x, %y. this is added tests with this commit:
if you explain what you considering then it very helpful for me to improve knowledge of llvm. |
||
Value *X, *Y; | ||
const APInt *C; | ||
unsigned CmpLHSOpc; | ||
bool IsDisjoint = false; | ||
// Specially handling for X^Y==0 transformed to X==Y | ||
if (match(TVal, m_c_BitwiseLogic(m_Specific(CmpLHS), m_Specific(CmpRHS)))) { | ||
X = CmpLHS; | ||
Y = CmpRHS; | ||
APInt ZeroVal = APInt::getZero(CmpLHS->getType()->getScalarSizeInBits()); | ||
C = const_cast<APInt *>(&ZeroVal); | ||
CmpLHSOpc = Instruction::Xor; | ||
} else if ((match(CmpLHS, m_BinOp(m_Value(X), m_Value(Y))) && | ||
match(CmpRHS, m_APInt(C))) && | ||
(match(TVal, m_c_BinOp(m_Specific(X), m_Value())) || | ||
match(TVal, m_c_BinOp(m_Specific(Y), m_Value())))) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. afaict, these should only be matching bitwise logic |
||
if (auto Inst = dyn_cast<PossiblyDisjointInst>(CmpLHS)) { | ||
if (Inst->isDisjoint()) | ||
IsDisjoint = true; | ||
CmpLHSOpc = Instruction::Or; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you at the very least assert that this is correct. not an issue now, but we may just |
||
} else | ||
CmpLHSOpc = cast<BinaryOperator>(CmpLHS)->getOpcode(); | ||
} else | ||
return nullptr; | ||
|
||
enum SpecialKnownBits { | ||
NothingSpecial = 0, | ||
NoCommonBits = 1 << 1, | ||
AllCommonBits = 1 << 2, | ||
AllBitsEnabled = 1 << 3, | ||
}; | ||
|
||
// We cannot know exactly what bits is known in X Y. | ||
// Instead, we just know what relationship exist for. | ||
auto isSpecialKnownBitsFor = [&]() -> unsigned { | ||
if (CmpLHSOpc == Instruction::And) { | ||
if (C->isZero()) | ||
return NoCommonBits; | ||
} else if (CmpLHSOpc == Instruction::Xor) { | ||
if (C->isAllOnes()) | ||
return NoCommonBits | AllBitsEnabled; | ||
if (C->isZero()) | ||
return AllCommonBits; | ||
} else if (CmpLHSOpc == Instruction::Or && IsDisjoint) { | ||
if (C->isAllOnes()) | ||
return NoCommonBits | AllBitsEnabled; | ||
return NoCommonBits; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
|
||
return NothingSpecial; | ||
}; | ||
|
||
auto hasOperandAt = [&](Instruction *I, Value *Op) -> int { | ||
for (unsigned Idx = 0; Idx < I->getNumOperands(); Idx++) { | ||
if (I->getOperand(Idx) == Op) | ||
return Idx + 1; | ||
} | ||
return 0; | ||
}; | ||
|
||
Type *TValTy = TVal->getType(); | ||
unsigned BitWidth = TVal->getType()->getScalarSizeInBits(); | ||
auto TValBop = cast<BinaryOperator>(TVal); | ||
unsigned XOrder = hasOperandAt(TValBop, X); | ||
unsigned YOrder = hasOperandAt(TValBop, Y); | ||
unsigned SKB = isSpecialKnownBitsFor(); | ||
|
||
KnownBits Known; | ||
if (TValBop->isBitwiseLogicOp()) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since you only support BitWise, can you just early return if not bitwise to reduce the level of nested ifs. |
||
// We handle if we know specific knownbits from cond of selectinst. | ||
// ex) X&Y==-1 ? X^Y : False | ||
if (SKB != SpecialKnownBits::NothingSpecial && XOrder && YOrder) { | ||
// No common bits between X, Y | ||
if (SKB & SpecialKnownBits::NoCommonBits) { | ||
if (SKB & (SpecialKnownBits::AllBitsEnabled)) { | ||
// If X op Y == -1, then XOR must be -1 | ||
if (TValBop->getOpcode() == Instruction::Xor) | ||
Known = KnownBits::makeConstant(APInt(BitWidth, -1)); | ||
} | ||
// If Trueval is X&Y then it should be 0. | ||
if (TValBop->getOpcode() == Instruction::And) | ||
Known = KnownBits::makeConstant(APInt(BitWidth, 0)); | ||
// X|Y can be replace with X^Y, X^Y can be replace with X|Y | ||
// This replacing is meaningful when falseval is same. | ||
else if ((match(TVal, m_c_Or(m_Specific(X), m_Specific(Y))) && | ||
match(FVal, m_c_Xor(m_Specific(X), m_Specific(Y)))) || | ||
(match(TVal, m_c_Xor(m_Specific(X), m_Specific(Y))) && | ||
match(FVal, m_c_Or(m_Specific(X), m_Specific(Y))))) | ||
return IC.replaceInstUsesWith(SI, FVal); | ||
// All common bits between X, Y | ||
} else if (SKB & SpecialKnownBits::AllCommonBits) { | ||
// We can replace (X&Y) and (X|Y) to X or Y | ||
if (TValBop->getOpcode() == Instruction::And || | ||
TValBop->getOpcode() == Instruction::Or) | ||
if (TValBop->hasOneUse()) | ||
return IC.replaceOperand(SI, 1, X); | ||
} else if (SKB & SpecialKnownBits::AllBitsEnabled) { | ||
// We can replace (X|Y) to -1 | ||
if (TValBop->getOpcode() == Instruction::Or) | ||
Known = KnownBits::makeConstant(APInt(BitWidth, -1)); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would organize this code a bit differently, instead of going through the knownbits cases and then applying rules to ops, I would Think that will be easier to follow/extend. |
||
} | ||
} else { | ||
KnownBits XKnown, YKnown, Temp; | ||
KnownBits TValBop0KB, TValBop1KB; | ||
// computeKnowBits calculates the KnownBits in the branching condition | ||
// that the specified variable passes in the execution flow. however, it | ||
// does not contain the SelectInst condition, so there is an optimization | ||
// opportunity to update the knownbits obtained by calculating KnownBits | ||
// with the SelectInst condition. | ||
XKnown = IC.computeKnownBits(X, 0, &SI); | ||
IC.computeKnownBitsFromCond(X, ICI, XKnown, 0, &SI, false); | ||
YKnown = IC.computeKnownBits(Y, 0, &SI); | ||
IC.computeKnownBitsFromCond(Y, ICI, YKnown, 0, &SI, false); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. comment constants here and above. |
||
CmpInst::Predicate Pred = ICI->getPredicate(); | ||
if (Pred == ICmpInst::ICMP_EQ) { | ||
// Estimate additional KnownBits from the relationship between X and Y | ||
if (CmpLHSOpc == Instruction::And) { | ||
// The bit that are set to 1 at `~C&Y` must be 0 in X | ||
// The bit that are set to 1 at `~C&X` must be 0 in Y | ||
XKnown.Zero |= ~*C & YKnown.One; | ||
YKnown.Zero |= ~*C & XKnown.One; | ||
} | ||
if (CmpLHSOpc == Instruction::Or) { | ||
// The bit that are set to 0 at `C&Y` must be 1 in X | ||
// The bit that are set to 0 at `C&X` must be 1 in Y | ||
XKnown.One |= *C & YKnown.Zero; | ||
YKnown.One |= *C & XKnown.Zero; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Think we should be handling the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Likewise for the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, we can't really do that w/ current infra. Although I think a better way to do this would be to create a new helper in ValueTracking for computing known x/y under the assumption of a edit: The difference from existing APIs is that we have knownx/knowny There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. actually, these discovering way of knownbits is exist in valuetracker before but it removed more recently. but I think it can be used for this commit. so, I'm not have confidence about creating new api for this in ValueTracker. but it is beginner's opinion, I'll update code soon. |
||
} | ||
if (CmpLHSOpc == Instruction::Xor) { | ||
// If X^Y==C, then X and Y must be either (1,0) or (0,1) for the | ||
// enabled bits in C. | ||
XKnown.One |= *C & YKnown.Zero; | ||
XKnown.Zero |= *C & YKnown.One; | ||
YKnown.One |= *C & XKnown.Zero; | ||
YKnown.Zero |= *C & XKnown.One; | ||
// If X^Y==C, then X and Y must be either (0,0) or (1,1) for the | ||
// disabled bits in C. | ||
XKnown.Zero |= ~*C & YKnown.Zero; | ||
XKnown.One |= ~*C & YKnown.One; | ||
YKnown.Zero |= ~*C & XKnown.Zero; | ||
YKnown.One |= ~*C & XKnown.One; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I really don't understand what you are going for here... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. when we handle selectInst which have cond as explain, b. XKnown.Zero |= *C & YKnown.One; after we discovering it, we can use it usefully to trueval. this is one of test which contains in pull-request.
we have cond as so, We can treat X as 3 in trueval of this selectinst ,and it can possible replacing |
||
} | ||
} | ||
|
||
// If TrueVal has X or Y, return the corresponding KnownBits, otherwise | ||
// compute and return new KnownBits. | ||
auto getTValBopKB = [&](unsigned OpNum) -> KnownBits { | ||
unsigned Order = OpNum + 1; | ||
if (Order == XOrder) | ||
return XKnown; | ||
else if (Order == YOrder) | ||
return YKnown; | ||
|
||
Value *V = TValBop->getOperand(OpNum); | ||
KnownBits Known = IC.computeKnownBits(V, 0, &SI); | ||
return Known; | ||
}; | ||
TValBop0KB = getTValBopKB(0); | ||
TValBop1KB = getTValBopKB(1); | ||
Known = analyzeKnownBitsFromAndXorOr( | ||
cast<Operator>(TValBop), TValBop0KB, TValBop1KB, 0, | ||
IC.getSimplifyQuery().getWithInstruction(&SI)); | ||
} | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a desperate need for comments in the above code. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll update soon! |
||
|
||
if (Known.isConstant()) { | ||
auto Const = ConstantInt::get(TValTy, Known.getConstant()); | ||
return IC.replaceOperand(SI, 1, Const); | ||
} | ||
|
||
return nullptr; | ||
} | ||
|
||
/// Visit a SelectInst that has an ICmpInst as its first operand. | ||
Instruction *InstCombinerImpl::foldSelectInstWithICmp(SelectInst &SI, | ||
ICmpInst *ICI) { | ||
|
@@ -1951,6 +2143,10 @@ Instruction *InstCombinerImpl::foldSelectInstWithICmp(SelectInst &SI, | |
if (Value *V = foldAbsDiff(ICI, TrueVal, FalseVal, Builder)) | ||
return replaceInstUsesWith(SI, V); | ||
|
||
if (Instruction *NewSel = foldSelectICmpBinOp(SI, ICI, CmpLHS, CmpRHS, | ||
TrueVal, FalseVal, *this)) | ||
return NewSel; | ||
|
||
return Changed ? &SI : nullptr; | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/ValueTracking.cpp#L1046?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, same way to use computeKnownBitsFromCond, but computeKnownBitsFromOperator would compute KnownBits of SelectInst. I use Cond from SelectInst to estimate the KnownBits of the variables that make up Cond. The key is that if we can use it to estimate the KnownBits of Trueval, there is a difference in performing the optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see what you mean. This seems very similar to #88298, I think generally this code would fit better as an extension to foldSelectValueEquivilence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still feel like the code fits better in
foldSelectValueEquivilence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goldsteinn
I changed it to call inside foldSelectValueEquivalence.
The order is changed so that the optimizations I added are performed first, preventing the previously performed optimizations from being performed. If this happens, should I change the order of the other optimization?