-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[Clang] Add wraps attribute (for granular integer overflow handling) #86618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen Author: Justin Stitt (JustinStitt) ChangesIntroThis attribute would allow for more granular control over what expressions can emit integer overflow warnings or integer overflow sanitizer errors. Here are some examples:copy-pasted from my RFC. When compiling with typedef int __attribute__((wraps)) wrapping_int;
wrapping_int A = INT_MAX;
++A; /* This will not trigger a runtime sanitizer error */ unsigned long addr __attribute__((wraps)); /* wraps can be added to variables */
unsigned long size; /* both operands don't need wraps, just one */
...
if (addr + size < addr) {
// handle overflow gracefully, because the overflow sanitizer won't trip
}
... typedef int __attribute__((wraps)) wrapping_int;
int a = (wrapping_int) INT_MAX + 1; // no sanitizer trip (or -Winteger-overflow) The next few examples will show some cases where sanitizers will still trip (by design): wrapping_int A = INT_MAX;
int B = 1;
A = INT_MAX + 1; // overflow sanitizer trips! (INT_MAX + 1) is computed and trips the sanitizer before assignment to A
B = (INT_MAX+B) * A; // overflow sanitizer trips!
B = INT_MAX + B * A; // overflow sanitizer does NOT trip!
// This is because B * A is computed first, the result type carries the wraps attribute
// with it from A, then that result is used in summation with INT_MAX. ReasoningThe root cause of many security vulnerabilities in the Linux kernel is arithmetic overflow. The currently available choices to control arithmetic overflow behavior are very limited. The instrumentation is "all or nothing" which lacks the granularity required for many cases. Enabling or disabling a sanitizer like "if there's some unsigned wrap-around checker that doesn't This was in response to a patch that was trying to change the commonly accepted idiom: Recently, there's been some effort by Kees, myself and others to reintroduce the signed and unsigned integer overflow sanitizers in the Linux Kernel. Upon turning these sanitizers back on (or for the case of signed-integer-overflow, making it work at all) we encountered plently of existing instances of integer overflow in the Kernel. However, there's some pushback when trying to alter the "traditional" way of doing things. With this new wrapping attribute we can specifically craft types that disable overflow instrumentation, without modifying traditional and widely understood code patterns -- resulting in easier to read code. To summarize the behavior:Using Other Notes
CCs@kees @nickdesaulniers @bwendling @erichkeane @haoNoQ @apple-fcloutier @FlashSheridan Patch is 25.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/86618.diff 15 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 7fbe2fec6ca065..20bb9815830592 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -248,6 +248,13 @@ Attribute Changes in Clang
added a new extension query ``__has_extension(swiftcc)`` corresponding to the
``__attribute__((swiftcc))`` attribute.
+- Introduced ``__attribute((wraps))`` or ``[[wraps]]`` which can be added to
+ type or variable declarations. Using an attributed type or variable in an
+ arithmetic expression will define the overflow behavior for that expression
+ as having two's complement wrap-around. These expressions cannot trigger
+ integer overflow warnings or sanitizer warnings. They also cannot be
+ optimized away by some eager UB optimizations.
+
Improvements to Clang's diagnostics
-----------------------------------
- Clang now applies syntax highlighting to the code snippets it
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index 6e153ebe024b42..934146e8a182bc 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -4084,6 +4084,9 @@ class BinaryOperator : public Expr {
static unsigned sizeOfTrailingObjects(bool HasFPFeatures) {
return HasFPFeatures * sizeof(FPOptionsOverride);
}
+
+ /// Do one of the subexpressions have the wraps attribute?
+ bool oneOfWraps(const ASTContext &Ctx) const;
};
/// CompoundAssignOperator - For compound assignments (e.g. +=), we keep
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 3e03e55612645b..0ea7755791d82e 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -4496,3 +4496,9 @@ def CodeAlign: StmtAttr {
static constexpr int MaximumAlignment = 4096;
}];
}
+
+def Wraps : DeclOrTypeAttr {
+ let Spellings = [Clang<"wraps">, CXX11<"", "wraps", 202403>];
+ let Subjects = SubjectList<[Var, TypedefName, Field]>;
+ let Documentation = [WrapsDocs];
+}
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 9de14f608fd114..af662702edcffa 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -8005,3 +8005,69 @@ requirement:
}
}];
}
+
+def WrapsDocs : Documentation {
+ let Category = DocCatField;
+ let Content = [{
+This attribute can be used with type or variable declarations to denote that
+arithmetic containing these marked components have defined overflow behavior.
+Specifically, the behavior is defined as being consistent with two's complement
+wrap-around. For the purposes of sanitizers or warnings that concern themselves
+with the definedness of integer arithmetic, they will cease to instrument or
+warn about arithmetic that directly involves a "wrapping" component.
+
+For example, ``-fsanitize=signed-integer-overflow`` or ``-Winteger-overflow``
+will not warn about suspicious overflowing arithmetic -- assuming correct usage
+of the wraps attribute.
+
+This example shows some basic usage of ``__attribute__((wraps))`` on a type
+definition when building with ``-fsanitize=signed-integer-overflow``
+
+.. code-block:: c
+ typedef int __attribute__((wraps)) wrapping_int;
+
+ void foo() {
+ wrapping_int a = INT_MAX;
+ ++a; // no sanitizer warning
+ }
+
+ int main() { foo(); }
+
+In the following example, we use ``__attribute__((wraps))`` on a variable to
+disable overflow instrumentation for arithmetic expressions it appears in. We
+do so with a popular overflow-checking pattern which we might not want to trip
+sanitizers (like ``-fsanitize=unsigned-integer-overflow``).
+
+.. code-block:: c
+ void foo(int offset) {
+ unsigned int A __attribute__((wraps)) = UINT_MAX;
+
+ // to check for overflow using this pattern, we may perform a real overflow
+ // thus triggering sanitizers to step in. Since A is "wrapping", we can be
+ // sure there are no sanitizer warnings.
+ if (A + offset < A) {
+ // handle overflow manually
+ // ...
+ return;
+ }
+
+ // now, handle non-overflow case
+ // ...
+ }
+
+The above example demonstrates some of the power and elegance this attribute
+provides. We can use code patterns we are already familiar with (like ``if (x +
+y < x)``) while gaining control over the overflow behavior on a case-by-case
+basis.
+
+When combined with ``-fwrapv``, this attribute can still be applied as normal
+but has no function apart from annotating types and variables for readers. This
+is because ``-fwrapv`` defines all arithmetic as being "wrapping", rending this
+attribute's efforts redundant.
+
+When using this attribute without ``-fwrapv`` and without any sanitizers, it
+still has an impact on the definedness of arithmetic expressions containing
+wrapping components. Since the behavior of said expressions is now technically
+defined, the compiler will forgo some eager optimizations that are used on
+expressions containing UB.}];
+}
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index fc727cef9cd835..c6e0fec9856cd4 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6534,6 +6534,9 @@ def err_counted_by_attr_refer_to_union : Error<
def note_flexible_array_counted_by_attr_field : Note<
"field %0 declared here">;
+def warn_wraps_attr_var_decl_type_not_integer : Warning<
+ "using attribute 'wraps' with non-integer type '%0' has no function">;
+
let CategoryName = "ARC Semantic Issue" in {
// ARC-mode diagnostics.
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 5ecd2f9eb2881f..1989bbfe7045f0 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -3809,6 +3809,10 @@ class Sema final {
void AddAnnotationAttr(Decl *D, const AttributeCommonInfo &CI,
StringRef Annot, MutableArrayRef<Expr *> Args);
+ /// AddWrapsAttr - Adds the "wraps" attribute to a particular
+ /// declaration.
+ void AddWrapsAttr(Decl *D, const AttributeCommonInfo &CI);
+
bool checkMSInheritanceAttrOnDefinition(CXXRecordDecl *RD, SourceRange Range,
bool BestCase,
MSInheritanceModel SemanticSpelling);
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index 6221ebd5c9b4e9..a9c5f02ddd4093 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -2243,6 +2243,21 @@ bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext &Ctx,
return true;
}
+bool BinaryOperator::oneOfWraps(const ASTContext &Ctx) const {
+ llvm::SmallVector<Expr *, 2> Both = {getLHS(), getRHS()};
+
+ for (const Expr *oneOf : Both) {
+ if (!oneOf)
+ continue;
+ if (auto *TypePtr =
+ oneOf->IgnoreParenImpCasts()->getType().getTypePtrOrNull())
+ if (TypePtr->hasAttr(attr::Wraps)) {
+ return true;
+ }
+ }
+ return false;
+}
+
SourceLocExpr::SourceLocExpr(const ASTContext &Ctx, SourceLocIdentKind Kind,
QualType ResultTy, SourceLocation BLoc,
SourceLocation RParenLoc,
@@ -4757,6 +4772,8 @@ BinaryOperator::BinaryOperator(const ASTContext &Ctx, Expr *lhs, Expr *rhs,
if (hasStoredFPFeatures())
setStoredFPFeatures(FPFeatures);
setDependence(computeDependence(this));
+ if (oneOfWraps(Ctx))
+ setType(Ctx.getAttributedType(attr::Wraps, getType(), getType()));
}
BinaryOperator::BinaryOperator(const ASTContext &Ctx, Expr *lhs, Expr *rhs,
@@ -4774,6 +4791,8 @@ BinaryOperator::BinaryOperator(const ASTContext &Ctx, Expr *lhs, Expr *rhs,
if (hasStoredFPFeatures())
setStoredFPFeatures(FPFeatures);
setDependence(computeDependence(this));
+ if (oneOfWraps(Ctx))
+ setType(Ctx.getAttributedType(attr::Wraps, getType(), getType()));
}
BinaryOperator *BinaryOperator::CreateEmpty(const ASTContext &C,
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 592d43597dc1b4..29b7836c940a00 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -2775,7 +2775,8 @@ static bool CheckedIntArithmetic(EvalInfo &Info, const Expr *E,
APSInt Value(Op(LHS.extend(BitWidth), RHS.extend(BitWidth)), false);
Result = Value.trunc(LHS.getBitWidth());
if (Result.extend(BitWidth) != Value) {
- if (Info.checkingForUndefinedBehavior())
+ if (Info.checkingForUndefinedBehavior() &&
+ !E->getType().getTypePtr()->hasAttr(attr::Wraps))
Info.Ctx.getDiagnostics().Report(E->getExprLoc(),
diag::warn_integer_constant_overflow)
<< toString(Result, 10, Result.isSigned(), /*formatAsCLiteral=*/false,
@@ -13964,7 +13965,8 @@ bool IntExprEvaluator::VisitUnaryOperator(const UnaryOperator *E) {
if (!Result.isInt()) return Error(E);
const APSInt &Value = Result.getInt();
if (Value.isSigned() && Value.isMinSignedValue() && E->canOverflow()) {
- if (Info.checkingForUndefinedBehavior())
+ if (Info.checkingForUndefinedBehavior() &&
+ !E->getType().getTypePtr()->hasAttr(attr::Wraps))
Info.Ctx.getDiagnostics().Report(E->getExprLoc(),
diag::warn_integer_constant_overflow)
<< toString(Value, 10, Value.isSigned(), /*formatAsCLiteral=*/false,
diff --git a/clang/lib/AST/TypePrinter.cpp b/clang/lib/AST/TypePrinter.cpp
index 7032ff2f18468c..52042666cb82cc 100644
--- a/clang/lib/AST/TypePrinter.cpp
+++ b/clang/lib/AST/TypePrinter.cpp
@@ -1947,6 +1947,9 @@ void TypePrinter::printAttributedAfter(const AttributedType *T,
case attr::AArch64SVEPcs: OS << "aarch64_sve_pcs"; break;
case attr::AMDGPUKernelCall: OS << "amdgpu_kernel"; break;
case attr::IntelOclBicc: OS << "inteloclbicc"; break;
+ case attr::Wraps:
+ OS << "wraps";
+ break;
case attr::PreserveMost:
OS << "preserve_most";
break;
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 8536570087ad0f..7c6fe78454709f 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -146,6 +146,15 @@ struct BinOpInfo {
return UnOp->getSubExpr()->getType()->isFixedPointType();
return false;
}
+
+ /// Does the BinaryOperator have the wraps attribute?
+ /// If so, we can ellide overflow sanitizer checks.
+ bool oneOfWraps() const {
+ const Type *TyPtr = E->getType().getTypePtrOrNull();
+ if (TyPtr)
+ return TyPtr->hasAttr(attr::Wraps);
+ return false;
+ }
};
static bool MustVisitNullValue(const Expr *E) {
@@ -724,6 +733,11 @@ class ScalarExprEmitter
// Binary Operators.
Value *EmitMul(const BinOpInfo &Ops) {
+ if ((Ops.Ty->isSignedIntegerOrEnumerationType() ||
+ Ops.Ty->isUnsignedIntegerType()) &&
+ Ops.oneOfWraps())
+ return Builder.CreateMul(Ops.LHS, Ops.RHS, "mul");
+
if (Ops.Ty->isSignedIntegerOrEnumerationType()) {
switch (CGF.getLangOpts().getSignedOverflowBehavior()) {
case LangOptions::SOB_Defined:
@@ -2685,6 +2699,9 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
} else if (type->isIntegerType()) {
QualType promotedType;
bool canPerformLossyDemotionCheck = false;
+ BinOpInfo Ops = (createBinOpInfoFromIncDec(
+ E, value, isInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
+
if (CGF.getContext().isPromotableIntegerType(type)) {
promotedType = CGF.getContext().getPromotedIntegerType(type);
assert(promotedType != type && "Shouldn't promote to the same type.");
@@ -2727,10 +2744,12 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
// Note that signed integer inc/dec with width less than int can't
// overflow because of promotion rules; we're just eliding a few steps
// here.
- } else if (E->canOverflow() && type->isSignedIntegerOrEnumerationType()) {
+ } else if (E->canOverflow() && type->isSignedIntegerOrEnumerationType() &&
+ !Ops.oneOfWraps()) {
value = EmitIncDecConsiderOverflowBehavior(E, value, isInc);
} else if (E->canOverflow() && type->isUnsignedIntegerType() &&
- CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow)) {
+ CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&
+ !Ops.oneOfWraps()) {
value = EmitOverflowCheckedBinOp(createBinOpInfoFromIncDec(
E, value, isInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
} else {
@@ -3504,7 +3523,8 @@ Value *ScalarExprEmitter::EmitDiv(const BinOpInfo &Ops) {
if ((CGF.SanOpts.has(SanitizerKind::IntegerDivideByZero) ||
CGF.SanOpts.has(SanitizerKind::SignedIntegerOverflow)) &&
Ops.Ty->isIntegerType() &&
- (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow())) {
+ (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow()) &&
+ !Ops.oneOfWraps()) {
llvm::Value *Zero = llvm::Constant::getNullValue(ConvertType(Ops.Ty));
EmitUndefinedBehaviorIntegerDivAndRemCheck(Ops, Zero, true);
} else if (CGF.SanOpts.has(SanitizerKind::FloatDivideByZero) &&
@@ -3553,7 +3573,8 @@ Value *ScalarExprEmitter::EmitRem(const BinOpInfo &Ops) {
if ((CGF.SanOpts.has(SanitizerKind::IntegerDivideByZero) ||
CGF.SanOpts.has(SanitizerKind::SignedIntegerOverflow)) &&
Ops.Ty->isIntegerType() &&
- (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow())) {
+ (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow()) &&
+ !Ops.oneOfWraps()) {
CodeGenFunction::SanitizerScope SanScope(&CGF);
llvm::Value *Zero = llvm::Constant::getNullValue(ConvertType(Ops.Ty));
EmitUndefinedBehaviorIntegerDivAndRemCheck(Ops, Zero, false);
@@ -3918,6 +3939,11 @@ Value *ScalarExprEmitter::EmitAdd(const BinOpInfo &op) {
op.RHS->getType()->isPointerTy())
return emitPointerArithmetic(CGF, op, CodeGenFunction::NotSubtraction);
+ if ((op.Ty->isSignedIntegerOrEnumerationType() ||
+ op.Ty->isUnsignedIntegerType()) &&
+ op.oneOfWraps())
+ return Builder.CreateAdd(op.LHS, op.RHS, "add");
+
if (op.Ty->isSignedIntegerOrEnumerationType()) {
switch (CGF.getLangOpts().getSignedOverflowBehavior()) {
case LangOptions::SOB_Defined:
@@ -4074,6 +4100,10 @@ Value *ScalarExprEmitter::EmitFixedPointBinOp(const BinOpInfo &op) {
Value *ScalarExprEmitter::EmitSub(const BinOpInfo &op) {
// The LHS is always a pointer if either side is.
if (!op.LHS->getType()->isPointerTy()) {
+ if ((op.Ty->isSignedIntegerOrEnumerationType() ||
+ op.Ty->isUnsignedIntegerType()) &&
+ op.oneOfWraps())
+ return Builder.CreateSub(op.LHS, op.RHS, "sub");
if (op.Ty->isSignedIntegerOrEnumerationType()) {
switch (CGF.getLangOpts().getSignedOverflowBehavior()) {
case LangOptions::SOB_Defined:
@@ -4224,7 +4254,7 @@ Value *ScalarExprEmitter::EmitShl(const BinOpInfo &Ops) {
bool SanitizeSignedBase = CGF.SanOpts.has(SanitizerKind::ShiftBase) &&
Ops.Ty->hasSignedIntegerRepresentation() &&
!CGF.getLangOpts().isSignedOverflowDefined() &&
- !CGF.getLangOpts().CPlusPlus20;
+ !CGF.getLangOpts().CPlusPlus20 && !Ops.oneOfWraps();
bool SanitizeUnsignedBase =
CGF.SanOpts.has(SanitizerKind::UnsignedShiftBase) &&
Ops.Ty->hasUnsignedIntegerRepresentation();
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 0a62c656d824ff..6bc8d017d2edaf 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -4413,6 +4413,14 @@ void Sema::AddAlignValueAttr(Decl *D, const AttributeCommonInfo &CI, Expr *E) {
D->addAttr(::new (Context) AlignValueAttr(Context, CI, E));
}
+static void handleWrapsAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
+ S.AddWrapsAttr(D, AL);
+}
+
+void Sema::AddWrapsAttr(Decl *D, const AttributeCommonInfo &CI) {
+ D->addAttr(::new (Context) WrapsAttr(Context, CI));
+}
+
static void handleAlignedAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
if (AL.hasParsedType()) {
const ParsedType &TypeArg = AL.getTypeArg();
@@ -9683,10 +9691,12 @@ ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, const ParsedAttr &AL,
case ParsedAttr::AT_AvailableOnlyInDefaultEvalMethod:
handleAvailableOnlyInDefaultEvalMethod(S, D, AL);
break;
-
case ParsedAttr::AT_CountedBy:
handleCountedByAttrField(S, D, AL);
break;
+ case ParsedAttr::AT_Wraps:
+ handleWrapsAttr(S, D, AL);
+ break;
// Microsoft attributes:
case ParsedAttr::AT_LayoutVersion:
diff --git a/clang/lib/Sema/SemaType.cpp b/clang/lib/Sema/SemaType.cpp
index d7521a5363a3d2..e2dc290cb7fb82 100644
--- a/clang/lib/Sema/SemaType.cpp
+++ b/clang/lib/Sema/SemaType.cpp
@@ -6940,6 +6940,18 @@ static void HandleBTFTypeTagAttribute(QualType &Type, const ParsedAttr &Attr,
::new (Ctx) BTFTypeTagAttr(Ctx, Attr, BTFTypeTag), Type);
}
+static void handleWrapsAttr(QualType &Type, const ParsedAttr &Attr,
+ TypeProcessingState &State) {
+ Sema &S = State.getSema();
+ ASTContext &Ctx = S.Context;
+
+ if (!Type->isIntegerType())
+ S.Diag(Attr.getLoc(), diag::warn_wraps_attr_var_decl_type_not_integer)
+ << Type.getAsString();
+
+ Type = State.getAttributedType(::new (Ctx) WrapsAttr(Ctx, Attr), Type, Type);
+}
+
/// HandleAddressSpaceTypeAttribute - Process an address_space attribute on the
/// specified type. The attribute contains 1 argument, the id of the address
/// space for the type.
@@ -8929,6 +8941,9 @@ static void processTypeAttrs(TypeProcessingState &state, QualType &type,
HandleBTFTypeTagAttribute(type, attr, state);
attr.setUsedAsTypeAttr();
break;
+ case ParsedAttr::AT_Wraps:
+ handleWrapsAttr(type, attr, state);
+ break;
case ParsedAttr::AT_MayAlias:
// FIXME: This attribute needs to actually be handled, but if we ignore
diff --git a/clang/test/CodeGen/integer-overflow.c b/clang/test/CodeGen/integer-overflow.c
index 461b026d39615b..44c42ed9efe577 100644
--- a/clang/test/CodeGen/integer-overflow.c
+++ b/clang/test/CodeGen/integer-overflow.c
@@ -105,3 +105,59 @@ void test1(void) {
// TRAPV: call ptr @llvm.frameaddress.p0(i32 0)
// CATCH_UB: call ptr @llvm.frameaddress.p0(i32 0)
}
+
+// Tests for integer overflow using __attribute__((wraps))
+typedef int __attribute__((wraps)) wrapping_int;
+
+void test2(void) {
+ // DEFAULT-LABEL: define{{.*}} void @test2
+ // WRAPV-LABEL: define{{.*}} void @test2
+ // TRAPV-LABEL: define{{.*}} void @test2
+ extern volatile wrapping_int a, b, c;
+
+ // Basically, all cases should match the WRAPV case since this attribute
+ // effectively enables wrapv for expressions containing wrapping types.
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32
+ a = b + c;
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: sub i32
+ a = b - c;
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: mul i32
+ a = b * c;
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: sub i32 0,
+ a = -b;
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32 {{.*}}, 1
+ ++b;
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32 {{.*}}, -1
+ --b;
+
+ // Less trivial cases
+ extern volatile wrapping_int u, v;
+ extern volatile int w;
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32
+ if (u + v < u) {}
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32
+ for (;u + v < u;) {}
+
+ // this (w+1) should have instrumentation
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: call {{.*}} @llvm.sadd.with.overflow.i32
+ u = (w+1) + v;
+
+ // no parts of this expression should have instrumentation
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32 {{.*}}, 1
+ u = (v+1) + w;
+
+ // downcast off the wraps attribute
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: call { i32, i1 } @llvm.sadd.with.overflow.i32
+ u = (int) u + (int) v;
+
+ // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: call { i32, i1 } @llvm.sadd.with.overflow.i32
+ u = (int) u + w;
+}
diff --git a/clang/test/CodeGen/unsigned-overflow.c b/clang/test/CodeGen/unsigned-overflow.c
index 6c2f0c1efc145e..471a06e5f...
[truncated]
|
cc: @MaskRay @vitalybuka (since they also reviewed #82432) |
Adding attributes to types as type sugar (not part of the canonical type) is generally problematic: non-canonical types are not reliably preserved throughout the compiler, particularly in cases involving templates. Can we use a type qualifier here? |
This comment was marked as outdated.
This comment was marked as outdated.
Because we're dealing specifically with integers, which are a pretty limited class, you could consider introducing new types into the type system instead. Similar to how ext_vector_type works. That's maybe easier than qualifiers in the sense that the code already deals with a bunch of arithmetic types... but you probably still have to touch a lot of places to get everything working. Attributes on typedefs that don't correspond to a canonical type, like noderef/aligned/may_alias, are a constant source of issues; the user can't really tell if the attribute is actually working, or the compiler silently ate it. My team spent a lot of time dealing with a user having trouble with the aligned attribute... they way they were using it was completely broken, but they didn't know it was broken until a new compiler started optimizing more aggressively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sema.h
changes look good to me.
6cac20e
to
0f7e8c9
Compare
Hi, I've made some changes and am looking for some more review on this PR: |
This now passes my behavioral testing suite for wrapping; yay! (The earlier version didn't cover truncate, so this is very nice now.) |
✅ With the latest revision this PR passed the C/C++ code formatter. |
The most recent commits have massively simplified checking for the wrapping attributes. FWIW, the tests I am running are: Are there any other things we need to get this over the hill? |
Does this list address all of @efriedma-quic's concerns about adding an attribute to a type? |
Forbidding usage in C++ probably avoids the worst of the canonical-type issues, but there's still some potential for weird results. Particularly with type merging; for example, if you write |
I had a similar question in the general case: is the annotation contagious?
The |
Yes, it's "contagious":
|
7b59be9
to
a898dec
Compare
Ping! I've rebased this PR onto bcd150d and fixed the issues with the docs not building because of faulty code-blocks. Bigger picture, for the Linux Kernel, the current plan is to only enable certain integer sanitizers if support for this wraps attribute exists -- so it currently stands as a blocker for further arithmetic safety in the Kernel. (paraphrasing @kees) With that being said, we're looking for more review on for this PR or a maintainer with push-access to merge :>) |
I'm still not happy with the AST representation here. The current representation is likely to cause unpredictable results in edge cases, and compatibility constraints mean whatever result the current version happens to produce for those cases is locked in forever. And there isn't really any good way to mitigate those issues under the current representation. I don't want to block progress here, but I think in its current form, we're likely to end up with reported issues we can't fix. |
@efriedma-quic said:
The current design is heavily opt-in. So much so that this isn't just some compiler flag that you turn on. You have to go out of your way to design your code around this attribute, specifically marking things as wrapping with named types, etc. So I guess the questions are: Is the current description of this attribute not clear enough? Do we think users will be confused/surprised by its behavior? What edge cases are there (in C)? Can they be fixed right now so we don't have these compatibility issues in the future? |
If you have a select with both wrapping and non-wrapping operands, is the result wrapping? If you declare a variable as both wrapping and non-wrapping, is it wrapping? If you declare a function parameter both wrapping and non-wrapping, is it wrapping? If you assign to a wrapping variable, is the result wrapping? If you mark a The fact that with this patch, a wrapping int is "just" an int is both the strength and weakness: you don't need to write out all these interactions... but the result is just based on clang's internal preferences for preserving type sugar, which are not documented. |
The C standard is small enough that if you comb through the C grammar carefully, you can probably come up with explicitly rules for all the important constructs, and verify they work. There's probably still a long tail of weird interactions with current and future clang extensions, and the whole thing is still sort of fragile because it's built on undocumented rules for type sugar propagation, but maybe if we're careful enough with testing, we can solve most of the compatibility? I'd still prefer to have a distinct canonical type. |
Hi @efriedma-quic, thanks for dumping all these cases! I will respond to each one individually. I understand you think adding a type would be better so I am exploring that option (I am thinking something similar to extended vectors). But for now, I am trying to make this attribute as useful and compatible as possible.
I assume you're referring to a ternary statement? Currently, no. Although, I should add support for this similar to how BinaryOperators are considered wrapping if any of their operands are.
I am not sure how to do this. I am sure that with the magic of C anything is possible but I can't conceive a way to have a variable both have the attribute and not have the attribute (to be clear, there doesn't exist a
Same here, not sure what you mean.
It depends on the type of the assignee, which I discuss later with a wrap_int a;
int b = a; // b is not wrapping
wrap_int c = a; // c is wrapping
This concerns the
I don't think you can get a less-than-int from an equality operator as the result. With that being said, and for all BinaryOperators, the resulting type will be wrapping if any of the operands are wrapping.
Ah, nice corner case. This concerns the
In the cases where the attribute is "lost" it is similar to how const int a;
int b = a; // b is not const
// or
extern void foo(int some);
foo(a); // const-ness is stripped, @some is not const In the cases where the attribute sticks around, it follows these rules:
So, after reading all your cases my TODO list is:
|
Say you declare a global variable like the following:
Is "x" wrapping?
There are cases that don't involve truncation:
|
No. But I am confused, isn't this just shadowing a global variable with a lesser-scoped one. Are they the same? What behavior do we want here?
Arithmetic cannot be performed on less-than-ints so the implicit promotions are put in place which lose the attribute. I just worked on an update to carry the wraps attribute through implicit integer promotion. However, after making the patch and testing it I am not sure what we want. Do we want to cancel implicit promotions for less-than-ints that possess the wraps attribute (doesn't sound spec compliant)? Or, should wrapping things still undergo promotion but now wrap at different bounds than what their original type specified? If I make a wrapping char [-128, 127] and do some arithmetic on it it will be promoted to int (and still have the wraps attribute, with my patch) then proceed. In the example below, wrapchar x = 127;
int y = x*x*x*x*x; // result: -1321368961 The options are:
More on this, I don't see how changing implementation from attribute-based to type-based resolves any of these design decisions. Neither approach is necessarily easier to implement or to understand for developers, if the right decisions are made in terms of behavior then the attribute approach can be just as useful and predictable. |
Attributes mostly do modify the type. The few that don't, like "aligned" and "may_alias", are a constant source of problems because they get accidentally stripped off. (I don't have any personal experience with "noderef".) |
So do you believe the current approach is a no-go? I am happy to do whatever it takes to get this feature over the line but hear me out: Any way of implementing this feature is subject to corner cases and can be broken -- like a lot of stuff in C. For its specific use case, this attribute provides immense power and clarity to existing code bases; the Linux Kernel would benefit massively as we could then enable multiple arithmetic sanitizers. With my custom wraps-enabled compiler and a syzkaller instance I've already located dozens of potential bugs that could be fixed with this attribute! (and ~hundreds of others with the The tests made by @kees (mostly kernel-tailored) and the tests I've checked in with this PR all pass without regression to existing integer sanitizer uses. I'd love to hear more folk's opinions, too. With more feedback, we can make this feature and its documentation/testing as solid as possible. |
In an effort to appease the build bot, I am adding a edit: the warning is float __attribute__((wraps)) a; |
My thinking about this attribute tends to follow from my desire not to change the C type system, but rather to adjust the behavior of the sanitizers. This means that it is possible to still build the Linux kernel without the sanitizers (the build just ignores the attribute), or with (where the attribute now has meaning). I feel like adding new types is a much larger/different thing, as it ends up requiring that the core language has to do something specific about overflow, etc. Under the sanitizers, there is a well-defined behavioral modification for overflow. Having the attribute augment the sanitizer means the semantics of regular C remain unchanged. We could even go so far as making it an error to encounter the attribute when the sanitizers aren't enabled. |
Ping! Any chance we could land this soon? I know @efriedma-quic has some different ideas for the implementation of this feature. However, adding a new type is most likely overkill for what is needed out of @bwendling as you recently implemented |
The code looks fine to me. But you should wait for @efriedma-quic to chime in as he had concerns over the feature. |
238e324
to
1df2f52
Compare
FYI: I've rebased (+ squashed) and handled merge conflicts (mainly in ReleaseNotes.rst). This, of course, required a force push. |
Hi, gentle ping. @efriedma-quic, do you still have concerns with the current implementation (as an attribute attached to typedefs)? After limiting typedef int __attribute__((wraps)) wrap_int;
extern void foo(int b);
int main(void) {
wrap_int a;
foo(a); // wraps.c:12:7: warning: 'wraps' attribute may be implicitly discarded \
// when converted to 'int' [-Wimplicitly-discarded-wraps-attribute]
} This feature is obviously niche and requires opting in with actual source modifications, this is not something people will use accidentally. However, Eli, I hear your concerns about some of the non-obvious ways attributes can get lost. I've tried to remedy this with some warnings (as seen above) but, of course, the semantics of attribute persistence are what they are. Eli, if you think this is not good enough then I will try and redesign the feature as an extended type. Although, I reckon the C type system has just as many quirks and foot-guns as the Clang attribute system. |
490fe5c
to
6152bd2
Compare
af16c49
to
58df542
Compare
A good example of a Linux kernel contributor running into an overflow that should really have its type marked as wrapping : https://lore.kernel.org/all/[email protected]/ They could use this feature to mark the suspected field struct percpu_ref_data {
atomic_long_t count __attribute__((wraps));
...
};
// or mark the type `atomic_long_t` as wrapping
typedef atomic_t __attribute__((wraps)) atomic_long_t; Reviewers, can I get a sense of where we are at with this feature? It lets developers better express the behavior of arithmetic in their code. |
58df542
to
97de899
Compare
Regarding there not being a
Does it affect converting a number too large to fit in the target type?
|
I’m a little concerned about not allowing the attribute in C++ - the existence of other options in C++ does not mean they are an option (due to various and sundry restrictions of C++ version upgrades different projects have), but also you trivially end up in cases where header code is correct/safe in C, but undefined when included in C++. To me that seems like a significant footgun. |
Good question, the attribute is cast away (intentionally so). Additionally, sanitizers will warn of the overflow since the addition no longer contains a wraps-annotated type/variable. The result of the addition is the same regardless:
No, this attribute really doesn't modify any arithmetic. It just tells the sanitizer not to instrument (specifically the wrapping_int a = INT_MAX;
wrapping_short b = a;
short c = a;
printf("%d\n%hd\n%hd\n", a, b, c);
/*
* 2147483647
* -1
* -1
*/ |
Yes, great point. The attribute should support C++ fairly well right now but the frontend doesn't allow it to be enabled because I am not certain how the magics of the c++ type/attribute system will interact with it. I figured a good starting point is to get the attribute working with C. other comment in thread about avoiding issues by disabling c++ support: #86618 (comment) |
So looking at what it'd take for C++ support, I don't think it's worth it right now. We already don't parse it in the front-end under C++ so there's no risk of some behavior changing -- it cannot be defined at all right now there. Additionally, since this was introduced because of C's lack of operator overloading, it feels redundant to add this to C++. And finally, there is no C++ user that actually wants this attribute. So let's leave it as-is with only C support: that's where it's wanted, and that's where it will be exclusively used. If someone wants to add it in the future, it can happen then. |
e73df4d
to
06250aa
Compare
Signed-off-by: Justin Stitt <[email protected]>
06250aa
to
6fb8f68
Compare
…itizers w/ SSCLs (#107332) [Related RFC](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683/5?u=justinstitt) ### Summary Implement type-based filtering via [Sanitizer Special Case Lists](https://clang.llvm.org/docs/SanitizerSpecialCaseList.html) for the arithmetic overflow and truncation sanitizers. Currently, using the `type:` prefix with these sanitizers does nothing. I've hooked up the SSCL parsing with Clang codegen so that we don't emit the overflow/truncation checks if the arithmetic contains an ignored type. ### Usefulness You can craft ignorelists that ignore specific types that are expected to overflow or wrap-around. For example, to ignore `my_type` from `unsigned-integer-overflow` instrumentation: ```bash $ cat ignorelist.txt [unsigned-integer-overflow] type:my_type=no_sanitize $ cat foo.c typedef unsigned long my_type; void foo() { my_type a = ULONG_MAX; ++a; } $ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out // --> no sanitizer error ``` If a type is functionally intended to overflow, like [refcount_t](https://kernsec.org/wiki/index.php/Kernel_Protections/refcount_t) and its associated APIs in the Linux kernel, then this type filtering would prove useful for reducing sanitizer noise. Currently, the Linux kernel dealt with this by [littering](https://elixir.bootlin.com/linux/v6.10.8/source/include/linux/refcount.h#L139 ) `__attribute__((no_sanitize("signed-integer-overflow")))` annotations on all the `refcount_t` APIs. I think this serves as an example of how a codebase could be made cleaner. We could make custom types that are filtered out in an ignorelist, allowing for types to be more expressive -- without the need for annotations. This accomplishes a similar goal to #86618. Yet another use case for this type filtering is whitelisting. We could ignore _all_ types, save a few. ```bash $ cat ignorelist.txt [implicit-signed-integer-truncation] type:*=no_sanitize # ignore literally all types type:short=sanitize # except `short` $ cat bar.c // compile with -fsanitize=implicit-signed-integer-truncation void bar(int toobig) { char a = toobig; // not instrumented short b = toobig; // instrumented } ``` ### Other ways to accomplish the goal of sanitizer allowlisting/whitelisting * ignore list SSCL type support (this PR that you're reading) * [my sanitize-allowlist branch](main...JustinStitt:llvm-project:sanitize-allowlist) - this just implements a sibling flag `-fsanitize-allowlist=`, removing some of the double negative logic present with `skip`/`ignore` when trying to whitelist something. * [Glob Negation](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683) - Implement a negation operator to the GlobPattern class so the ignorelist query can use them to simulate allowlisting Please let me know which of the three options we like best. They are not necessarily mutually exclusive. Here's [another related PR](#86618) which implements a `wraps` attribute. This can accomplish a similar goal to this PR but requires in-source changes to codebases and also covers a wider variety of integer definedness problems. ### CCs @kees @vitalybuka @bwendling --------- Signed-off-by: Justin Stitt <[email protected]>
superseded by #115094 |
…itizers w/ SSCLs (llvm#107332) [Related RFC](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683/5?u=justinstitt) ### Summary Implement type-based filtering via [Sanitizer Special Case Lists](https://clang.llvm.org/docs/SanitizerSpecialCaseList.html) for the arithmetic overflow and truncation sanitizers. Currently, using the `type:` prefix with these sanitizers does nothing. I've hooked up the SSCL parsing with Clang codegen so that we don't emit the overflow/truncation checks if the arithmetic contains an ignored type. ### Usefulness You can craft ignorelists that ignore specific types that are expected to overflow or wrap-around. For example, to ignore `my_type` from `unsigned-integer-overflow` instrumentation: ```bash $ cat ignorelist.txt [unsigned-integer-overflow] type:my_type=no_sanitize $ cat foo.c typedef unsigned long my_type; void foo() { my_type a = ULONG_MAX; ++a; } $ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out // --> no sanitizer error ``` If a type is functionally intended to overflow, like [refcount_t](https://kernsec.org/wiki/index.php/Kernel_Protections/refcount_t) and its associated APIs in the Linux kernel, then this type filtering would prove useful for reducing sanitizer noise. Currently, the Linux kernel dealt with this by [littering](https://elixir.bootlin.com/linux/v6.10.8/source/include/linux/refcount.h#L139 ) `__attribute__((no_sanitize("signed-integer-overflow")))` annotations on all the `refcount_t` APIs. I think this serves as an example of how a codebase could be made cleaner. We could make custom types that are filtered out in an ignorelist, allowing for types to be more expressive -- without the need for annotations. This accomplishes a similar goal to llvm#86618. Yet another use case for this type filtering is whitelisting. We could ignore _all_ types, save a few. ```bash $ cat ignorelist.txt [implicit-signed-integer-truncation] type:*=no_sanitize # ignore literally all types type:short=sanitize # except `short` $ cat bar.c // compile with -fsanitize=implicit-signed-integer-truncation void bar(int toobig) { char a = toobig; // not instrumented short b = toobig; // instrumented } ``` ### Other ways to accomplish the goal of sanitizer allowlisting/whitelisting * ignore list SSCL type support (this PR that you're reading) * [my sanitize-allowlist branch](llvm/llvm-project@main...JustinStitt:llvm-project:sanitize-allowlist) - this just implements a sibling flag `-fsanitize-allowlist=`, removing some of the double negative logic present with `skip`/`ignore` when trying to whitelist something. * [Glob Negation](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683) - Implement a negation operator to the GlobPattern class so the ignorelist query can use them to simulate allowlisting Please let me know which of the three options we like best. They are not necessarily mutually exclusive. Here's [another related PR](llvm#86618) which implements a `wraps` attribute. This can accomplish a similar goal to this PR but requires in-source changes to codebases and also covers a wider variety of integer definedness problems. ### CCs @kees @vitalybuka @bwendling --------- Signed-off-by: Justin Stitt <[email protected]>
Intro
This attribute would allow for more granular control over what expressions can emit integer overflow warnings or integer overflow sanitizer errors.
Here are some examples:
copy-pasted from my RFC.
When compiling with
-fsanitize=signed-integer-overflow,unsigned-integer-overflow
:The next few examples will show some cases where sanitizers will still trip (by design):
Reasoning
The root cause of many security vulnerabilities in the Linux kernel is arithmetic overflow.
The currently available choices to control arithmetic overflow behavior are very limited. The instrumentation is "all or nothing" which lacks the granularity required for many cases.
Enabling or disabling a sanitizer like
-fsanitize=unsigned-integer-overflow
completely opts you in or out of instrumentation for all unsigned arithmetic. For functions where intentional wrap-around is performed you could use__attribute__((no_sanitize("unsigned-integer-overflow")))
to completely disable sanitizer instrumentation. This, however, is not ideal if your function is large and multi-purposed -- perhaps even containing loads of arithmetic that you don't want to disable instrumentation for. Getting more granular, we have compiler built-ins like__builtin_add_overflow()
or similar. However, in some codebases, refactoring existing arithmetic to utilize these overflow builtins may only serve to complicate code. For example, the Linux Kernel has, like lots of other projects, some agreed upon idioms which are understood by its developers and changing these patterns is frowned upon:"if there's some unsigned wrap-around checker that doesn't
understand this traditional way of doing overflow checking, that piece
of crap needs fixing." - Linus
This was in response to a patch that was trying to change the commonly accepted idiom:
base + offset < base
to utilize a builtin (via a macro) to silence sanitizer errors.Recently, there's been some effort by Kees, myself and others to reintroduce the signed and unsigned integer overflow sanitizers in the Linux Kernel. Upon turning these sanitizers back on (or for the case of signed-integer-overflow, making it work at all) we encountered plently of existing instances of integer overflow in the Kernel. However, there's some pushback when trying to alter the "traditional" way of doing things.
With this new wrapping attribute we can specifically craft types that disable overflow instrumentation, without modifying traditional and widely understood code patterns -- resulting in easier to read code.
To summarize the behavior:
Using
__attribute__((wraps))
on a typedef or variable declaration makes it a "wrapping" type or variable thereby disabling overflow instrumentation for either 1) arithmetic performed directly on wrapping variables or types or 2) arithmetic performed on the result of calculations containing a wrapping variable or type. Instrumentation is not disabled for calculations containing subexpressions wherein no wrapping variables are present.... and also, implicit-(un)signed-integer-truncation sanitizers and warnings are turned off for wrapping types/declarations (since 6794e45)
Other Notes
- not since 0f7e8c9[[wraps]]
and[[clang::wraps]]
are supported for C++11The wraps attribute cannot be applied to functions
The wraps attribute can be applied to member variables
Constant expressions containing a wrapping type or variable should not result in
-Winteger-overflow
warnings either.CCs
@kees @nickdesaulniers @bwendling @erichkeane @haoNoQ @apple-fcloutier @FlashSheridan