Skip to content

[Clang] Add wraps attribute (for granular integer overflow handling) #86618

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

JustinStitt
Copy link
Contributor

@JustinStitt JustinStitt commented Mar 26, 2024

Intro

This attribute would allow for more granular control over what expressions can emit integer overflow warnings or integer overflow sanitizer errors.

Here are some examples:

copy-pasted from my RFC.

When compiling with -fsanitize=signed-integer-overflow,unsigned-integer-overflow:

typedef int __attribute__((wraps)) wrapping_int;

wrapping_int A = INT_MAX;
++A; /* This will not trigger a runtime sanitizer error */
unsigned long addr __attribute__((wraps)); /* wraps can be added to variables */
unsigned long size; /* both operands don't need wraps, just one */
...
if (addr + size < addr) {
  // handle overflow gracefully, because the overflow sanitizer won't trip
}
...
typedef int __attribute__((wraps)) wrapping_int;

int a = (wrapping_int) INT_MAX + 1; // no sanitizer trip (or -Winteger-overflow)

The next few examples will show some cases where sanitizers will still trip (by design):

wrapping_int A = INT_MAX;
int B = 1;

A = INT_MAX + 1; // overflow sanitizer trips! (INT_MAX + 1) is computed and trips the sanitizer before assignment to A

B = (INT_MAX+B) * A; // overflow sanitizer trips!

B =  INT_MAX + B * A; // overflow sanitizer does NOT trip!
// This is because B * A is computed first, the result type carries the wraps attribute
// with it from A, then that result is used in summation with INT_MAX.

Reasoning

The root cause of many security vulnerabilities in the Linux kernel is arithmetic overflow.

The currently available choices to control arithmetic overflow behavior are very limited. The instrumentation is "all or nothing" which lacks the granularity required for many cases.

Enabling or disabling a sanitizer like -fsanitize=unsigned-integer-overflow completely opts you in or out of instrumentation for all unsigned arithmetic. For functions where intentional wrap-around is performed you could use __attribute__((no_sanitize("unsigned-integer-overflow"))) to completely disable sanitizer instrumentation. This, however, is not ideal if your function is large and multi-purposed -- perhaps even containing loads of arithmetic that you don't want to disable instrumentation for. Getting more granular, we have compiler built-ins like __builtin_add_overflow() or similar. However, in some codebases, refactoring existing arithmetic to utilize these overflow builtins may only serve to complicate code. For example, the Linux Kernel has, like lots of other projects, some agreed upon idioms which are understood by its developers and changing these patterns is frowned upon:

"if there's some unsigned wrap-around checker that doesn't
understand this traditional way of doing overflow checking, that piece
of crap needs fixing." - Linus

This was in response to a patch that was trying to change the commonly accepted idiom: base + offset < base to utilize a builtin (via a macro) to silence sanitizer errors.

Recently, there's been some effort by Kees, myself and others to reintroduce the signed and unsigned integer overflow sanitizers in the Linux Kernel. Upon turning these sanitizers back on (or for the case of signed-integer-overflow, making it work at all) we encountered plently of existing instances of integer overflow in the Kernel. However, there's some pushback when trying to alter the "traditional" way of doing things.

With this new wrapping attribute we can specifically craft types that disable overflow instrumentation, without modifying traditional and widely understood code patterns -- resulting in easier to read code.

To summarize the behavior:

Using __attribute__((wraps)) on a typedef or variable declaration makes it a "wrapping" type or variable thereby disabling overflow instrumentation for either 1) arithmetic performed directly on wrapping variables or types or 2) arithmetic performed on the result of calculations containing a wrapping variable or type. Instrumentation is not disabled for calculations containing subexpressions wherein no wrapping variables are present.

... and also, implicit-(un)signed-integer-truncation sanitizers and warnings are turned off for wrapping types/declarations (since 6794e45)

typedef int __attribute__((wraps)) wrapping_int;

int main() {
  wrapping_int A = 2147483648; // no truncation warning!
}

Other Notes

  • [[wraps]] and [[clang::wraps]] are supported for C++11 - not since 0f7e8c9

  • The wraps attribute cannot be applied to functions

  • The wraps attribute can be applied to member variables

  • Constant expressions containing a wrapping type or variable should not result in -Winteger-overflow warnings either.

CCs

@kees @nickdesaulniers @bwendling @erichkeane @haoNoQ @apple-fcloutier @FlashSheridan

@JustinStitt JustinStitt requested a review from Endilll as a code owner March 26, 2024 02:17
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Mar 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 26, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Justin Stitt (JustinStitt)

Changes

Intro

This attribute would allow for more granular control over what expressions can emit integer overflow warnings or integer overflow sanitizer errors.

Here are some examples:

copy-pasted from my RFC.

When compiling with -fsanitize=signed-integer-overflow,unsigned-integer-overflow:

typedef int __attribute__((wraps)) wrapping_int;

wrapping_int A = INT_MAX;
++A; /* This will not trigger a runtime sanitizer error */
unsigned long addr __attribute__((wraps)); /* wraps can be added to variables */
unsigned long size; /* both operands don't need wraps, just one */
...
if (addr + size &lt; addr) {
  // handle overflow gracefully, because the overflow sanitizer won't trip
}
...
typedef int __attribute__((wraps)) wrapping_int;

int a = (wrapping_int) INT_MAX + 1; // no sanitizer trip (or -Winteger-overflow)

The next few examples will show some cases where sanitizers will still trip (by design):

wrapping_int A = INT_MAX;
int B = 1;

A = INT_MAX + 1; // overflow sanitizer trips! (INT_MAX + 1) is computed and trips the sanitizer before assignment to A

B = (INT_MAX+B) * A; // overflow sanitizer trips!

B =  INT_MAX + B * A; // overflow sanitizer does NOT trip!
// This is because B * A is computed first, the result type carries the wraps attribute
// with it from A, then that result is used in summation with INT_MAX.

Reasoning

The root cause of many security vulnerabilities in the Linux kernel is arithmetic overflow.

The currently available choices to control arithmetic overflow behavior are very limited. The instrumentation is "all or nothing" which lacks the granularity required for many cases.

Enabling or disabling a sanitizer like -fsanitize=unsigned-integer-overflow completely opts you in or out of instrumentation for all unsigned arithmetic. For functions where intentional wrap-around is performed you could use __attribute__((no_sanitize("unsigned-integer-overflow"))) to completely disable sanitizer instrumentation. This, however, is not ideal if your function is large and multi-purposed -- perhaps even containing loads of arithmetic that you don't want to disable instrumentation for. Getting more granular, we have compiler built-ins like __builtin_add_overflow() or similar. However, in some codebases, refactoring existing arithmetic to utilize these overflow builtins may only serve to complicate code. For example, the Linux Kernel has, like lots of other projects, some agreed upon idioms which are understood by its developers and changing these patterns is [frowned upon](https://lore.kernel.org/all/CAHk-=whS7FSbBoo1gxe+83twO2JeGNsUKMhAcfWymw9auqBvjg@mail.gmail.com):

"if there's some unsigned wrap-around checker that doesn't
understand this traditional way of doing overflow checking, that piece
of crap needs fixing." - Linus

This was in response to a patch that was trying to change the commonly accepted idiom: base + offset &lt; base to utilize a builtin (via a macro) to silence sanitizer errors.

Recently, there's been some effort by Kees, myself and others to reintroduce the signed and unsigned integer overflow sanitizers in the Linux Kernel. Upon turning these sanitizers back on (or for the case of signed-integer-overflow, making it work at all) we encountered plently of existing instances of integer overflow in the Kernel. However, there's some pushback when trying to alter the "traditional" way of doing things.

With this new wrapping attribute we can specifically craft types that disable overflow instrumentation, without modifying traditional and widely understood code patterns -- resulting in easier to read code.

To summarize the behavior:

Using __attribute__((wraps)) on a typedef or variable declaration makes it a "wrapping" type or variable thereby disabling overflow instrumentation for either 1) arithmetic performed directly on wrapping variables or types or 2) arithmetic performed on the result of calculations containing a wrapping variable or type. Instrumentation is not disabled for calculations containing subexpressions wherein no wrapping variables are present.

Other Notes

  • [[wraps]] and [[clang::wraps]] are supported for C++11

  • The wraps attribute cannot be applied to functions

  • The wraps attribute can be applied to member variables

  • Constant expressions containing a wrapping type or variable should not result in -Winteger-overflow warnings either.

CCs

@kees @nickdesaulniers @bwendling @erichkeane @haoNoQ @apple-fcloutier @FlashSheridan


Patch is 25.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/86618.diff

15 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+7)
  • (modified) clang/include/clang/AST/Expr.h (+3)
  • (modified) clang/include/clang/Basic/Attr.td (+6)
  • (modified) clang/include/clang/Basic/AttrDocs.td (+66)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+3)
  • (modified) clang/include/clang/Sema/Sema.h (+4)
  • (modified) clang/lib/AST/Expr.cpp (+19)
  • (modified) clang/lib/AST/ExprConstant.cpp (+4-2)
  • (modified) clang/lib/AST/TypePrinter.cpp (+3)
  • (modified) clang/lib/CodeGen/CGExprScalar.cpp (+35-5)
  • (modified) clang/lib/Sema/SemaDeclAttr.cpp (+11-1)
  • (modified) clang/lib/Sema/SemaType.cpp (+15)
  • (modified) clang/test/CodeGen/integer-overflow.c (+56)
  • (modified) clang/test/CodeGen/unsigned-overflow.c (+55-8)
  • (added) clang/test/Sema/attr-wraps.c (+9)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 7fbe2fec6ca065..20bb9815830592 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -248,6 +248,13 @@ Attribute Changes in Clang
   added a new extension query ``__has_extension(swiftcc)`` corresponding to the
   ``__attribute__((swiftcc))`` attribute.
 
+- Introduced ``__attribute((wraps))`` or ``[[wraps]]`` which can be added to
+  type or variable declarations. Using an attributed type or variable in an
+  arithmetic expression will define the overflow behavior for that expression
+  as having two's complement wrap-around. These expressions cannot trigger
+  integer overflow warnings or sanitizer warnings. They also cannot be
+  optimized away by some eager UB optimizations.
+
 Improvements to Clang's diagnostics
 -----------------------------------
 - Clang now applies syntax highlighting to the code snippets it
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index 6e153ebe024b42..934146e8a182bc 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -4084,6 +4084,9 @@ class BinaryOperator : public Expr {
   static unsigned sizeOfTrailingObjects(bool HasFPFeatures) {
     return HasFPFeatures * sizeof(FPOptionsOverride);
   }
+
+  /// Do one of the subexpressions have the wraps attribute?
+  bool oneOfWraps(const ASTContext &Ctx) const;
 };
 
 /// CompoundAssignOperator - For compound assignments (e.g. +=), we keep
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 3e03e55612645b..0ea7755791d82e 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -4496,3 +4496,9 @@ def CodeAlign: StmtAttr {
     static constexpr int MaximumAlignment = 4096;
   }];
 }
+
+def Wraps : DeclOrTypeAttr {
+  let Spellings = [Clang<"wraps">, CXX11<"", "wraps", 202403>];
+  let Subjects = SubjectList<[Var, TypedefName, Field]>;
+  let Documentation = [WrapsDocs];
+}
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 9de14f608fd114..af662702edcffa 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -8005,3 +8005,69 @@ requirement:
   }
   }];
 }
+
+def WrapsDocs : Documentation {
+  let Category = DocCatField;
+  let Content = [{
+This attribute can be used with type or variable declarations to denote that
+arithmetic containing these marked components have defined overflow behavior.
+Specifically, the behavior is defined as being consistent with two's complement
+wrap-around. For the purposes of sanitizers or warnings that concern themselves
+with the definedness of integer arithmetic, they will cease to instrument or
+warn about arithmetic that directly involves a "wrapping" component.
+
+For example, ``-fsanitize=signed-integer-overflow`` or ``-Winteger-overflow``
+will not warn about suspicious overflowing arithmetic -- assuming correct usage
+of the wraps attribute.
+
+This example shows some basic usage of ``__attribute__((wraps))`` on a type
+definition when building with ``-fsanitize=signed-integer-overflow``
+
+.. code-block:: c
+  typedef int __attribute__((wraps)) wrapping_int;
+
+  void foo() {
+    wrapping_int a = INT_MAX;
+    ++a; // no sanitizer warning
+  }
+
+  int main() { foo(); }
+
+In the following example, we use ``__attribute__((wraps))`` on a variable to
+disable overflow instrumentation for arithmetic expressions it appears in. We
+do so with a popular overflow-checking pattern which we might not want to trip
+sanitizers (like ``-fsanitize=unsigned-integer-overflow``).
+
+.. code-block:: c
+  void foo(int offset) {
+    unsigned int A __attribute__((wraps)) = UINT_MAX;
+ 
+    // to check for overflow using this pattern, we may perform a real overflow
+    // thus triggering sanitizers to step in. Since A is "wrapping", we can be
+    // sure there are no sanitizer warnings.
+    if (A + offset < A) {
+      // handle overflow manually
+      // ...
+      return;
+    }
+
+    // now, handle non-overflow case
+    // ...
+  }
+
+The above example demonstrates some of the power and elegance this attribute
+provides. We can use code patterns we are already familiar with (like ``if (x +
+y < x)``) while gaining control over the overflow behavior on a case-by-case
+basis.
+
+When combined with ``-fwrapv``, this attribute can still be applied as normal
+but has no function apart from annotating types and variables for readers. This
+is because ``-fwrapv`` defines all arithmetic as being "wrapping", rending this
+attribute's efforts redundant.
+
+When using this attribute without ``-fwrapv`` and without any sanitizers, it
+still has an impact on the definedness of arithmetic expressions containing
+wrapping components. Since the behavior of said expressions is now technically
+defined, the compiler will forgo some eager optimizations that are used on
+expressions containing UB.}];
+}
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index fc727cef9cd835..c6e0fec9856cd4 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6534,6 +6534,9 @@ def err_counted_by_attr_refer_to_union : Error<
 def note_flexible_array_counted_by_attr_field : Note<
   "field %0 declared here">;
 
+def warn_wraps_attr_var_decl_type_not_integer : Warning<
+  "using attribute 'wraps' with non-integer type '%0' has no function">;
+
 let CategoryName = "ARC Semantic Issue" in {
 
 // ARC-mode diagnostics.
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 5ecd2f9eb2881f..1989bbfe7045f0 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -3809,6 +3809,10 @@ class Sema final {
   void AddAnnotationAttr(Decl *D, const AttributeCommonInfo &CI,
                          StringRef Annot, MutableArrayRef<Expr *> Args);
 
+  /// AddWrapsAttr - Adds the "wraps" attribute to a particular
+  /// declaration.
+  void AddWrapsAttr(Decl *D, const AttributeCommonInfo &CI);
+
   bool checkMSInheritanceAttrOnDefinition(CXXRecordDecl *RD, SourceRange Range,
                                           bool BestCase,
                                           MSInheritanceModel SemanticSpelling);
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index 6221ebd5c9b4e9..a9c5f02ddd4093 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -2243,6 +2243,21 @@ bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext &Ctx,
   return true;
 }
 
+bool BinaryOperator::oneOfWraps(const ASTContext &Ctx) const {
+  llvm::SmallVector<Expr *, 2> Both = {getLHS(), getRHS()};
+
+  for (const Expr *oneOf : Both) {
+    if (!oneOf)
+      continue;
+    if (auto *TypePtr =
+            oneOf->IgnoreParenImpCasts()->getType().getTypePtrOrNull())
+      if (TypePtr->hasAttr(attr::Wraps)) {
+        return true;
+      }
+  }
+  return false;
+}
+
 SourceLocExpr::SourceLocExpr(const ASTContext &Ctx, SourceLocIdentKind Kind,
                              QualType ResultTy, SourceLocation BLoc,
                              SourceLocation RParenLoc,
@@ -4757,6 +4772,8 @@ BinaryOperator::BinaryOperator(const ASTContext &Ctx, Expr *lhs, Expr *rhs,
   if (hasStoredFPFeatures())
     setStoredFPFeatures(FPFeatures);
   setDependence(computeDependence(this));
+  if (oneOfWraps(Ctx))
+    setType(Ctx.getAttributedType(attr::Wraps, getType(), getType()));
 }
 
 BinaryOperator::BinaryOperator(const ASTContext &Ctx, Expr *lhs, Expr *rhs,
@@ -4774,6 +4791,8 @@ BinaryOperator::BinaryOperator(const ASTContext &Ctx, Expr *lhs, Expr *rhs,
   if (hasStoredFPFeatures())
     setStoredFPFeatures(FPFeatures);
   setDependence(computeDependence(this));
+  if (oneOfWraps(Ctx))
+    setType(Ctx.getAttributedType(attr::Wraps, getType(), getType()));
 }
 
 BinaryOperator *BinaryOperator::CreateEmpty(const ASTContext &C,
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 592d43597dc1b4..29b7836c940a00 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -2775,7 +2775,8 @@ static bool CheckedIntArithmetic(EvalInfo &Info, const Expr *E,
   APSInt Value(Op(LHS.extend(BitWidth), RHS.extend(BitWidth)), false);
   Result = Value.trunc(LHS.getBitWidth());
   if (Result.extend(BitWidth) != Value) {
-    if (Info.checkingForUndefinedBehavior())
+    if (Info.checkingForUndefinedBehavior() &&
+        !E->getType().getTypePtr()->hasAttr(attr::Wraps))
       Info.Ctx.getDiagnostics().Report(E->getExprLoc(),
                                        diag::warn_integer_constant_overflow)
           << toString(Result, 10, Result.isSigned(), /*formatAsCLiteral=*/false,
@@ -13964,7 +13965,8 @@ bool IntExprEvaluator::VisitUnaryOperator(const UnaryOperator *E) {
     if (!Result.isInt()) return Error(E);
     const APSInt &Value = Result.getInt();
     if (Value.isSigned() && Value.isMinSignedValue() && E->canOverflow()) {
-      if (Info.checkingForUndefinedBehavior())
+      if (Info.checkingForUndefinedBehavior() &&
+          !E->getType().getTypePtr()->hasAttr(attr::Wraps))
         Info.Ctx.getDiagnostics().Report(E->getExprLoc(),
                                          diag::warn_integer_constant_overflow)
             << toString(Value, 10, Value.isSigned(), /*formatAsCLiteral=*/false,
diff --git a/clang/lib/AST/TypePrinter.cpp b/clang/lib/AST/TypePrinter.cpp
index 7032ff2f18468c..52042666cb82cc 100644
--- a/clang/lib/AST/TypePrinter.cpp
+++ b/clang/lib/AST/TypePrinter.cpp
@@ -1947,6 +1947,9 @@ void TypePrinter::printAttributedAfter(const AttributedType *T,
   case attr::AArch64SVEPcs: OS << "aarch64_sve_pcs"; break;
   case attr::AMDGPUKernelCall: OS << "amdgpu_kernel"; break;
   case attr::IntelOclBicc: OS << "inteloclbicc"; break;
+  case attr::Wraps:
+    OS << "wraps";
+    break;
   case attr::PreserveMost:
     OS << "preserve_most";
     break;
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 8536570087ad0f..7c6fe78454709f 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -146,6 +146,15 @@ struct BinOpInfo {
       return UnOp->getSubExpr()->getType()->isFixedPointType();
     return false;
   }
+
+  /// Does the BinaryOperator have the wraps attribute?
+  /// If so, we can ellide overflow sanitizer checks.
+  bool oneOfWraps() const {
+    const Type *TyPtr = E->getType().getTypePtrOrNull();
+    if (TyPtr)
+      return TyPtr->hasAttr(attr::Wraps);
+    return false;
+  }
 };
 
 static bool MustVisitNullValue(const Expr *E) {
@@ -724,6 +733,11 @@ class ScalarExprEmitter
 
   // Binary Operators.
   Value *EmitMul(const BinOpInfo &Ops) {
+    if ((Ops.Ty->isSignedIntegerOrEnumerationType() ||
+         Ops.Ty->isUnsignedIntegerType()) &&
+        Ops.oneOfWraps())
+      return Builder.CreateMul(Ops.LHS, Ops.RHS, "mul");
+
     if (Ops.Ty->isSignedIntegerOrEnumerationType()) {
       switch (CGF.getLangOpts().getSignedOverflowBehavior()) {
       case LangOptions::SOB_Defined:
@@ -2685,6 +2699,9 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
   } else if (type->isIntegerType()) {
     QualType promotedType;
     bool canPerformLossyDemotionCheck = false;
+    BinOpInfo Ops = (createBinOpInfoFromIncDec(
+        E, value, isInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
+
     if (CGF.getContext().isPromotableIntegerType(type)) {
       promotedType = CGF.getContext().getPromotedIntegerType(type);
       assert(promotedType != type && "Shouldn't promote to the same type.");
@@ -2727,10 +2744,12 @@ ScalarExprEmitter::EmitScalarPrePostIncDec(const UnaryOperator *E, LValue LV,
       // Note that signed integer inc/dec with width less than int can't
       // overflow because of promotion rules; we're just eliding a few steps
       // here.
-    } else if (E->canOverflow() && type->isSignedIntegerOrEnumerationType()) {
+    } else if (E->canOverflow() && type->isSignedIntegerOrEnumerationType() &&
+               !Ops.oneOfWraps()) {
       value = EmitIncDecConsiderOverflowBehavior(E, value, isInc);
     } else if (E->canOverflow() && type->isUnsignedIntegerType() &&
-               CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow)) {
+               CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow) &&
+               !Ops.oneOfWraps()) {
       value = EmitOverflowCheckedBinOp(createBinOpInfoFromIncDec(
           E, value, isInc, E->getFPFeaturesInEffect(CGF.getLangOpts())));
     } else {
@@ -3504,7 +3523,8 @@ Value *ScalarExprEmitter::EmitDiv(const BinOpInfo &Ops) {
     if ((CGF.SanOpts.has(SanitizerKind::IntegerDivideByZero) ||
          CGF.SanOpts.has(SanitizerKind::SignedIntegerOverflow)) &&
         Ops.Ty->isIntegerType() &&
-        (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow())) {
+        (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow()) &&
+        !Ops.oneOfWraps()) {
       llvm::Value *Zero = llvm::Constant::getNullValue(ConvertType(Ops.Ty));
       EmitUndefinedBehaviorIntegerDivAndRemCheck(Ops, Zero, true);
     } else if (CGF.SanOpts.has(SanitizerKind::FloatDivideByZero) &&
@@ -3553,7 +3573,8 @@ Value *ScalarExprEmitter::EmitRem(const BinOpInfo &Ops) {
   if ((CGF.SanOpts.has(SanitizerKind::IntegerDivideByZero) ||
        CGF.SanOpts.has(SanitizerKind::SignedIntegerOverflow)) &&
       Ops.Ty->isIntegerType() &&
-      (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow())) {
+      (Ops.mayHaveIntegerDivisionByZero() || Ops.mayHaveIntegerOverflow()) &&
+      !Ops.oneOfWraps()) {
     CodeGenFunction::SanitizerScope SanScope(&CGF);
     llvm::Value *Zero = llvm::Constant::getNullValue(ConvertType(Ops.Ty));
     EmitUndefinedBehaviorIntegerDivAndRemCheck(Ops, Zero, false);
@@ -3918,6 +3939,11 @@ Value *ScalarExprEmitter::EmitAdd(const BinOpInfo &op) {
       op.RHS->getType()->isPointerTy())
     return emitPointerArithmetic(CGF, op, CodeGenFunction::NotSubtraction);
 
+  if ((op.Ty->isSignedIntegerOrEnumerationType() ||
+       op.Ty->isUnsignedIntegerType()) &&
+      op.oneOfWraps())
+    return Builder.CreateAdd(op.LHS, op.RHS, "add");
+
   if (op.Ty->isSignedIntegerOrEnumerationType()) {
     switch (CGF.getLangOpts().getSignedOverflowBehavior()) {
     case LangOptions::SOB_Defined:
@@ -4074,6 +4100,10 @@ Value *ScalarExprEmitter::EmitFixedPointBinOp(const BinOpInfo &op) {
 Value *ScalarExprEmitter::EmitSub(const BinOpInfo &op) {
   // The LHS is always a pointer if either side is.
   if (!op.LHS->getType()->isPointerTy()) {
+    if ((op.Ty->isSignedIntegerOrEnumerationType() ||
+         op.Ty->isUnsignedIntegerType()) &&
+        op.oneOfWraps())
+      return Builder.CreateSub(op.LHS, op.RHS, "sub");
     if (op.Ty->isSignedIntegerOrEnumerationType()) {
       switch (CGF.getLangOpts().getSignedOverflowBehavior()) {
       case LangOptions::SOB_Defined:
@@ -4224,7 +4254,7 @@ Value *ScalarExprEmitter::EmitShl(const BinOpInfo &Ops) {
   bool SanitizeSignedBase = CGF.SanOpts.has(SanitizerKind::ShiftBase) &&
                             Ops.Ty->hasSignedIntegerRepresentation() &&
                             !CGF.getLangOpts().isSignedOverflowDefined() &&
-                            !CGF.getLangOpts().CPlusPlus20;
+                            !CGF.getLangOpts().CPlusPlus20 && !Ops.oneOfWraps();
   bool SanitizeUnsignedBase =
       CGF.SanOpts.has(SanitizerKind::UnsignedShiftBase) &&
       Ops.Ty->hasUnsignedIntegerRepresentation();
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 0a62c656d824ff..6bc8d017d2edaf 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -4413,6 +4413,14 @@ void Sema::AddAlignValueAttr(Decl *D, const AttributeCommonInfo &CI, Expr *E) {
   D->addAttr(::new (Context) AlignValueAttr(Context, CI, E));
 }
 
+static void handleWrapsAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
+  S.AddWrapsAttr(D, AL);
+}
+
+void Sema::AddWrapsAttr(Decl *D, const AttributeCommonInfo &CI) {
+  D->addAttr(::new (Context) WrapsAttr(Context, CI));
+}
+
 static void handleAlignedAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   if (AL.hasParsedType()) {
     const ParsedType &TypeArg = AL.getTypeArg();
@@ -9683,10 +9691,12 @@ ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, const ParsedAttr &AL,
   case ParsedAttr::AT_AvailableOnlyInDefaultEvalMethod:
     handleAvailableOnlyInDefaultEvalMethod(S, D, AL);
     break;
-
   case ParsedAttr::AT_CountedBy:
     handleCountedByAttrField(S, D, AL);
     break;
+  case ParsedAttr::AT_Wraps:
+    handleWrapsAttr(S, D, AL);
+    break;
 
   // Microsoft attributes:
   case ParsedAttr::AT_LayoutVersion:
diff --git a/clang/lib/Sema/SemaType.cpp b/clang/lib/Sema/SemaType.cpp
index d7521a5363a3d2..e2dc290cb7fb82 100644
--- a/clang/lib/Sema/SemaType.cpp
+++ b/clang/lib/Sema/SemaType.cpp
@@ -6940,6 +6940,18 @@ static void HandleBTFTypeTagAttribute(QualType &Type, const ParsedAttr &Attr,
       ::new (Ctx) BTFTypeTagAttr(Ctx, Attr, BTFTypeTag), Type);
 }
 
+static void handleWrapsAttr(QualType &Type, const ParsedAttr &Attr,
+                            TypeProcessingState &State) {
+  Sema &S = State.getSema();
+  ASTContext &Ctx = S.Context;
+
+  if (!Type->isIntegerType())
+    S.Diag(Attr.getLoc(), diag::warn_wraps_attr_var_decl_type_not_integer)
+        << Type.getAsString();
+
+  Type = State.getAttributedType(::new (Ctx) WrapsAttr(Ctx, Attr), Type, Type);
+}
+
 /// HandleAddressSpaceTypeAttribute - Process an address_space attribute on the
 /// specified type.  The attribute contains 1 argument, the id of the address
 /// space for the type.
@@ -8929,6 +8941,9 @@ static void processTypeAttrs(TypeProcessingState &state, QualType &type,
       HandleBTFTypeTagAttribute(type, attr, state);
       attr.setUsedAsTypeAttr();
       break;
+    case ParsedAttr::AT_Wraps:
+      handleWrapsAttr(type, attr, state);
+      break;
 
     case ParsedAttr::AT_MayAlias:
       // FIXME: This attribute needs to actually be handled, but if we ignore
diff --git a/clang/test/CodeGen/integer-overflow.c b/clang/test/CodeGen/integer-overflow.c
index 461b026d39615b..44c42ed9efe577 100644
--- a/clang/test/CodeGen/integer-overflow.c
+++ b/clang/test/CodeGen/integer-overflow.c
@@ -105,3 +105,59 @@ void test1(void) {
   // TRAPV:    call ptr @llvm.frameaddress.p0(i32 0)
   // CATCH_UB: call ptr @llvm.frameaddress.p0(i32 0)
 }
+
+// Tests for integer overflow using __attribute__((wraps))
+typedef int __attribute__((wraps)) wrapping_int;
+
+void test2(void) {
+  // DEFAULT-LABEL: define{{.*}} void @test2
+  // WRAPV-LABEL: define{{.*}} void @test2
+  // TRAPV-LABEL: define{{.*}} void @test2
+  extern volatile wrapping_int a, b, c;
+
+  // Basically, all cases should match the WRAPV case since this attribute
+  // effectively enables wrapv for expressions containing wrapping types.
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32
+  a = b + c;
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: sub i32
+  a = b - c;
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: mul i32
+  a = b * c;
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: sub i32 0,
+  a = -b;
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32 {{.*}}, 1
+  ++b;
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32 {{.*}}, -1
+  --b;
+
+  // Less trivial cases
+  extern volatile wrapping_int u, v;
+  extern volatile int w;
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32
+  if (u + v < u) {}
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32
+  for (;u + v < u;) {}
+
+  // this (w+1) should have instrumentation
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: call {{.*}} @llvm.sadd.with.overflow.i32
+  u = (w+1) + v;
+
+  // no parts of this expression should have instrumentation
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: add i32 {{.*}}, 1
+  u = (v+1) + w;
+
+  // downcast off the wraps attribute
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: call { i32, i1 } @llvm.sadd.with.overflow.i32
+  u = (int) u + (int) v;
+
+  // DEFAULT,WRAPV,TRAPV,CATCH_UB,TRAPV_HANDLER: call { i32, i1 } @llvm.sadd.with.overflow.i32
+  u = (int) u + w;
+}
diff --git a/clang/test/CodeGen/unsigned-overflow.c b/clang/test/CodeGen/unsigned-overflow.c
index 6c2f0c1efc145e..471a06e5f...
[truncated]

@JustinStitt
Copy link
Contributor Author

JustinStitt commented Mar 26, 2024

cc: @MaskRay @vitalybuka (since they also reviewed #82432)

@efriedma-quic
Copy link
Collaborator

Adding attributes to types as type sugar (not part of the canonical type) is generally problematic: non-canonical types are not reliably preserved throughout the compiler, particularly in cases involving templates.

Can we use a type qualifier here?

@JustinStitt

This comment was marked as outdated.

@efriedma-quic
Copy link
Collaborator

Because we're dealing specifically with integers, which are a pretty limited class, you could consider introducing new types into the type system instead. Similar to how ext_vector_type works. That's maybe easier than qualifiers in the sense that the code already deals with a bunch of arithmetic types... but you probably still have to touch a lot of places to get everything working.

Attributes on typedefs that don't correspond to a canonical type, like noderef/aligned/may_alias, are a constant source of issues; the user can't really tell if the attribute is actually working, or the compiler silently ate it. My team spent a lot of time dealing with a user having trouble with the aligned attribute... they way they were using it was completely broken, but they didn't know it was broken until a new compiler started optimizing more aggressively.

Copy link
Contributor

@Endilll Endilll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sema.h changes look good to me.

@JustinStitt JustinStitt force-pushed the wraps-attribute-binop branch from 6cac20e to 0f7e8c9 Compare April 9, 2024 22:21
@JustinStitt
Copy link
Contributor Author

Hi, I've made some changes and am looking for some more review on this PR:

  • This attribute no longer supports C++ as there are other solutions for that language 1 that allow for the same fine-grained wrapping control without changing syntax patterns.
  • Add bypass for implicit-(un)signed-integer-truncation sanitizers
  • Rebased onto eec41d2

@kees
Copy link
Contributor

kees commented Apr 10, 2024

This now passes my behavioral testing suite for wrapping; yay! (The earlier version didn't cover truncate, so this is very nice now.)

Copy link

github-actions bot commented Apr 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@JustinStitt
Copy link
Contributor Author

JustinStitt commented Apr 10, 2024

The most recent commits have massively simplified checking for the wrapping attributes.

FWIW, the tests I am running are:
$ llvm-lit clang/test/CodeGen/integer-overflow.c -v
$ llvm-lit clang/test/CodeGen/unsigned-overflow.c -v
$ llvm-lit clang/test/Sema/attr-wraps.c -v
...and some hand-rolled stuff I have.

Are there any other things we need to get this over the hill?

@bwendling
Copy link
Collaborator

Hi, I've made some changes and am looking for some more review on this PR:

  • This attribute no longer supports C++ as there are other solutions for that language 1 that allow for the same fine-grained wrapping control without changing syntax patterns.
  • Add bypass for implicit-(un)signed-integer-truncation sanitizers
  • Rebased onto eec41d2

Does this list address all of @efriedma-quic's concerns about adding an attribute to a type?

@efriedma-quic
Copy link
Collaborator

Forbidding usage in C++ probably avoids the worst of the canonical-type issues, but there's still some potential for weird results. Particularly with type merging; for example, if you write a ? (wrap_int)x : 1, is the result a wrapping type?

@bwendling
Copy link
Collaborator

Forbidding usage in C++ probably avoids the worst of the canonical-type issues, but there's still some potential for weird results. Particularly with type merging; for example, if you write a ? (wrap_int)x : 1, is the result a wrapping type?

I had a similar question in the general case: is the annotation contagious?

int foo(wrap_int x, int a, int b, int c) {
  if (x + a + b < c)
    return 0;
  return 1;
}

The x + a is checked, but is that result checked when added to b?

@JustinStitt
Copy link
Contributor Author

Forbidding usage in C++ probably avoids the worst of the canonical-type issues, but there's still some potential for weird results. Particularly with type merging; for example, if you write a ? (wrap_int)x : 1, is the result a wrapping type?

I had a similar question in the general case: is the annotation contagious?

int foo(wrap_int x, int a, int b, int c) {
  if (x + a + b < c)
    return 0;
  return 1;
}

The x + a is checked, but is that result checked when added to b?

Yes, it's "contagious":

|-FunctionDecl 0x55f64065a5d8 <line:6:1, line:10:1> line:6:5 foo 'int (wrap_int, int, int, int)'
| |-ParmVarDecl 0x55f64065a300 <col:9, col:18> col:18 used x 'wrap_int':'int'
| |-ParmVarDecl 0x55f64065a380 <col:21, col:25> col:25 used a 'int'
| |-ParmVarDecl 0x55f64065a400 <col:28, col:32> col:32 used b 'int'
| |-ParmVarDecl 0x55f64065a480 <col:35, col:39> col:39 used c 'int'
| `-CompoundStmt 0x55f64065a8a8 <col:42, line:10:1>
|   |-IfStmt 0x55f64065a858 <line:7:3, line:8:12>
|   | |-BinaryOperator 0x55f64065a808 <line:7:7, col:19> 'int __attribute__((wraps))':'int' '<'
|   | | |-BinaryOperator 0x55f64065a7b0 <col:7, col:15> 'int __attribute__((wraps))':'int' '+'
|   | | | |-BinaryOperator 0x55f64065a758 <col:7, col:11> 'int __attribute__((wraps))':'int' '+'
|   | | | | |-ImplicitCastExpr 0x55f64065a728 <col:7> 'wrap_int':'int' <LValueToRValue>
|   | | | | | `-DeclRefExpr 0x55f64065a6e8 <col:7> 'wrap_int':'int' lvalue ParmVar 0x55f64065a300 'x' 'wrap_int':'int'
|   | | | | `-ImplicitCastExpr 0x55f64065a740 <col:11> 'int' <LValueToRValue>
|   | | | |   `-DeclRefExpr 0x55f64065a708 <col:11> 'int' lvalue ParmVar 0x55f64065a380 'a' 'int'
|   | | | `-ImplicitCastExpr 0x55f64065a798 <col:15> 'int' <LValueToRValue>
|   | | |   `-DeclRefExpr 0x55f64065a778 <col:15> 'int' lvalue ParmVar 0x55f64065a400 'b' 'int'
|   | | `-ImplicitCastExpr 0x55f64065a7f0 <col:19> 'int' <LValueToRValue>
|   | |   `-DeclRefExpr 0x55f64065a7d0 <col:19> 'int' lvalue ParmVar 0x55f64065a480 'c' 'int'
|   | `-ReturnStmt 0x55f64065a848 <line:8:5, col:12>
|   |   `-IntegerLiteral 0x55f64065a828 <col:12> 'int' 0
|   `-ReturnStmt 0x55f64065a898 <line:9:3, col:10>
|     `-IntegerLiteral 0x55f64065a878 <col:10> 'int' 1

@JustinStitt JustinStitt force-pushed the wraps-attribute-binop branch from 7b59be9 to a898dec Compare April 22, 2024 20:09
@JustinStitt
Copy link
Contributor Author

Ping!

I've rebased this PR onto bcd150d and fixed the issues with the docs not building because of faulty code-blocks.

Bigger picture, for the Linux Kernel, the current plan is to only enable certain integer sanitizers if support for this wraps attribute exists -- so it currently stands as a blocker for further arithmetic safety in the Kernel. (paraphrasing @kees)

With that being said, we're looking for more review on for this PR or a maintainer with push-access to merge :>)

@efriedma-quic
Copy link
Collaborator

I'm still not happy with the AST representation here. The current representation is likely to cause unpredictable results in edge cases, and compatibility constraints mean whatever result the current version happens to produce for those cases is locked in forever. And there isn't really any good way to mitigate those issues under the current representation.

I don't want to block progress here, but I think in its current form, we're likely to end up with reported issues we can't fix.

@JustinStitt
Copy link
Contributor Author

@efriedma-quic said:

I'm still not happy with the AST representation here. The current representation is likely to cause unpredictable results in edge cases, and compatibility constraints mean whatever result the current version happens to produce for those cases is locked in forever. And there isn't really any good way to mitigate those issues under the current representation.

I don't want to block progress here, but I think in its current form, we're likely to end up with reported issues we can't fix.

The current design is heavily opt-in. So much so that this isn't just some compiler flag that you turn on. You have to go out of your way to design your code around this attribute, specifically marking things as wrapping with named types, etc.

So I guess the questions are: Is the current description of this attribute not clear enough? Do we think users will be confused/surprised by its behavior? What edge cases are there (in C)? Can they be fixed right now so we don't have these compatibility issues in the future?

@efriedma-quic
Copy link
Collaborator

If you have a select with both wrapping and non-wrapping operands, is the result wrapping? If you declare a variable as both wrapping and non-wrapping, is it wrapping? If you declare a function parameter both wrapping and non-wrapping, is it wrapping? If you assign to a wrapping variable, is the result wrapping? If you mark a short wrapping, is the type after promotion wrapping? If the operands of an equality operation are wrapping, is the resulting boolean wrapping? If you mark a bitfield wrapping, does it wrap?

The fact that with this patch, a wrapping int is "just" an int is both the strength and weakness: you don't need to write out all these interactions... but the result is just based on clang's internal preferences for preserving type sugar, which are not documented.

@efriedma-quic
Copy link
Collaborator

The C standard is small enough that if you comb through the C grammar carefully, you can probably come up with explicitly rules for all the important constructs, and verify they work. There's probably still a long tail of weird interactions with current and future clang extensions, and the whole thing is still sort of fragile because it's built on undocumented rules for type sugar propagation, but maybe if we're careful enough with testing, we can solve most of the compatibility?

I'd still prefer to have a distinct canonical type.

@JustinStitt
Copy link
Contributor Author

JustinStitt commented Apr 24, 2024

Hi @efriedma-quic, thanks for dumping all these cases! I will respond to each one individually.

I understand you think adding a type would be better so I am exploring that option (I am thinking something similar to extended vectors). But for now, I am trying to make this attribute as useful and compatible as possible.

If you have a select with both wrapping and non-wrapping operands, is the result wrapping?

I assume you're referring to a ternary statement? Currently, no. Although, I should add support for this similar to how BinaryOperators are considered wrapping if any of their operands are.

`-ConditionalOperator 0x559ddad37998 <col:4, col:15> 'int'
  |-ImplicitCastExpr 0x559ddad37950 <col:4> 'int' <IntegralCast>
  | `-ImplicitCastExpr 0x559ddad37938 <col:4> '_Bool' <LValueToRValue>
  |   `-DeclRefExpr 0x559ddad378d8 <col:4> 'volatile _Bool' lvalue Var 0x559ddad7a350 'cond' 'volatile _Bool'
  |-ImplicitCastExpr 0x559ddad37968 <col:11> 'wrap_int':'int' <LValueToRValue>
  | `-DeclRefExpr 0x559ddad378f8 <col:11> 'const wrap_int':'const int' lvalue Var 0x559ddad37780 'a' 'const wrap_int':'const int'
  `-ImplicitCastExpr 0x559ddad37980 <col:15> 'int' <LValueToRValue>
    `-DeclRefExpr 0x559ddad37918 <col:15> 'const int' lvalue Var 0x559ddad37838 'b' 'const int'

If you declare a variable as both wrapping and non-wrapping, is it wrapping?

I am not sure how to do this. I am sure that with the magic of C anything is possible but I can't conceive a way to have a variable both have the attribute and not have the attribute (to be clear, there doesn't exist a __attribute__((no-wraps)) currently)

If you declare a function parameter both wrapping and non-wrapping, is it wrapping?

Same here, not sure what you mean.

If you assign to a wrapping variable, is the result wrapping?

It depends on the type of the assignee, which I discuss later with a const analogy.

wrap_int a;

int b = a; // b is not wrapping

wrap_int c = a; // c is wrapping

If you mark a short wrapping, is the type after promotion wrapping?

This concerns the -fsanitize=implicit-signed-integer-truncation sanitizer which the wraps attribute disables. So the type boundaries pre- and post-promotion are not enforced by this sanitizer for wrapping types.

If the operands of an equality operation are wrapping, is the resulting boolean wrapping?

I don't think you can get a less-than-int from an equality operator as the result. With that being said, and for all BinaryOperators, the resulting type will be wrapping if any of the operands are wrapping.

    `-IfStmt 0x56518e85e710 <line:9:3, line:11:3>
      |-BinaryOperator 0x56518e85e6e0 <line:9:7, col:12> 'int __attribute__((wraps))':'int' '=='
      | |-ImplicitCastExpr 0x56518e85e6b0 <col:7> 'wrap_int':'int' <LValueToRValue>
      | | `-DeclRefExpr 0x56518e85e670 <col:7> 'wrap_int':'int' lvalue Var 0x56518e85e560 'a' 'wrap_int':'int'
      | `-ImplicitCastExpr 0x56518e85e6c8 <col:12> 'wrap_int':'int' <LValueToRValue>
      |   `-DeclRefExpr 0x56518e85e690 <col:12> 'wrap_int':'int' lvalue Var 0x56518e85e5d8 'b' 'wrap_int':'int'
      `-CompoundStmt 0x56518e85e700 <col:15, line:11:3>


If you mark a bitfield wrapping, does it wrap?

Ah, nice corner case. This concerns the -fsanitize=implicit-bitfield-conversion sanitizer and is something I did not consider. So, currently the wraps attribute won't do anything on bitfields. I could add support for disabling this sanitizer for wrapping bitfields.

The fact that with this patch, a wrapping int is "just" an int is both the strength and weakness: you don't need to write out all these interactions... but the result is just based on clang's internal preferences for preserving type sugar, which are not documented.

In the cases where the attribute is "lost" it is similar to how const qualifiers disappear during assignments/type changes.

const int a;
int b = a; // b is not const
// or 
extern void foo(int some);
foo(a); // const-ness is stripped, @some is not const

In the cases where the attribute sticks around, it follows these rules:

  • Persists through Unary operators 👍
  • If any operand within a BinaryOperator expression are wrapping then the entire expression result is considered wrapping.
  • When used in assignments or passed as arguments to a function, the wrapping attribute will be present if the left-hand side or the function argument is wrapping. (similar to const, I suppose).

So, after reading all your cases my TODO list is:

  • add support for ternary operator results gaining the wrapping attribute if any of its operands have it
  • add support for disabling -fsanitize=implicit-bitfield-conversion when bitfields are marked as wrapping.

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Apr 24, 2024

If you declare a variable as both wrapping and non-wrapping, is it wrapping?

I am not sure how to do this. I am sure that with the magic of C anything is possible but I can't conceive a way to have a variable both have the attribute and not have the attribute (to be clear, there doesn't exist a __attribute__((no-wraps)) currently)

Say you declare a global variable like the following:

typedef int wrapint __attribute((wraps));
extern wrapint x;
int x = 0;

Is "x" wrapping?

If you mark a short wrapping, is the type after promotion wrapping?

This concerns the -fsanitize=implicit-signed-integer-truncation sanitizer which the wraps attribute disables. So the type boundaries pre- and post-promotion are not enforced by this sanitizer for wrapping types.

There are cases that don't involve truncation:

typedef short wrapshort __attribute((wraps));
wrapshort x = 0x7FFF;
int y = x*x*x*x;

@JustinStitt
Copy link
Contributor Author

If you declare a variable as both wrapping and non-wrapping, is it wrapping?

I am not sure how to do this. I am sure that with the magic of C anything is possible but I can't conceive a way to have a variable both have the attribute and not have the attribute (to be clear, there doesn't exist a __attribute__((no-wraps)) currently)

Say you declare a global variable like the following:

typedef int wrapint __attribute((wraps));
extern wrapint x;
int x = 0;

Is "x" wrapping?

No. But I am confused, isn't this just shadowing a global variable with a lesser-scoped one. Are they the same? What behavior do we want here?

If you mark a short wrapping, is the type after promotion wrapping?

This concerns the -fsanitize=implicit-signed-integer-truncation sanitizer which the wraps attribute disables. So the type boundaries pre- and post-promotion are not enforced by this sanitizer for wrapping types.

There are cases that don't involve truncation:

typedef short wrapshort __attribute((wraps));
wrapshort x = 0x7FFF;
int y = x*x*x*x;

Arithmetic cannot be performed on less-than-ints so the implicit promotions are put in place which lose the attribute. I just worked on an update to carry the wraps attribute through implicit integer promotion.

However, after making the patch and testing it I am not sure what we want. Do we want to cancel implicit promotions for less-than-ints that possess the wraps attribute (doesn't sound spec compliant)? Or, should wrapping things still undergo promotion but now wrap at different bounds than what their original type specified?

If I make a wrapping char [-128, 127] and do some arithmetic on it it will be promoted to int (and still have the wraps attribute, with my patch) then proceed.

In the example below, x*x*x*x*x is 33,038,369,407 which is greater than INT_MAX 2,147,483,647 and will thus overflow (a few times). However, this overflow is happening at the integer boundaries not at the char boundaries.

wrapchar x = 127;
int y = x*x*x*x*x; // result: -1321368961

The options are:

  1. don't carry the wraps attribute through implicit promotions
  2. carry the wraps attribute but use promoted-to-type's boundaries for wrapping
  3. carry attribute and use the pre-promotion type's boundaries

More on this, I don't see how changing implementation from attribute-based to type-based resolves any of these design decisions. Neither approach is necessarily easier to implement or to understand for developers, if the right decisions are made in terms of behavior then the attribute approach can be just as useful and predictable.

@efriedma-quic
Copy link
Collaborator

Attributes mostly do modify the type. The few that don't, like "aligned" and "may_alias", are a constant source of problems because they get accidentally stripped off. (I don't have any personal experience with "noderef".)

@JustinStitt
Copy link
Contributor Author

@efriedma-quic:

Attributes mostly do modify the type. The few that don't, like "aligned" and "may_alias", are a constant source of problems because they get accidentally stripped off. (I don't have any personal experience with "noderef".)

So do you believe the current approach is a no-go? I am happy to do whatever it takes to get this feature over the line but hear me out:

Any way of implementing this feature is subject to corner cases and can be broken -- like a lot of stuff in C. For its specific use case, this attribute provides immense power and clarity to existing code bases; the Linux Kernel would benefit massively as we could then enable multiple arithmetic sanitizers. With my custom wraps-enabled compiler and a syzkaller instance I've already located dozens of potential bugs that could be fixed with this attribute! (and ~hundreds of others with the unsigned-integer-overflow and implicit-integer-truncation sanitizers -- which I have yet to fuzz with).

The tests made by @kees (mostly kernel-tailored) and the tests I've checked in with this PR all pass without regression to existing integer sanitizer uses.

I'd love to hear more folk's opinions, too. With more feedback, we can make this feature and its documentation/testing as solid as possible.

@JustinStitt
Copy link
Contributor Author

JustinStitt commented Apr 26, 2024

In an effort to appease the build bot, I am adding a -W flag to turn of warn_wraps_attr_var_decl_type_not_integer which yells at users for trying to add __attribute__((wraps)) to things that don't really wrap in the traditional sense.

edit: the warning is -W{no-}useless-wraps-attribute which will enable/disable warnings in situations like this:

float __attribute__((wraps)) a;

@kees
Copy link
Contributor

kees commented Apr 29, 2024

My thinking about this attribute tends to follow from my desire not to change the C type system, but rather to adjust the behavior of the sanitizers. This means that it is possible to still build the Linux kernel without the sanitizers (the build just ignores the attribute), or with (where the attribute now has meaning). I feel like adding new types is a much larger/different thing, as it ends up requiring that the core language has to do something specific about overflow, etc. Under the sanitizers, there is a well-defined behavioral modification for overflow. Having the attribute augment the sanitizer means the semantics of regular C remain unchanged. We could even go so far as making it an error to encounter the attribute when the sanitizers aren't enabled.

@JustinStitt
Copy link
Contributor Author

Ping!

Any chance we could land this soon?

I know @efriedma-quic has some different ideas for the implementation of this feature. However, adding a new type is most likely overkill for what is needed out of wraps. I think the implementation/documentation and testing has landed in a pretty nice spot but am eager to hear what others have to think as well.

@bwendling as you recently implemented __counted_by is there anything you see missing with wraps (considering there's some overlap between the feature sets)?

@bwendling
Copy link
Collaborator

The code looks fine to me. But you should wait for @efriedma-quic to chime in as he had concerns over the feature.

@JustinStitt JustinStitt force-pushed the wraps-attribute-binop branch 2 times, most recently from 238e324 to 1df2f52 Compare May 20, 2024 18:43
@JustinStitt
Copy link
Contributor Author

JustinStitt commented May 20, 2024

FYI: I've rebased (+ squashed) and handled merge conflicts (mainly in ReleaseNotes.rst). This, of course, required a force push.

@JustinStitt
Copy link
Contributor Author

Hi, gentle ping.

@efriedma-quic, do you still have concerns with the current implementation (as an attribute attached to typedefs)? After limiting wraps to C and adding warnings to let users know when the attribute may be stripped, I think it's in a good spot.

typedef int __attribute__((wraps)) wrap_int;

extern void foo(int b);

int main(void) {
  wrap_int a;
  foo(a); // wraps.c:12:7: warning: 'wraps' attribute may be implicitly discarded \
          // when converted to 'int' [-Wimplicitly-discarded-wraps-attribute]

}

This feature is obviously niche and requires opting in with actual source modifications, this is not something people will use accidentally. However, Eli, I hear your concerns about some of the non-obvious ways attributes can get lost. I've tried to remedy this with some warnings (as seen above) but, of course, the semantics of attribute persistence are what they are. Eli, if you think this is not good enough then I will try and redesign the feature as an extended type. Although, I reckon the C type system has just as many quirks and foot-guns as the Clang attribute system.

@JustinStitt JustinStitt force-pushed the wraps-attribute-binop branch from 490fe5c to 6152bd2 Compare June 13, 2024 21:30
@JustinStitt JustinStitt force-pushed the wraps-attribute-binop branch 2 times, most recently from af16c49 to 58df542 Compare June 28, 2024 00:06
@JustinStitt
Copy link
Contributor Author

A good example of a Linux kernel contributor running into an overflow that should really have its type marked as wrapping : https://lore.kernel.org/all/[email protected]/

They could use this feature to mark the suspected field percpu_ref->percpu_ref_data->count as wrapping:

struct percpu_ref_data {
	atomic_long_t		count __attribute__((wraps));
        ...
};

// or mark the type `atomic_long_t` as wrapping
typedef atomic_t __attribute__((wraps)) atomic_long_t;

Reviewers, can I get a sense of where we are at with this feature? It lets developers better express the behavior of arithmetic in their code.

@JustinStitt JustinStitt force-pushed the wraps-attribute-binop branch from 58df542 to 97de899 Compare August 20, 2024 21:43
@hnrklssn
Copy link
Member

hnrklssn commented Aug 21, 2024

Regarding there not being a no-wraps attribute. What happens with code like this? Is the attribute lost / casted away during the addition?

wrapping_int a = INT_MAX;
a = (int) a + 1;

Does it affect converting a number too large to fit in the target type?

wrapping_int a = INT_MAX;
wrapping_short b = a;
short c = a;

@ojhunt
Copy link
Contributor

ojhunt commented Aug 21, 2024

I’m a little concerned about not allowing the attribute in C++ - the existence of other options in C++ does not mean they are an option (due to various and sundry restrictions of C++ version upgrades different projects have), but also you trivially end up in cases where header code is correct/safe in C, but undefined when included in C++.

To me that seems like a significant footgun.

@JustinStitt
Copy link
Contributor Author

JustinStitt commented Aug 22, 2024

Regarding there not being a no-wraps attribute. What happens with code like this? Is the attribute lost / casted away during the addition?

wrapping_int a = INT_MAX;
a = (int) a + 1;

Good question, the attribute is cast away (intentionally so). Additionally, sanitizers will warn of the overflow since the addition no longer contains a wraps-annotated type/variable. The result of the addition is the same regardless: -2147483648

Does it affect converting a number too large to fit in the target type?

wrapping_int a = INT_MAX;
wrapping_short b = a;
short c = a;

No, this attribute really doesn't modify any arithmetic. It just tells the sanitizer not to instrument (specifically the -fsanitize=implicit-signed-integer-truncation sanitizer in this case)

  wrapping_int a = INT_MAX;
  wrapping_short b = a;
  short c = a;
  printf("%d\n%hd\n%hd\n", a, b, c);

/*
 * 2147483647
 * -1
 * -1
*/

@JustinStitt
Copy link
Contributor Author

I’m a little concerned about not allowing the attribute in C++ - the existence of other options in C++ does not mean they are an option (due to various and sundry restrictions of C++ version upgrades different projects have), but also you trivially end up in cases where header code is correct/safe in C, but undefined when included in C++.

To me that seems like a significant footgun.

Yes, great point. The attribute should support C++ fairly well right now but the frontend doesn't allow it to be enabled because I am not certain how the magics of the c++ type/attribute system will interact with it. I figured a good starting point is to get the attribute working with C.

other comment in thread about avoiding issues by disabling c++ support: #86618 (comment)

@kees
Copy link
Contributor

kees commented Sep 3, 2024

I’m a little concerned about not allowing the attribute in C++ - the existence of other options in C++ does not mean they are an option (due to various and sundry restrictions of C++ version upgrades different projects have), but also you trivially end up in cases where header code is correct/safe in C, but undefined when included in C++.
To me that seems like a significant footgun.

Yes, great point. The attribute should support C++ fairly well right now but the frontend doesn't allow it to be enabled because I am not certain how the magics of the c++ type/attribute system will interact with it. I figured a good starting point is to get the attribute working with C.

other comment in thread about avoiding issues by disabling c++ support: #86618 (comment)

So looking at what it'd take for C++ support, I don't think it's worth it right now. We already don't parse it in the front-end under C++ so there's no risk of some behavior changing -- it cannot be defined at all right now there. Additionally, since this was introduced because of C's lack of operator overloading, it feels redundant to add this to C++. And finally, there is no C++ user that actually wants this attribute.

So let's leave it as-is with only C support: that's where it's wanted, and that's where it will be exclusively used. If someone wants to add it in the future, it can happen then.

Signed-off-by: Justin Stitt <[email protected]>
vitalybuka pushed a commit that referenced this pull request Nov 4, 2024
…itizers w/ SSCLs (#107332)

[Related
RFC](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683/5?u=justinstitt)

### Summary

Implement type-based filtering via [Sanitizer Special Case
Lists](https://clang.llvm.org/docs/SanitizerSpecialCaseList.html) for
the arithmetic overflow and truncation sanitizers.

Currently, using the `type:` prefix with these sanitizers does nothing.
I've hooked up the SSCL parsing with Clang codegen so that we don't emit
the overflow/truncation checks if the arithmetic contains an ignored
type.

### Usefulness

You can craft ignorelists that ignore specific types that are expected
to overflow or wrap-around. For example, to ignore `my_type` from
`unsigned-integer-overflow` instrumentation:
```bash
$ cat ignorelist.txt
[unsigned-integer-overflow]
type:my_type=no_sanitize

$ cat foo.c
typedef unsigned long my_type;
void foo() {
  my_type a = ULONG_MAX;
  ++a;
}

$ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out
// --> no sanitizer error
```

If a type is functionally intended to overflow, like
[refcount_t](https://kernsec.org/wiki/index.php/Kernel_Protections/refcount_t)
and its associated APIs in the Linux kernel, then this type filtering
would prove useful for reducing sanitizer noise. Currently, the Linux
kernel dealt with this by
[littering](https://elixir.bootlin.com/linux/v6.10.8/source/include/linux/refcount.h#L139
) `__attribute__((no_sanitize("signed-integer-overflow")))` annotations
on all the `refcount_t` APIs. I think this serves as an example of how a
codebase could be made cleaner. We could make custom types that are
filtered out in an ignorelist, allowing for types to be more expressive
-- without the need for annotations. This accomplishes a similar goal to
#86618.


Yet another use case for this type filtering is whitelisting. We could
ignore _all_ types, save a few.

```bash
$ cat ignorelist.txt
[implicit-signed-integer-truncation]
type:*=no_sanitize # ignore literally all types
type:short=sanitize # except `short`

$ cat bar.c
// compile with -fsanitize=implicit-signed-integer-truncation
void bar(int toobig) {
  char a = toobig;  // not instrumented
  short b = toobig; // instrumented
}
```

### Other ways to accomplish the goal of sanitizer
allowlisting/whitelisting
* ignore list SSCL type support (this PR that you're reading)
* [my sanitize-allowlist
branch](main...JustinStitt:llvm-project:sanitize-allowlist)
- this just implements a sibling flag `-fsanitize-allowlist=`, removing
some of the double negative logic present with `skip`/`ignore` when
trying to whitelist something.
* [Glob
Negation](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683)
- Implement a negation operator to the GlobPattern class so the
ignorelist query can use them to simulate allowlisting


Please let me know which of the three options we like best. They are not
necessarily mutually exclusive.

Here's [another related
PR](#86618) which implements a
`wraps` attribute. This can accomplish a similar goal to this PR but
requires in-source changes to codebases and also covers a wider variety
of integer definedness problems.

### CCs
@kees @vitalybuka @bwendling

---------

Signed-off-by: Justin Stitt <[email protected]>
@JustinStitt
Copy link
Contributor Author

superseded by #115094

@JustinStitt JustinStitt closed this Nov 6, 2024
PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024
…itizers w/ SSCLs (llvm#107332)

[Related
RFC](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683/5?u=justinstitt)

### Summary

Implement type-based filtering via [Sanitizer Special Case
Lists](https://clang.llvm.org/docs/SanitizerSpecialCaseList.html) for
the arithmetic overflow and truncation sanitizers.

Currently, using the `type:` prefix with these sanitizers does nothing.
I've hooked up the SSCL parsing with Clang codegen so that we don't emit
the overflow/truncation checks if the arithmetic contains an ignored
type.

### Usefulness

You can craft ignorelists that ignore specific types that are expected
to overflow or wrap-around. For example, to ignore `my_type` from
`unsigned-integer-overflow` instrumentation:
```bash
$ cat ignorelist.txt
[unsigned-integer-overflow]
type:my_type=no_sanitize

$ cat foo.c
typedef unsigned long my_type;
void foo() {
  my_type a = ULONG_MAX;
  ++a;
}

$ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out
// --> no sanitizer error
```

If a type is functionally intended to overflow, like
[refcount_t](https://kernsec.org/wiki/index.php/Kernel_Protections/refcount_t)
and its associated APIs in the Linux kernel, then this type filtering
would prove useful for reducing sanitizer noise. Currently, the Linux
kernel dealt with this by
[littering](https://elixir.bootlin.com/linux/v6.10.8/source/include/linux/refcount.h#L139
) `__attribute__((no_sanitize("signed-integer-overflow")))` annotations
on all the `refcount_t` APIs. I think this serves as an example of how a
codebase could be made cleaner. We could make custom types that are
filtered out in an ignorelist, allowing for types to be more expressive
-- without the need for annotations. This accomplishes a similar goal to
llvm#86618.


Yet another use case for this type filtering is whitelisting. We could
ignore _all_ types, save a few.

```bash
$ cat ignorelist.txt
[implicit-signed-integer-truncation]
type:*=no_sanitize # ignore literally all types
type:short=sanitize # except `short`

$ cat bar.c
// compile with -fsanitize=implicit-signed-integer-truncation
void bar(int toobig) {
  char a = toobig;  // not instrumented
  short b = toobig; // instrumented
}
```

### Other ways to accomplish the goal of sanitizer
allowlisting/whitelisting
* ignore list SSCL type support (this PR that you're reading)
* [my sanitize-allowlist
branch](llvm/llvm-project@main...JustinStitt:llvm-project:sanitize-allowlist)
- this just implements a sibling flag `-fsanitize-allowlist=`, removing
some of the double negative logic present with `skip`/`ignore` when
trying to whitelist something.
* [Glob
Negation](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683)
- Implement a negation operator to the GlobPattern class so the
ignorelist query can use them to simulate allowlisting


Please let me know which of the three options we like best. They are not
necessarily mutually exclusive.

Here's [another related
PR](llvm#86618) which implements a
`wraps` attribute. This can accomplish a similar goal to this PR but
requires in-source changes to codebases and also covers a wider variety
of integer definedness problems.

### CCs
@kees @vitalybuka @bwendling

---------

Signed-off-by: Justin Stitt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants