Skip to content

[-Wunsafe-buffer-usage] Add alloc_size knowledge to the 2-param span constructor warning #114894

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 7, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
123 changes: 111 additions & 12 deletions clang/lib/Analysis/UnsafeBufferUsage.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very interesting. Potentially an API that could be exposed to other analyses too!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please rebase this PR? The isSafeSpanTwoParamConstruct matcher has changes on the main branch that are not present here. For instance, case 6 is now used by :std::span<T>{std::addressof(...), 1} on the main branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup!

Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,9 @@
//===----------------------------------------------------------------------===//

#include "clang/Analysis/Analyses/UnsafeBufferUsage.h"
#include "clang/AST/APValue.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/Attr.h"
#include "clang/AST/Decl.h"
#include "clang/AST/DynamicRecursiveASTVisitor.h"
#include "clang/AST/Expr.h"
Expand Down Expand Up @@ -353,23 +355,90 @@ isInUnspecifiedUntypedContext(internal::Matcher<Stmt> InnerMatcher) {
return stmt(anyOf(CompStmt, IfStmtThen, IfStmtElse));
}

// Returns true iff integer E1 is equivalent to integer E2.
//
// For now we only support such expressions:
// expr := DRE | const-value | expr BO expr
// BO := '*' | '+'
//
// FIXME: We can reuse the expression comparator of the interop analysis after
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziqingluo-90 Is the expression comparator of the interop analysis the same or different from what is done here?

I believe this expression comparison can result in exponential work in the size of an expression.. AreEquaIntegralBinaryOperator makes 2 calls to AreEqualIntegers per subexpression. AreEqualIntegers can call back into EqualIntegeralBinaryOperator on subexpressions of its argument.

If the plan is to eventually replace this code with the expression comparator from interop analysis, then this is not worth fixing now. However, if the expression comparator has the same issue or that is not the plan, then this should be fixed. It can be fixed be bounding the amount of work that is allowed, for example, by limiting the recursive depth that is allowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparator used in the interop analysis is more general. But it does not know anything about commutativity. So the complexity issue will still be there if we use the same algorithm as here. I plan to apply commutativity to only leaf nodes to fix the issue.

For associativity---yes, it's dangerous! So I do not plan to support it. I think my algorithm here does not introduce associativity otherwise I'd like to fix it right away.

// it has been upstreamed.
static bool areEqualIntegers(const Expr *E1, const Expr *E2, ASTContext &Ctx);
static bool areEqualIntegralBinaryOperators(const BinaryOperator *E1,
const Expr *E2_LHS,
BinaryOperatorKind BOP,
const Expr *E2_RHS,
ASTContext &Ctx) {
if (E1->getOpcode() == BOP) {
switch (BOP) {
// Commutative operators:
case BO_Mul:
case BO_Add:
return (areEqualIntegers(E1->getLHS(), E2_LHS, Ctx) &&
areEqualIntegers(E1->getRHS(), E2_RHS, Ctx)) ||
(areEqualIntegers(E1->getLHS(), E2_RHS, Ctx) &&
areEqualIntegers(E1->getRHS(), E2_LHS, Ctx));
default:
return false;
}
}
return false;
}

static bool areEqualIntegers(const Expr *E1, const Expr *E2, ASTContext &Ctx) {
E1 = E1->IgnoreParenImpCasts();
E2 = E2->IgnoreParenImpCasts();
if (!E1->getType()->isIntegerType() || E1->getType() != E2->getType())
return false;

Expr::EvalResult ER1, ER2;

// If both are constants:
if (E1->EvaluateAsInt(ER1, Ctx) &&
E2->EvaluateAsInt(ER2, Ctx))
return ER1.Val.getInt() == ER2.Val.getInt();

// Otherwise, they should have identical stmt kind:
if (E1->getStmtClass() != E2->getStmtClass())
return false;
switch (E1->getStmtClass()) {
case Stmt::DeclRefExprClass:
return cast<DeclRefExpr>(E1)->getDecl() == cast<DeclRefExpr>(E2)->getDecl();
case Stmt::BinaryOperatorClass: {
auto BO2 = cast<BinaryOperator>(E2);
return areEqualIntegralBinaryOperators(cast<BinaryOperator>(E1),
BO2->getLHS(), BO2->getOpcode(),
BO2->getRHS(), Ctx);
}
default:
return false;
}
}

// Given a two-param std::span construct call, matches iff the call has the
// following forms:
// 1. `std::span<T>{new T[n], n}`, where `n` is a literal or a DRE
// 2. `std::span<T>{new T, 1}`
// 3. `std::span<T>{&var, 1}`
// 3. `std::span<T>{&var, 1}` or `std::span<T>{std::addressof(...), 1}`
// 4. `std::span<T>{a, n}`, where `a` is of an array-of-T with constant size
// `n`
// 5. `std::span<T>{any, 0}`
// 6. `std::span<T>{std::addressof(...), 1}`
// 6. `std::span<T>{ (char *)f(args), args[N] * arg*[M]}`, where
// `f` is a function with attribute `alloc_size(N, M)`;
// `args` represents the list of arguments;
// `N, M` are parameter indexes to the allocating element number and size.
// Sometimes, there is only one parameter index representing the total
// size.
AST_MATCHER(CXXConstructExpr, isSafeSpanTwoParamConstruct) {
assert(Node.getNumArgs() == 2 &&
"expecting a two-parameter std::span constructor");
const Expr *Arg0 = Node.getArg(0)->IgnoreImplicit();
const Expr *Arg1 = Node.getArg(1)->IgnoreImplicit();
auto HaveEqualConstantValues = [&Finder](const Expr *E0, const Expr *E1) {
if (auto E0CV = E0->getIntegerConstantExpr(Finder->getASTContext()))
if (auto E1CV = E1->getIntegerConstantExpr(Finder->getASTContext())) {
const Expr *Arg0 = Node.getArg(0)->IgnoreParenImpCasts();
const Expr *Arg1 = Node.getArg(1)->IgnoreParenImpCasts();
ASTContext &Ctx = Finder->getASTContext();

auto HaveEqualConstantValues = [&Ctx](const Expr *E0, const Expr *E1) {
if (auto E0CV = E0->getIntegerConstantExpr(Ctx))
if (auto E1CV = E1->getIntegerConstantExpr(Ctx)) {
return APSInt::compareValues(*E0CV, *E1CV) == 0;
}
return false;
Expand All @@ -381,13 +450,14 @@ AST_MATCHER(CXXConstructExpr, isSafeSpanTwoParamConstruct) {
}
return false;
};
std::optional<APSInt> Arg1CV =
Arg1->getIntegerConstantExpr(Finder->getASTContext());
std::optional<APSInt> Arg1CV = Arg1->getIntegerConstantExpr(Ctx);

if (Arg1CV && Arg1CV->isZero())
// Check form 5:
return true;
switch (Arg0->IgnoreImplicit()->getStmtClass()) {

// Check forms 1-3:
switch (Arg0->getStmtClass()) {
case Stmt::CXXNewExprClass:
if (auto Size = cast<CXXNewExpr>(Arg0)->getArraySize()) {
// Check form 1:
Expand All @@ -407,6 +477,7 @@ AST_MATCHER(CXXConstructExpr, isSafeSpanTwoParamConstruct) {
return Arg1CV && Arg1CV->isOne();
break;
case Stmt::CallExprClass:
// Check form 3:
if (const auto *CE = dyn_cast<CallExpr>(Arg0)) {
const auto FnDecl = CE->getDirectCallee();
if (FnDecl && FnDecl->getNameAsString() == "addressof" &&
Expand All @@ -421,13 +492,41 @@ AST_MATCHER(CXXConstructExpr, isSafeSpanTwoParamConstruct) {

QualType Arg0Ty = Arg0->IgnoreImplicit()->getType();

if (auto *ConstArrTy =
Finder->getASTContext().getAsConstantArrayType(Arg0Ty)) {
if (auto *ConstArrTy = Ctx.getAsConstantArrayType(Arg0Ty)) {
const APSInt ConstArrSize = APSInt(ConstArrTy->getSize());

// Check form 4:
return Arg1CV && APSInt::compareValues(ConstArrSize, *Arg1CV) == 0;
}
// Check form 6:
if (auto CCast = dyn_cast<CStyleCastExpr>(Arg0)) {
if (!CCast->getType()->isPointerType())
return false;

QualType PteTy = CCast->getType()->getPointeeType();

if (!(PteTy->isConstantSizeType() && Ctx.getTypeSizeInChars(PteTy).isOne()))
return false;

if (const auto *Call = dyn_cast<CallExpr>(CCast->getSubExpr())) {
if (const FunctionDecl *FD = Call->getDirectCallee())
if (auto *AllocAttr = FD->getAttr<AllocSizeAttr>()) {
const Expr *EleSizeExpr =
Call->getArg(AllocAttr->getElemSizeParam().getASTIndex());
// NumElemIdx is invalid if AllocSizeAttr has 1 argument:
ParamIdx NumElemIdx = AllocAttr->getNumElemsParam();

if (!NumElemIdx.isValid())
return areEqualIntegers(Arg1, EleSizeExpr, Ctx);

const Expr *NumElesExpr = Call->getArg(NumElemIdx.getASTIndex());

if (auto BO = dyn_cast<BinaryOperator>(Arg1))
return areEqualIntegralBinaryOperators(BO, NumElesExpr, BO_Mul,
EleSizeExpr, Ctx);
}
}
}
return false;
}

Expand Down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in cases where the expression are equivalent but are composed differently. For instance, my_alloc( x + y + z), z + y + x}). Can you add this test case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. It cannot figure out this case. It's too complicated for the simple expression comparison function.
I add this test as unsupported.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is fine to not handle this for now.

If we do decide to handle this, we need to be careful because this equivalence does not hold for signed integer arithmetic expressions in C. The equivalence relies on commutativity and associativity of integer arithmetic. Reassociation can change whether overflow occurs for integer expressions at runtime and signed integer overflow is undefined behavior in C. If, however, we assume that signed integer arithmetic is 2's complement arithmetic (which can be specified via a compiler flag), then this equivalence holds for signed integer expressions.

Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ namespace std {
_Tp* addressof(_Tp& __x) {
return &__x;
}

}

namespace irrelevant_constructors {
Expand Down Expand Up @@ -154,6 +154,44 @@ namespace construct_wt_begin_end {
}
} // namespace construct_wt_begin_end

namespace test_alloc_size_attr {
void * my_alloc(unsigned size) __attribute__((alloc_size(1)));
void * my_alloc2(unsigned count, unsigned size) __attribute__((alloc_size(1,2)));

void safe(int x, unsigned y) {
std::span<char>{(char *)my_alloc(10), 10};
std::span<char>{(char *)my_alloc(x), x};
std::span<char>{(char *)my_alloc(x * y), x * y};
std::span<char>{(char *)my_alloc(x * y), y * x};
std::span<char>{(char *)my_alloc(x * y + x), x * y + x};
std::span<char>{(char *)my_alloc(x * y + x), x + y * x};

std::span<char>{(char *)my_alloc2(x, y), x * y};
std::span<char>{(char *)my_alloc2(x, y), y * x};
//foo(std::span<char>{(char *)my_alloc2(x, sizeof(char)), x}); // lets not worry about this case for now
std::span<char>{(char *)my_alloc2(x, sizeof(char)), x * sizeof(char)};
//foo(std::span<char>{(char *)my_alloc2(10, sizeof(char)), 10});
std::span<char>{(char *)my_alloc2(10, sizeof(char)), 10 * sizeof(char)};
}

void unsafe(int x, int y) {
std::span<char>{(char *)my_alloc(10), 11}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
std::span<char>{(char *)my_alloc(x * y), x + y}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
std::span<int>{(int *)my_alloc(x), x}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
std::span<char>{(char *)my_alloc2(x, y), x + y}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
}

void unsupport(int x, int y, int z) {
// Casting to `T*` where sizeof(T) > 1 is not supported yet:
std::span<int>{(int *)my_alloc2(x, y), x * y}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
std::span<long>{(long *)my_alloc(10 * sizeof(long)), 10}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
std::span<long>{(long *)my_alloc2(x, sizeof(long)), x}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
std::span<long>{(long *)my_alloc2(x, sizeof(long)), x}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
// The expression is too complicated:
std::span<char>{(char *)my_alloc(x + y + z), z + y + x}; // expected-warning{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}}
}
}

namespace test_flag {
void f(int *p) {
#pragma clang diagnostic push
Expand Down
Loading