Skip to content

Implement src:*=sanitize for UBSan #139772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion clang/lib/Basic/NoSanitizeList.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,8 @@ bool NoSanitizeList::containsFunction(SanitizerMask Mask,

bool NoSanitizeList::containsFile(SanitizerMask Mask, StringRef FileName,
StringRef Category) const {
return SSCL->inSection(Mask, "src", FileName, Category);
bool allowList = SSCL->inSection(Mask, "src", FileName, "sanitize");
return SSCL->inSection(Mask, "src", FileName, Category) && !allowList;
}

bool NoSanitizeList::containsMainFile(SanitizerMask Mask, StringRef FileName,
Expand Down
32 changes: 32 additions & 0 deletions clang/test/CodeGen/ubsan-src-ignorelist-category.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
// RUN: rm -rf %t
// RUN: split-file %s %t
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test1.c -o - | FileCheck %s -check-prefix=CHECK-ALLOWLIST
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test2.c -o - | FileCheck %s -check-prefix=CHECK-IGNORELIST
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict -emit-llvm %t/test1.c -o - | FileCheck %s -check-prefix=CHECK-ALLOWLIST
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict -emit-llvm %t/test2.c -o - | FileCheck %s -check-prefix=CHECK-IGNORELIST


// Verify ubsan only emits checks for files in the allowlist

//--- src.ignorelist
src:*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we have a problem with:

src:*
src:*/mylib/*=sanitize
src:*/mylib/test.cc

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I have some ideas how to fix this:

  1. Right now order does not matter, if we say that it's matter now, we don't break existing stuff
  2. Let's make =sanitize reset all above and =
  3. So inSection need to return linenum in file which will compare with linenum of =sanitize. There is a version of function for that

Problem: SanitizerSpecialCaseList tracks linenum, but if you check parsing code, it can represent multiple files!

I suggest to start with updating SpecialCaseList to track (fileindex, linenum).

But, SpecialCaseList::Sections for whatever reasons is StringMap,
so it's ordered by name, and it's not used anyway.

Let's first PR to make SpecialCaseList::Sections -> vector

similar SanitizerSpecialCaseList?

Copy link
Member Author

@qinkunbao qinkunbao May 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we have a problem with:

src:*
src:*/mylib/*=sanitize
src:*/mylib/test.cc

For this example, I think /mylib/test.cc will be instrumented because the file /mylib/test.cc will fall into two categories (= and =sanitize) with this PR. And it should follow the =sanitizer category. I think it should match the current doc. (I have updated the test to cover the situation. )

I was a little hesitated about the change during implementing the feature. For example, if we have a ignore list file

type:int
src:test.cc=sanitize

If a file test.cc has the type int, shall we still instrument the type int variable in the file test.cc?

On the other hand,

src:test.cc
type:int=sanitize

Shall we still instrument the type int variable in the file test.cc?

I think the both answers are yes. Let me know if it makes sense to you.

I will create a new PR with using user branches in llvm/llvm-project to update SpecialCaseList::Sections. Thank you for the feedback.

Copy link
Member Author

@qinkunbao qinkunbao May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, SpecialCaseList::Sections for whatever reasons is StringMap,
so it's ordered by name, and it's not used anyway.

Oh, I find the key of the StringMap Sections is actually used. Here is the Link. Here are some options. Let me know which option is better.

  1. How about creating a vector<Section*> to track the order?
StringMap<Section> Sections;
vector<Section*> SectionsVec;

  1. We can use the key from Globs
    auto [It, DidEmplace] = Globs.try_emplace(Pattern);

src:*/test1.c=sanitize

//--- src.ignorelist.contradict
src:*
src:*/test1.c=sanitize
src:*/test1.c

//--- test1.c
int add1(int a, int b) {
// CHECK-ALLOWLIST: llvm.sadd.with.overflow.i32
return a+b;
}

//--- test2.c
int add2(int a, int b) {
// CHECK-IGNORELIST-NOT: llvm.sadd.with.overflow.i32
return a+b;
}