Skip to content

Commit 038c48c

Browse files
authored
[TBAA] Emit distinct TBAA tags for pointers with different depths,types. (#76612)
This patch extends Clang's TBAA generation code to emit distinct tags for incompatible pointer types. Pointers with different element types are incompatible if the pointee types are also incompatible (modulo sugar/modifiers). Express this in TBAA by generating different tags for pointers based on the pointer depth and pointee type. To get the TBAA tag for the pointee type it uses getTypeInfoHelper on the pointee type. (Moved from https://reviews.llvm.org/D122573) PR: #76612
1 parent 5b82741 commit 038c48c

File tree

9 files changed

+565
-472
lines changed

9 files changed

+565
-472
lines changed

clang/docs/ReleaseNotes.rst

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -425,6 +425,10 @@ Non-comprehensive list of changes in this release
425425
- Added support for ``TypeLoc::dump()`` for easier debugging, and improved
426426
textual and JSON dumping for various ``TypeLoc``-related nodes.
427427

428+
- Clang can now emit distinct type-based alias analysis tags for incompatible
429+
pointers, enabling more powerful alias analysis when accessing pointer types.
430+
The new behavior can be enabled using ``-fpointer-tbaa``.
431+
428432
New Compiler Flags
429433
------------------
430434
- ``-fsanitize=implicit-bitfield-conversion`` checks implicit truncation and
@@ -467,6 +471,9 @@ New Compiler Flags
467471
- For the ARM target, added ``-Warm-interrupt-vfp-clobber`` that will emit a
468472
diagnostic when an interrupt handler is declared and VFP is enabled.
469473

474+
- ``-fpointer-tbaa`` enables emission of distinct type-based alias
475+
analysis tags for incompatible pointers.
476+
470477
Deprecated Compiler Flags
471478
-------------------------
472479

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -233,6 +233,7 @@ ENUM_CODEGENOPT(StructReturnConvention, StructReturnConventionKind, 2, SRCK_Defa
233233

234234
CODEGENOPT(RelaxAll , 1, 0) ///< Relax all machine code instructions.
235235
CODEGENOPT(RelaxedAliasing , 1, 0) ///< Set when -fno-strict-aliasing is enabled.
236+
CODEGENOPT(PointerTBAA, 1, 0) ///< Whether or not to use distinct TBAA tags for pointers.
236237
CODEGENOPT(StructPathTBAA , 1, 0) ///< Whether or not to use struct-path TBAA.
237238
CODEGENOPT(NewStructPathTBAA , 1, 0) ///< Whether or not to use enhanced struct-path TBAA.
238239
CODEGENOPT(SaveTempLabels , 1, 0) ///< Save temporary labels.

clang/include/clang/Driver/Options.td

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3379,6 +3379,7 @@ def fstruct_path_tbaa : Flag<["-"], "fstruct-path-tbaa">, Group<f_Group>;
33793379
def fno_struct_path_tbaa : Flag<["-"], "fno-struct-path-tbaa">, Group<f_Group>;
33803380
def fno_strict_enums : Flag<["-"], "fno-strict-enums">, Group<f_Group>;
33813381
def fno_strict_overflow : Flag<["-"], "fno-strict-overflow">, Group<f_Group>;
3382+
def fno_pointer_tbaa : Flag<["-"], "fno-pointer-tbaa">, Group<f_Group>;
33823383
def fno_temp_file : Flag<["-"], "fno-temp-file">, Group<f_Group>,
33833384
Visibility<[ClangOption, CC1Option, CLOption, DXCOption]>, HelpText<
33843385
"Directly create compilation output files. This may lead to incorrect incremental builds if the compiler crashes">,
@@ -3883,6 +3884,7 @@ defm strict_vtable_pointers : BoolFOption<"strict-vtable-pointers",
38833884
" overwriting polymorphic C++ objects">,
38843885
NegFlag<SetFalse>>;
38853886
def fstrict_overflow : Flag<["-"], "fstrict-overflow">, Group<f_Group>;
3887+
def fpointer_tbaa : Flag<["-"], "fpointer-tbaa">, Group<f_Group>;
38863888
def fdriver_only : Flag<["-"], "fdriver-only">, Flags<[NoXarchOption]>,
38873889
Visibility<[ClangOption, CLOption, DXCOption]>,
38883890
Group<Action_Group>, HelpText<"Only run the driver.">;
@@ -7146,6 +7148,9 @@ def fuse_register_sized_bitfield_access: Flag<["-"], "fuse-register-sized-bitfie
71467148
def relaxed_aliasing : Flag<["-"], "relaxed-aliasing">,
71477149
HelpText<"Turn off Type Based Alias Analysis">,
71487150
MarshallingInfoFlag<CodeGenOpts<"RelaxedAliasing">>;
7151+
def pointer_tbaa: Flag<["-"], "pointer-tbaa">,
7152+
HelpText<"Turn on Type Based Alias Analysis for pointer accesses">,
7153+
MarshallingInfoFlag<CodeGenOpts<"PointerTBAA">>;
71497154
def no_struct_path_tbaa : Flag<["-"], "no-struct-path-tbaa">,
71507155
HelpText<"Turn off struct-path aware Type Based Alias Analysis">,
71517156
MarshallingInfoNegativeFlag<CodeGenOpts<"StructPathTBAA">>;

clang/lib/CodeGen/CodeGenTBAA.cpp

Lines changed: 50 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -185,10 +185,56 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
185185
return getChar();
186186

187187
// Handle pointers and references.
188-
// TODO: Implement C++'s type "similarity" and consider dis-"similar"
189-
// pointers distinct.
190-
if (Ty->isPointerType() || Ty->isReferenceType())
191-
return createScalarTypeNode("any pointer", getChar(), Size);
188+
//
189+
// C has a very strict rule for pointer aliasing. C23 6.7.6.1p2:
190+
// For two pointer types to be compatible, both shall be identically
191+
// qualified and both shall be pointers to compatible types.
192+
//
193+
// This rule is impractically strict; we want to at least ignore CVR
194+
// qualifiers. Distinguishing by CVR qualifiers would make it UB to
195+
// e.g. cast a `char **` to `const char * const *` and dereference it,
196+
// which is too common and useful to invalidate. C++'s similar types
197+
// rule permits qualifier differences in these nested positions; in fact,
198+
// C++ even allows that cast as an implicit conversion.
199+
//
200+
// Other qualifiers could theoretically be distinguished, especially if
201+
// they involve a significant representation difference. We don't
202+
// currently do so, however.
203+
//
204+
// Computing the pointee type string recursively is implicitly more
205+
// forgiving than the standards require. Effectively, we are turning
206+
// the question "are these types compatible/similar" into "are
207+
// accesses to these types allowed to alias". In both C and C++,
208+
// the latter question has special carve-outs for signedness
209+
// mismatches that only apply at the top level. As a result, we are
210+
// allowing e.g. `int *` l-values to access `unsigned *` objects.
211+
if (Ty->isPointerType() || Ty->isReferenceType()) {
212+
llvm::MDNode *AnyPtr = createScalarTypeNode("any pointer", getChar(), Size);
213+
if (!CodeGenOpts.PointerTBAA)
214+
return AnyPtr;
215+
// Compute the depth of the pointer and generate a tag of the form "p<depth>
216+
// <base type tag>".
217+
unsigned PtrDepth = 0;
218+
do {
219+
PtrDepth++;
220+
Ty = Ty->getPointeeType().getTypePtr();
221+
} while (Ty->isPointerType());
222+
// TODO: Implement C++'s type "similarity" and consider dis-"similar"
223+
// pointers distinct for non-builtin types.
224+
if (isa<BuiltinType>(Ty)) {
225+
llvm::MDNode *ScalarMD = getTypeInfoHelper(Ty);
226+
StringRef Name =
227+
cast<llvm::MDString>(
228+
ScalarMD->getOperand(CodeGenOpts.NewStructPathTBAA ? 2 : 0))
229+
->getString();
230+
SmallString<256> OutName("p");
231+
OutName += std::to_string(PtrDepth);
232+
OutName += " ";
233+
OutName += Name;
234+
return createScalarTypeNode(OutName, AnyPtr, Size);
235+
}
236+
return AnyPtr;
237+
}
192238

193239
// Accesses to arrays are accesses to objects of their element types.
194240
if (CodeGenOpts.NewStructPathTBAA && Ty->isArrayType())

clang/lib/Driver/ToolChains/Clang.cpp

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5721,6 +5721,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
57215721
if (!Args.hasFlag(options::OPT_fstrict_aliasing, StrictAliasingAliasOption,
57225722
options::OPT_fno_strict_aliasing, !IsWindowsMSVC))
57235723
CmdArgs.push_back("-relaxed-aliasing");
5724+
if (Args.hasFlag(options::OPT_fpointer_tbaa, options::OPT_fno_pointer_tbaa,
5725+
false))
5726+
CmdArgs.push_back("-pointer-tbaa");
57245727
if (!Args.hasFlag(options::OPT_fstruct_path_tbaa,
57255728
options::OPT_fno_struct_path_tbaa, true))
57265729
CmdArgs.push_back("-no-struct-path-tbaa");

clang/test/CodeGen/sanitize-metadata-nosanitize.c

Lines changed: 16 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@
1212
//.
1313
// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none)
1414
// CHECK-LABEL: define dso_local void @escape
15-
// CHECK-SAME: (ptr noundef [[P:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] !pcsections !2 {
15+
// CHECK-SAME: (ptr noundef [[P:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] !pcsections [[META2:![0-9]+]] {
1616
// CHECK-NEXT: entry:
1717
// CHECK-NEXT: ret void
1818
//
@@ -23,13 +23,13 @@ __attribute__((noinline, not_tail_called)) void escape(const volatile void *p) {
2323

2424
// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none)
2525
// CHECK-LABEL: define dso_local i32 @normal_function
26-
// CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] !pcsections !4 {
26+
// CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] !pcsections [[META4:![0-9]+]] {
2727
// CHECK-NEXT: entry:
2828
// CHECK-NEXT: [[X_ADDR:%.*]] = alloca ptr, align 8
2929
// CHECK-NEXT: store ptr [[X]], ptr [[X_ADDR]], align 8, !tbaa [[TBAA6:![0-9]+]]
30-
// CHECK-NEXT: store atomic i32 1, ptr [[X]] monotonic, align 4, !pcsections !10
30+
// CHECK-NEXT: store atomic i32 1, ptr [[X]] monotonic, align 4, !pcsections [[META11:![0-9]+]]
3131
// CHECK-NEXT: notail call void @escape(ptr noundef nonnull [[X_ADDR]])
32-
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA11:![0-9]+]]
32+
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA12:![0-9]+]]
3333
// CHECK-NEXT: ret i32 [[TMP0]]
3434
//
3535
int normal_function(int *x, int *y) {
@@ -46,7 +46,7 @@ int normal_function(int *x, int *y) {
4646
// CHECK-NEXT: store ptr [[X]], ptr [[X_ADDR]], align 8, !tbaa [[TBAA6]]
4747
// CHECK-NEXT: store atomic i32 1, ptr [[X]] monotonic, align 4
4848
// CHECK-NEXT: notail call void @escape(ptr noundef nonnull [[X_ADDR]])
49-
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA11]]
49+
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA12]]
5050
// CHECK-NEXT: ret i32 [[TMP0]]
5151
//
5252
__attribute__((disable_sanitizer_instrumentation)) int test_disable_sanitize_instrumentation(int *x, int *y) {
@@ -57,13 +57,13 @@ __attribute__((disable_sanitizer_instrumentation)) int test_disable_sanitize_ins
5757

5858
// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none)
5959
// CHECK-LABEL: define dso_local i32 @test_no_sanitize_thread
60-
// CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR3:[0-9]+]] !pcsections !13 {
60+
// CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR3:[0-9]+]] !pcsections [[META14:![0-9]+]] {
6161
// CHECK-NEXT: entry:
6262
// CHECK-NEXT: [[X_ADDR:%.*]] = alloca ptr, align 8
6363
// CHECK-NEXT: store ptr [[X]], ptr [[X_ADDR]], align 8, !tbaa [[TBAA6]]
64-
// CHECK-NEXT: store atomic i32 1, ptr [[X]] monotonic, align 4, !pcsections !10
64+
// CHECK-NEXT: store atomic i32 1, ptr [[X]] monotonic, align 4, !pcsections [[META11]]
6565
// CHECK-NEXT: notail call void @escape(ptr noundef nonnull [[X_ADDR]])
66-
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA11]]
66+
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA12]]
6767
// CHECK-NEXT: ret i32 [[TMP0]]
6868
//
6969
__attribute__((no_sanitize("thread"))) int test_no_sanitize_thread(int *x, int *y) {
@@ -74,13 +74,13 @@ __attribute__((no_sanitize("thread"))) int test_no_sanitize_thread(int *x, int *
7474

7575
// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none)
7676
// CHECK-LABEL: define dso_local i32 @test_no_sanitize_all
77-
// CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR3]] !pcsections !13 {
77+
// CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR3]] !pcsections [[META14]] {
7878
// CHECK-NEXT: entry:
7979
// CHECK-NEXT: [[X_ADDR:%.*]] = alloca ptr, align 8
8080
// CHECK-NEXT: store ptr [[X]], ptr [[X_ADDR]], align 8, !tbaa [[TBAA6]]
81-
// CHECK-NEXT: store atomic i32 1, ptr [[X]] monotonic, align 4, !pcsections !10
81+
// CHECK-NEXT: store atomic i32 1, ptr [[X]] monotonic, align 4, !pcsections [[META11]]
8282
// CHECK-NEXT: notail call void @escape(ptr noundef nonnull [[X_ADDR]])
83-
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA11]]
83+
// CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[Y]], align 4, !tbaa [[TBAA12]]
8484
// CHECK-NEXT: ret i32 [[TMP0]]
8585
//
8686
__attribute__((no_sanitize("all"))) int test_no_sanitize_all(int *x, int *y) {
@@ -89,23 +89,9 @@ __attribute__((no_sanitize("all"))) int test_no_sanitize_all(int *x, int *y) {
8989
return *y;
9090
}
9191
//.
92-
// CHECK: attributes #0 = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
93-
// CHECK: attributes #1 = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
94-
// CHECK: attributes #2 = { disable_sanitizer_instrumentation mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
95-
// CHECK: attributes #3 = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "no_sanitize_thread" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
96-
// CHECK: attributes #4 = { nounwind "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
97-
//.
98-
// CHECK: !2 = !{!"sanmd_covered2!C", !3}
99-
// CHECK: !3 = !{i64 0}
100-
// CHECK: !4 = !{!"sanmd_covered2!C", !5}
101-
// CHECK: !5 = !{i64 3}
102-
// CHECK: !6 = !{!7, !7, i64 0}
103-
// CHECK: !7 = !{!"any pointer", !8, i64 0}
104-
// CHECK: !8 = !{!"omnipotent char", !9, i64 0}
105-
// CHECK: !9 = !{!"Simple C/C++ TBAA"}
106-
// CHECK: !10 = !{!"sanmd_atomics2!C"}
107-
// CHECK: !11 = !{!12, !12, i64 0}
108-
// CHECK: !12 = !{!"int", !8, i64 0}
109-
// CHECK: !13 = !{!"sanmd_covered2!C", !14}
110-
// CHECK: !14 = !{i64 2}
92+
// CHECK: attributes #[[ATTR0]] = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
93+
// CHECK: attributes #[[ATTR1]] = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
94+
// CHECK: attributes #[[ATTR2]] = { disable_sanitizer_instrumentation mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
95+
// CHECK: attributes #[[ATTR3]] = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "no_sanitize_thread" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
96+
// CHECK: attributes #[[ATTR4:[0-9]+]] = { nounwind "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
11197
//.

0 commit comments

Comments
 (0)