[TBAA] Don't emit pointer-tbaa for void pointers. #122116

Merged
merged 8 commits into from
Jan 31, 2025
88 changes: 79 additions & 9 deletions clang/docs/UsersManual.rst
@@ -2489,6 +2489,82 @@ are listed below.

$ clang -fuse-ld=lld -Oz -Wl,--icf=safe -fcodegen-data-use code.cc

.. _strict_aliasing:

Strict Aliasing
---------------

The C and C++ standards require accesses to objects in memory to use l-values of
an appropriate type for the object. This is called *strict aliasing* or
*type-based alias analysis*. Strict aliasing enhances a variety of powerful
memory optimizations, including reordering, combining, and eliminating memory
accesses. These optimizations can lead to unexpected behavior in code that
violates the strict aliasing rules. For example:

.. code-block:: c++

  void advance(size_t *index, double *data) {
    double value = data[*index];
    /* Clang may assume that this store does not change the contents of `data`. */
    *index += 1;
    /* Clang may assume that this store does not change the contents of `index`. */
    data[*index] = value;
    /* Either of these facts may create significant optimization opportunities
       if Clang is able to inline this function. */
  }

Strict aliasing can be explicitly enabled with ``-fstrict-aliasing`` and
disabled with ``-fno-strict-aliasing``. ``clang-cl`` defaults to
``-fno-strict-aliasing``; see . Otherwise, Clang defaults to ``-fstrict-aliasing``.

C and C++ specify slightly different rules for strict aliasing. To improve
language interoperability, Clang allows two types to alias if either language
would permit it. This includes applying the C++ similar types rule to C,
allowing ``int **`` to alias ``int const * const *``. Clang also relaxes the
standard aliasing rules in the following ways:

* All integer types of the same size are permitted to alias each other,
  including signed and unsigned types.
* ``void*`` is permitted to alias any pointer type, ``void**`` is permitted to
  alias any pointer to pointer type, and so on.
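
As an illustration of the similar types rule mentioned above, the following sketch passes an ``int **`` to a function that reads through ``int const * const *``. The function names ``sum_first`` and ``demo_similar_types`` are hypothetical and exist only for this example; it is not taken from the Clang test suite.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: reads each row through a more-qualified view of the
// same pointer objects (int const * const *).
int sum_first(const int *const *rows, std::size_t n) {
  assert(rows != nullptr);
  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += *rows[i];
  return total;
}

// Hypothetical caller: the pointer objects are declared as plain int *.
int demo_similar_types() {
  int a = 1, b = 2;
  int *rows[] = {&a, &b};
  int **p = rows;
  // In C++, int ** converts implicitly to const int *const * because the
  // two types are similar; the loads in sum_first access the same int *
  // objects through the more-qualified type.
  return sum_first(p, 2);
}
```

Calling ``demo_similar_types()`` returns 3. The accesses differ only in qualifiers, which is the case the text describes Clang as accepting in C as well as C++.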

Code that violates strict aliasing has undefined behavior. A program that
works in one version of Clang may not work in another because of changes to the
optimizer. Clang provides a :doc:`TypeSanitizer` to help detect
violations of the strict aliasing rules, but it is still experimental.
Code that is known to violate strict aliasing should generally be built with
``-fno-strict-aliasing`` if the violation cannot be fixed.

Clang supports several ways to fix a violation of strict aliasing:

* L-values of the character types ``char`` and ``unsigned char`` (as well as
  other types, depending on the standard) are permitted to access objects of
  any type.

* Library functions such as ``memcpy`` and ``memset`` are specified as treating
  memory as characters and therefore are not limited by strict aliasing. If a
  value of one type must be reinterpreted as another (e.g. to read the bits of a
  floating-point number), use ``memcpy`` to copy the representation to an object
  of the destination type. This has no overhead over a direct l-value access
  because Clang should reliably optimize calls to these functions to use simple
  loads and stores when they are used with small constant sizes.

* The attribute ``may_alias`` can be added to a ``typedef`` to give l-values of
  that type the same aliasing power as the character types.
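
The three approaches listed above can be sketched as follows. This is a minimal illustration, not code from the patch: the names ``float_bits``, ``first_byte``, ``aliasing_u32`` and ``float_bits_may_alias`` are hypothetical, and the example assumes a 32-bit ``float``.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// memcpy: copy the object representation of a float into an integer of the
// same size instead of reading the float through an integer l-value.
std::uint32_t float_bits(float f) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Character-type access: an unsigned char l-value may read any object.
unsigned char first_byte(const void *object) {
  assert(object != nullptr);
  return *static_cast<const unsigned char *>(object);
}

// may_alias: a typedef carrying the GNU-style attribute, which Clang
// supports; l-values of this type may access objects of other types.
typedef std::uint32_t __attribute__((may_alias)) aliasing_u32;

std::uint32_t float_bits_may_alias(const float *f) {
  assert(f != nullptr);
  return *reinterpret_cast<const aliasing_u32 *>(f);
}
```

On a target where ``float`` is IEEE-754 binary32, ``float_bits(1.0f)`` yields ``0x3f800000``, and ``float_bits_may_alias`` returns the same value for the same input. The ``memcpy`` form is the most portable of the three because it does not depend on a compiler extension.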

Clang makes a best effort to avoid obvious miscompilations from strict aliasing
by only considering type information when it cannot prove that two accesses must
refer to the same memory. However, it is not recommended that programmers
intentionally rely on this instead of using one of the solutions above because
it is too easy for the compiler's analysis to be blocked in surprising ways.

Clang 20 strengthened its implementation of strict aliasing for accesses of
pointer type. Previously, all accesses of pointer type were permitted to alias
each other, but Clang now distinguishes different pointers by their pointee
type, except as limited by the relaxations around qualifiers and ``void*``
described above. The previous behavior of treating all pointers as aliasing can
be restored using ``-fno-pointer-tbaa``.
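
The ``void*`` relaxation matters for generic helpers that operate on pointer-sized slots. The sketch below shows one such idiom; ``swap_pointers`` and ``demo_swap`` are hypothetical names. Accessing ``int *`` objects through ``void **`` is not sanctioned by the language standards, so this example relies on the relaxation described above rather than on portable behavior.

```cpp
#include <cassert>

// Hypothetical generic helper: exchanges two pointer objects through void **.
void swap_pointers(void **a, void **b) {
  assert(a != nullptr && b != nullptr);
  void *tmp = *a;
  *a = *b;
  *b = tmp;
}

// Hypothetical caller: the objects being swapped are really int *.
int demo_swap() {
  int x = 1, y = 2;
  int *px = &x, *py = &y;
  // Assumption: the compiler lets void * accesses alias int * objects, as
  // the documentation above states for Clang.
  swap_pointers(reinterpret_cast<void **>(&px),
                reinterpret_cast<void **>(&py));
  return *px * 10 + *py; // px now points at y, py at x
}
```

With the pointers exchanged, ``demo_swap()`` returns 21. If the loads and stores in ``swap_pointers`` carried a distinct ``void`` pointer tag, an optimizer could assume they do not touch ``px`` and ``py``, which is the kind of breakage the relaxation is meant to avoid.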

Profile Guided Optimization
---------------------------

@@ -5272,12 +5348,6 @@ The Visual C++ Toolset has a slightly more elaborate mechanism for detection.
Restrictions and Limitations compared to Clang
----------------------------------------------

Strict Aliasing
^^^^^^^^^^^^^^^

Strict aliasing (TBAA) is always off by default in clang-cl. Whereas in clang,
strict aliasing is turned on by default for all optimization levels.

To enable LLVM optimizations based on strict aliasing rules (e.g., optimizations
based on type of expressions in C/C++), user will need to explicitly pass
`-fstrict-aliasing` to clang-cl.
Strict aliasing (TBAA) is always off by default in clang-cl whereas in clang,
strict aliasing is turned on by default for all optimization levels. For more
details, see :ref:`Strict aliasing <strict_aliasing>`.
8 changes: 8 additions & 0 deletions clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -226,6 +226,14 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
PtrDepth++;
Ty = Ty->getPointeeType()->getBaseElementTypeUnsafe();
} while (Ty->isPointerType());

// While there are no special rules in the standards regarding void pointers
// and strict aliasing, emitting distinct tags for void pointers breaks some
// common idioms and there is no good alternative to rewrite the code
// without strict-aliasing violations.
if (Ty->isVoidType())
return AnyPtr;

assert(!isa<VariableArrayType>(Ty));
// When the underlying type is a builtin type, we compute the pointee type
// string recursively, which is implicitly more forgiving than the standards
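
The effect of the added check can be modeled outside of Clang. The sketch below is a simplified stand-in, not the real ``CodeGenTBAA`` code: ``ToyType`` and ``toyPointerTagName`` are hypothetical, and the tag strings (``"any pointer"``, ``"p1 int"``) are assumed from the metadata names that appear in the tests in this change.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-in for clang's Type hierarchy.
struct ToyType {
  enum Kind { Void, Builtin, Pointer } kind;
  std::string name;       // e.g. "int"; unused for Pointer
  const ToyType *pointee; // non-null only for Pointer
};

// Returns the name of the TBAA tag this toy model would pick for a pointer
// type: strip pointer levels, then fall back to the generic pointer tag when
// the base type is void.
std::string toyPointerTagName(const ToyType *ty) {
  assert(ty != nullptr && ty->kind == ToyType::Pointer);
  unsigned depth = 0;
  do {
    ++depth;
    ty = ty->pointee;
  } while (ty->kind == ToyType::Pointer);

  // Mirrors the new early return: void pointers of any depth share one tag.
  if (ty->kind == ToyType::Void)
    return "any pointer";

  return "p" + std::to_string(depth) + " " + ty->name;
}
```

In this model, ``void *`` and ``void **`` both map to ``"any pointer"``, while ``int *`` still maps to ``"p1 int"``. That matches the test updates in this change, where the ``p1 void`` and ``p2 void`` nodes are replaced by the any-pointer node.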
13 changes: 3 additions & 10 deletions clang/test/CodeGen/tbaa-pointers.c
@@ -208,12 +208,9 @@ int void_ptrs(void **ptr) {
// COMMON-LABEL: define i32 @void_ptrs(
// COMMON-SAME: ptr noundef [[PTRA:%.+]])
// COMMON: [[PTR_ADDR:%.+]] = alloca ptr, align 8
// DISABLE-NEXT: store ptr [[PTRA]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
// DISABLE-NEXT: [[L0:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
// DISABLE-NEXT: [[L1:%.+]] = load ptr, ptr [[L0]], align 8, !tbaa [[ANYPTR]]
// DEFAULT-NEXT: store ptr [[PTRA]], ptr [[PTR_ADDR]], align 8, !tbaa [[P2VOID:!.+]]
// DEFAULT-NEXT: [[L0:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[P2VOID]]
// DEFAULT-NEXT: [[L1:%.+]] = load ptr, ptr [[L0]], align 8, !tbaa [[P1VOID:!.+]]
// COMMON-NEXT: store ptr [[PTRA]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
// COMMON-NEXT: [[L0:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
// COMMON-NEXT: [[L1:%.+]] = load ptr, ptr [[L0]], align 8, !tbaa [[ANYPTR]]
// COMMON-NEXT: [[BOOL:%.+]] = icmp ne ptr [[L1]], null
// COMMON-NEXT: [[BOOL_EXT:%.+]] = zext i1 [[BOOL]] to i64
// COMMON-NEXT: [[COND:%.+]] = select i1 [[BOOL]], i32 0, i32 1
@@ -254,7 +251,3 @@ int void_ptrs(void **ptr) {
// COMMON: [[INT_TAG]] = !{[[INT_TY:!.+]], [[INT_TY]], i64 0}
// COMMON: [[INT_TY]] = !{!"int", [[CHAR]], i64 0}
// DEFAULT: [[ANYPTR]] = !{[[ANY_POINTER]], [[ANY_POINTER]], i64 0}
// DEFAULT: [[P2VOID]] = !{[[P2VOID_TY:!.+]], [[P2VOID_TY]], i64 0}
// DEFAULT: [[P2VOID_TY]] = !{!"p2 void", [[ANY_POINTER]], i64 0}
// DEFAULT: [[P1VOID]] = !{[[P1VOID_TY:!.+]], [[P1VOID_TY]], i64 0}
// DEFAULT: [[P1VOID_TY]] = !{!"p1 void", [[ANY_POINTER]], i64 0}
25 changes: 12 additions & 13 deletions clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
@@ -651,7 +651,7 @@ kernel void test_target_features_kernel(global int *i) {
//
// GFX900: Function Attrs: convergent nounwind
// GFX900-LABEL: define {{[^@]+}}@__test_block_invoke_3_kernel
// GFX900-SAME: (<{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }> [[TMP0:%.*]], ptr addrspace(3) [[TMP1:%.*]]) #[[ATTR6]] !kernel_arg_addr_space [[META28:![0-9]+]] !kernel_arg_access_qual [[META29:![0-9]+]] !kernel_arg_type [[META30:![0-9]+]] !kernel_arg_base_type [[META30]] !kernel_arg_type_qual [[META31:![0-9]+]] {
// GFX900-SAME: (<{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }> [[TMP0:%.*]], ptr addrspace(3) [[TMP1:%.*]]) #[[ATTR6]] !kernel_arg_addr_space [[META27:![0-9]+]] !kernel_arg_access_qual [[META28:![0-9]+]] !kernel_arg_type [[META29:![0-9]+]] !kernel_arg_base_type [[META29]] !kernel_arg_type_qual [[META30:![0-9]+]] {
// GFX900-NEXT: entry:
// GFX900-NEXT: [[TMP2:%.*]] = alloca <{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }>, align 8, addrspace(5)
// GFX900-NEXT: store <{ i32, i32, ptr, ptr addrspace(1), ptr addrspace(1), i64, i8 }> [[TMP0]], ptr addrspace(5) [[TMP2]], align 8
@@ -688,7 +688,7 @@ kernel void test_target_features_kernel(global int *i) {
//
// GFX900: Function Attrs: convergent norecurse nounwind
// GFX900-LABEL: define {{[^@]+}}@test_target_features_kernel
// GFX900-SAME: (ptr addrspace(1) noundef align 4 [[I:%.*]]) #[[ATTR2]] !kernel_arg_addr_space [[META32:![0-9]+]] !kernel_arg_access_qual [[META23]] !kernel_arg_type [[META33:![0-9]+]] !kernel_arg_base_type [[META33]] !kernel_arg_type_qual [[META25]] {
// GFX900-SAME: (ptr addrspace(1) noundef align 4 [[I:%.*]]) #[[ATTR2]] !kernel_arg_addr_space [[META31:![0-9]+]] !kernel_arg_access_qual [[META23]] !kernel_arg_type [[META32:![0-9]+]] !kernel_arg_base_type [[META32]] !kernel_arg_type_qual [[META25]] {
// GFX900-NEXT: entry:
// GFX900-NEXT: [[I_ADDR:%.*]] = alloca ptr addrspace(1), align 8, addrspace(5)
// GFX900-NEXT: [[DEFAULT_QUEUE:%.*]] = alloca ptr addrspace(1), align 8, addrspace(5)
@@ -700,7 +700,7 @@ kernel void test_target_features_kernel(global int *i) {
// GFX900-NEXT: [[FLAGS_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[FLAGS]] to ptr
// GFX900-NEXT: [[NDRANGE_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[NDRANGE]] to ptr
// GFX900-NEXT: [[TMP_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[TMP]] to ptr
// GFX900-NEXT: store ptr addrspace(1) [[I]], ptr [[I_ADDR_ASCAST]], align 8, !tbaa [[TBAA34:![0-9]+]]
// GFX900-NEXT: store ptr addrspace(1) [[I]], ptr [[I_ADDR_ASCAST]], align 8, !tbaa [[TBAA33:![0-9]+]]
// GFX900-NEXT: call void @llvm.lifetime.start.p5(i64 8, ptr addrspace(5) [[DEFAULT_QUEUE]]) #[[ATTR8]]
// GFX900-NEXT: call void @llvm.lifetime.start.p5(i64 4, ptr addrspace(5) [[FLAGS]]) #[[ATTR8]]
// GFX900-NEXT: store i32 0, ptr [[FLAGS_ASCAST]], align 4, !tbaa [[TBAA17]]
@@ -803,16 +803,15 @@ kernel void test_target_features_kernel(global int *i) {
// GFX900: [[META23]] = !{!"none"}
// GFX900: [[META24]] = !{!"__block_literal"}
// GFX900: [[META25]] = !{!""}
// GFX900: [[TBAA26]] = !{[[META27:![0-9]+]], [[META27]], i64 0}
// GFX900: [[META27]] = !{!"p1 void", [[META9]], i64 0}
// GFX900: [[META28]] = !{i32 0, i32 3}
// GFX900: [[META29]] = !{!"none", !"none"}
// GFX900: [[META30]] = !{!"__block_literal", !"void*"}
// GFX900: [[META31]] = !{!"", !""}
// GFX900: [[META32]] = !{i32 1}
// GFX900: [[META33]] = !{!"int*"}
// GFX900: [[TBAA34]] = !{[[META35:![0-9]+]], [[META35]], i64 0}
// GFX900: [[META35]] = !{!"p1 int", [[META9]], i64 0}
// GFX900: [[TBAA26]] = !{[[META9]], [[META9]], i64 0}
// GFX900: [[META27]] = !{i32 0, i32 3}
// GFX900: [[META28]] = !{!"none", !"none"}
// GFX900: [[META29]] = !{!"__block_literal", !"void*"}
// GFX900: [[META30]] = !{!"", !""}
// GFX900: [[META31]] = !{i32 1}
// GFX900: [[META32]] = !{!"int*"}
// GFX900: [[TBAA33]] = !{[[META34:![0-9]+]], [[META34]], i64 0}
// GFX900: [[META34]] = !{!"p1 int", [[META9]], i64 0}
//.
//// NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
// CHECK: {{.*}}
12 changes: 3 additions & 9 deletions clang/unittests/CodeGen/TBAAMetadataTest.cpp
@@ -117,15 +117,9 @@ TEST(TBAAMetadataTest, BasicTypes) {
ASSERT_TRUE(I);

I = matchNext(I,
MInstruction(Instruction::Store,
MValType(PointerType::getUnqual(Compiler.Context)),
MMTuple(
MMTuple(
MMString("p1 void"),
AnyPtr,
MConstInt(0)),
MSameAs(0),
MConstInt(0))));
MInstruction(Instruction::Store,
MValType(PointerType::getUnqual(Compiler.Context)),
MMTuple(AnyPtr, MSameAs(0), MConstInt(0))));
ASSERT_TRUE(I);

I = matchNext(I,