Description
Please consider this function:
define i32 @f(i1 %false, i64 %zero) {
entry:
%a = alloca i32
store i32 1, ptr %a
br i1 %false, label %bb.1, label %bb.2
bb.1:
%x = load i64, ptr %a
store i64 %x, ptr %a
br label %bb.2
bb.2:
br i1 %false, label %bb.4, label %bb.3
bb.3:
%j = getelementptr i32, ptr %a, i64 %zero
store i32 3, ptr %j
br label %bb.4
bb.4:
%z = load i32, ptr %a
ret i32 %z
}
This is a valid function to call with two zero arguments, in which case %bb.1
is skipped, %bb.3
is entered, 3
is stored to %a
, and in %bb.4
, 3
is then returned.
If this function is called with any other argument, the behavior is undefined. If %false
is non-zero, %bb.1
is entered and %a
is accessed past its end. If %false
is zero but %zero
is non-zero, %bb.3
stores past %a
's end.
GVN transforms this (opt -passes=gvn
, LLVM commit 09c00b6) to:
define i32 @f(i1 %false, i64 %zero) {
entry:
%a = alloca i32, align 4
store i32 1, ptr %a, align 4
br i1 %false, label %bb.1, label %bb.2
bb.1:
%x = load i64, ptr %a, align 4
store i64 %x, ptr %a, align 4
%0 = trunc i64 %x to i32
br label %bb.2
bb.2:
%z = phi i32 [ %0, %bb.1 ], [ 1, %entry ]
br i1 %false, label %bb.4, label %bb.3
bb.3:
%j = getelementptr i32, ptr %a, i64 %zero
store i32 3, ptr %j, align 4
br label %bb.4
bb.4:
ret i32 %z
}
If called with two zero arguments, %bb.3
is still entered and still stores 3
to %a
, but the load in %bb.4
has been mis-optimized to its earlier value.
Alive agrees that this transformation is invalid: https://alive2.llvm.org/ce/z/EhdxXU
I have not fully understood yet what is happening, but at first glance, it appears that this bit in MemoryDependenceResults::getNonLocalPointerDepFromBB
is wrong:
// This query's Size is less than the cached one. Conservatively restart
// the query using the greater size.
return getNonLocalPointerDepFromBB(
QueryInst, Pointer, Loc.getWithNewSize(CacheInfo->Size), isLoad,
StartBB, Result, Visited, SkipFirstBlock, IsIncomplete);
This assumes that querying with a greater size can only find additional possible dependencies, but it ignores that a greater size may cause dependencies to be left out if that size is greater than the memory region.