Skip to content

[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop #96752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 15, 2025

Conversation

david-arm
Copy link
Contributor

@david-arm david-arm commented Jun 26, 2024

Currently when we encounter a negative step in the induction
variable isDereferenceableAndAlignedInLoop bails out because
the element size is signed greater than the step. This patch
adds support for negative steps in cases where we detect the
start address for the load is of the form base + offset. In
this case the address decrements in each iteration so we need
to calculate the access size differently. I have done this by
caling getStartAndEndForAccess from LoopAccessAnalysis.cpp.

The motivation for this patch comes from PR #88385 where a
reviewer requested reusing isDereferenceableAndAlignedInLoop,
but that PR itself does support reverse loops.

The changed test in LoopVectorize/X86/load-deref-pred.ll now
passes because previously we were calculating the total access
size incorrectly, whereas now it is 412 bytes and fits
perfectly into the alloca.

@llvmbot
Copy link
Member

llvmbot commented Jun 26, 2024

@llvm/pr-subscribers-backend-systemz
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: David Sherwood (david-arm)

Changes

Currently when we encounter a negative step in the induction
variable isDereferenceableAndAlignedInLoop bails out because
the element size is signed greater than the step. This patch
adds support for negative steps in cases where we detect the
start address for the load is of the form base + offset. In
this case the address decrements in each iteration so we need
to calculate the access size differently.

The motivation for this patch comes from PR #88385 where a
reviewer requested reusing isDereferenceableAndAlignedInLoop,
but that PR itself does support reverse loops.


Patch is 22.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96752.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/Loads.cpp (+23-7)
  • (modified) llvm/test/Transforms/LoopVectorize/load-deref-pred-align.ll (+241-14)
diff --git a/llvm/lib/Analysis/Loads.cpp b/llvm/lib/Analysis/Loads.cpp
index 2b8197066e8e9..18431da682891 100644
--- a/llvm/lib/Analysis/Loads.cpp
+++ b/llvm/lib/Analysis/Loads.cpp
@@ -293,7 +293,9 @@ bool llvm::isDereferenceableAndAlignedInLoop(LoadInst *LI, Loop *L,
 
   // TODO: Handle overlapping accesses.
   // We should be computing AccessSize as (TC - 1) * Step + EltSize.
-  if (EltSize.sgt(Step->getAPInt()))
+  bool StepIsNegative = Step->getAPInt().isNegative();
+  APInt AbsStep = Step->getAPInt().abs();
+  if (EltSize.ugt(AbsStep))
     return false;
 
   // Compute the total access size for access patterns with unit stride and
@@ -301,13 +303,14 @@ bool llvm::isDereferenceableAndAlignedInLoop(LoadInst *LI, Loop *L,
   // same.
   // For patterns with gaps (i.e. non unit stride), we are
   // accessing EltSize bytes at every Step.
-  APInt AccessSize = TC * Step->getAPInt();
+  APInt AccessSize = TC * AbsStep;
 
   assert(SE.isLoopInvariant(AddRec->getStart(), L) &&
          "implied by addrec definition");
   Value *Base = nullptr;
   if (auto *StartS = dyn_cast<SCEVUnknown>(AddRec->getStart())) {
-    Base = StartS->getValue();
+    if (!StepIsNegative)
+      Base = StartS->getValue();
   } else if (auto *StartS = dyn_cast<SCEVAddExpr>(AddRec->getStart())) {
     // Handle (NewBase + offset) as start value.
     const auto *Offset = dyn_cast<SCEVConstant>(StartS->getOperand(0));
@@ -318,11 +321,24 @@ bool llvm::isDereferenceableAndAlignedInLoop(LoadInst *LI, Loop *L,
       // TODO: generalize if a case found which warrants
       if (Offset->getAPInt().urem(Alignment.value()) != 0)
         return false;
+      if (StepIsNegative) {
+        // In the last iteration of the loop the address we access we will be
+        // lower than the first by (TC - 1) * Step. So we need to make sure
+        // that there is enough room in Offset to accomodate this.
+        APInt SubOffset = (TC - 1) * AbsStep;
+        if (Offset->getAPInt().ult(SubOffset))
+          return false;
+        // We can safely use the new base because the decrementing pointer is
+        // always guaranteed to be >= new base. The total access size needs to
+        // take into account the start offset and the loaded element size.
+        AccessSize = Offset->getAPInt() + EltSize;
+      } else {
+        bool Overflow = false;
+        AccessSize = AccessSize.uadd_ov(Offset->getAPInt(), Overflow);
+        if (Overflow)
+          return false;
+      }
       Base = NewBase->getValue();
-      bool Overflow = false;
-      AccessSize = AccessSize.uadd_ov(Offset->getAPInt(), Overflow);
-      if (Overflow)
-        return false;
     }
   }
 
diff --git a/llvm/test/Transforms/LoopVectorize/load-deref-pred-align.ll b/llvm/test/Transforms/LoopVectorize/load-deref-pred-align.ll
index 8d4be05a4390e..88802a48afa7a 100644
--- a/llvm/test/Transforms/LoopVectorize/load-deref-pred-align.ll
+++ b/llvm/test/Transforms/LoopVectorize/load-deref-pred-align.ll
@@ -17,7 +17,7 @@ define i16 @test_access_size_not_multiple_of_align(i64 %len, ptr %test_base) {
 ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
 ; CHECK:       vector.body:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_LOAD_CONTINUE2:%.*]] ]
-; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <2 x i16> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP16:%.*]], [[PRED_LOAD_CONTINUE2]] ]
+; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <2 x i16> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP15:%.*]], [[PRED_LOAD_CONTINUE2]] ]
 ; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, ptr [[TEST_BASE:%.*]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i32 0
@@ -43,16 +43,16 @@ define i16 @test_access_size_not_multiple_of_align(i64 %len, ptr %test_base) {
 ; CHECK:       pred.load.continue2:
 ; CHECK-NEXT:    [[TMP14:%.*]] = phi <2 x i16> [ [[TMP8]], [[PRED_LOAD_CONTINUE]] ], [ [[TMP13]], [[PRED_LOAD_IF1]] ]
 ; CHECK-NEXT:    [[PREDPHI:%.*]] = select <2 x i1> [[TMP3]], <2 x i16> [[TMP14]], <2 x i16> zeroinitializer
-; CHECK-NEXT:    [[TMP16]] = add <2 x i16> [[VEC_PHI]], [[PREDPHI]]
+; CHECK-NEXT:    [[TMP15]] = add <2 x i16> [[VEC_PHI]], [[PREDPHI]]
 ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
-; CHECK-NEXT:    [[TMP17:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096
-; CHECK-NEXT:    br i1 [[TMP17]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK-NEXT:    [[TMP16:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096
+; CHECK-NEXT:    br i1 [[TMP16]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
 ; CHECK:       middle.block:
-; CHECK-NEXT:    [[TMP18:%.*]] = call i16 @llvm.vector.reduce.add.v2i16(<2 x i16> [[TMP16]])
+; CHECK-NEXT:    [[TMP17:%.*]] = call i16 @llvm.vector.reduce.add.v2i16(<2 x i16> [[TMP15]])
 ; CHECK-NEXT:    br i1 true, label [[LOOP_EXIT:%.*]], label [[SCALAR_PH]]
 ; CHECK:       scalar.ph:
 ; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i64 [ 4096, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
-; CHECK-NEXT:    [[BC_MERGE_RDX:%.*]] = phi i16 [ 0, [[ENTRY]] ], [ [[TMP18]], [[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    [[BC_MERGE_RDX:%.*]] = phi i16 [ 0, [[ENTRY]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LATCH:%.*]] ]
@@ -72,7 +72,7 @@ define i16 @test_access_size_not_multiple_of_align(i64 %len, ptr %test_base) {
 ; CHECK-NEXT:    [[EXIT:%.*]] = icmp eq i64 [[IV]], 4095
 ; CHECK-NEXT:    br i1 [[EXIT]], label [[LOOP_EXIT]], label [[LOOP]], !llvm.loop [[LOOP3:![0-9]+]]
 ; CHECK:       loop_exit:
-; CHECK-NEXT:    [[ACCUM_NEXT_LCSSA:%.*]] = phi i16 [ [[ACCUM_NEXT]], [[LATCH]] ], [ [[TMP18]], [[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    [[ACCUM_NEXT_LCSSA:%.*]] = phi i16 [ [[ACCUM_NEXT]], [[LATCH]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
 ; CHECK-NEXT:    ret i16 [[ACCUM_NEXT_LCSSA]]
 ;
 entry:
@@ -114,7 +114,7 @@ define i32 @test_access_size_multiple_of_align_but_offset_by_1(i64 %len, ptr %te
 ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
 ; CHECK:       vector.body:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_LOAD_CONTINUE2:%.*]] ]
-; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <2 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP16:%.*]], [[PRED_LOAD_CONTINUE2]] ]
+; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <2 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP15:%.*]], [[PRED_LOAD_CONTINUE2]] ]
 ; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, ptr [[TEST_BASE:%.*]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i32 0
@@ -140,16 +140,16 @@ define i32 @test_access_size_multiple_of_align_but_offset_by_1(i64 %len, ptr %te
 ; CHECK:       pred.load.continue2:
 ; CHECK-NEXT:    [[TMP14:%.*]] = phi <2 x i32> [ [[TMP8]], [[PRED_LOAD_CONTINUE]] ], [ [[TMP13]], [[PRED_LOAD_IF1]] ]
 ; CHECK-NEXT:    [[PREDPHI:%.*]] = select <2 x i1> [[TMP3]], <2 x i32> [[TMP14]], <2 x i32> zeroinitializer
-; CHECK-NEXT:    [[TMP16]] = add <2 x i32> [[VEC_PHI]], [[PREDPHI]]
+; CHECK-NEXT:    [[TMP15]] = add <2 x i32> [[VEC_PHI]], [[PREDPHI]]
 ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
-; CHECK-NEXT:    [[TMP17:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096
-; CHECK-NEXT:    br i1 [[TMP17]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
+; CHECK-NEXT:    [[TMP16:%.*]] = icmp eq i64 [[INDEX_NEXT]], 4096
+; CHECK-NEXT:    br i1 [[TMP16]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
 ; CHECK:       middle.block:
-; CHECK-NEXT:    [[TMP18:%.*]] = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> [[TMP16]])
+; CHECK-NEXT:    [[TMP17:%.*]] = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> [[TMP15]])
 ; CHECK-NEXT:    br i1 true, label [[LOOP_EXIT:%.*]], label [[SCALAR_PH]]
 ; CHECK:       scalar.ph:
 ; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i64 [ 4096, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
-; CHECK-NEXT:    [[BC_MERGE_RDX:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[TMP18]], [[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    [[BC_MERGE_RDX:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LATCH:%.*]] ]
@@ -169,7 +169,7 @@ define i32 @test_access_size_multiple_of_align_but_offset_by_1(i64 %len, ptr %te
 ; CHECK-NEXT:    [[EXIT:%.*]] = icmp eq i64 [[IV]], 4095
 ; CHECK-NEXT:    br i1 [[EXIT]], label [[LOOP_EXIT]], label [[LOOP]], !llvm.loop [[LOOP5:![0-9]+]]
 ; CHECK:       loop_exit:
-; CHECK-NEXT:    [[ACCUM_NEXT_LCSSA:%.*]] = phi i32 [ [[ACCUM_NEXT]], [[LATCH]] ], [ [[TMP18]], [[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    [[ACCUM_NEXT_LCSSA:%.*]] = phi i32 [ [[ACCUM_NEXT]], [[LATCH]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
 ; CHECK-NEXT:    ret i32 [[ACCUM_NEXT_LCSSA]]
 ;
 entry:
@@ -198,3 +198,230 @@ latch:
 loop_exit:
   ret i32 %accum.next
 }
+
+; Test reverse loops where we can prove loads in predicated blocks are safe
+; to load unconditionally.
+define void @test_rev_loops_deref_loads(ptr nocapture noundef writeonly %dest) {
+; CHECK-LABEL: @test_rev_loops_deref_loads(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[LOCAL_DEST:%.*]] = alloca [1024 x i32], align 4
+; CHECK-NEXT:    [[LOCAL_SRC:%.*]] = alloca [1024 x i32], align 4
+; CHECK-NEXT:    [[LOCAL_CMP:%.*]] = alloca [1024 x i32], align 4
+; CHECK-NEXT:    call void @init(ptr [[LOCAL_SRC]])
+; CHECK-NEXT:    call void @init(ptr [[LOCAL_CMP]])
+; CHECK-NEXT:    br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
+; CHECK:       vector.ph:
+; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
+; CHECK:       vector.body:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE4:%.*]] ]
+; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = sub i64 1023, [[INDEX]]
+; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[OFFSET_IDX]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_CMP]], i64 0, i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i32 0
+; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[TMP2]], i32 -1
+; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP3]], align 4
+; CHECK-NEXT:    [[REVERSE:%.*]] = shufflevector <2 x i32> [[WIDE_LOAD]], <2 x i32> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp eq <2 x i32> [[REVERSE]], <i32 3, i32 3>
+; CHECK-NEXT:    [[TMP5:%.*]] = xor <2 x i1> [[TMP4]], <i1 true, i1 true>
+; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr [1024 x i32], ptr [[LOCAL_SRC]], i64 0, i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr i32, ptr [[TMP6]], i32 0
+; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr i32, ptr [[TMP7]], i32 -1
+; CHECK-NEXT:    [[WIDE_LOAD1:%.*]] = load <2 x i32>, ptr [[TMP8]], align 4
+; CHECK-NEXT:    [[REVERSE2:%.*]] = shufflevector <2 x i32> [[WIDE_LOAD1]], <2 x i32> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    [[TMP9:%.*]] = extractelement <2 x i1> [[TMP5]], i32 0
+; CHECK-NEXT:    br i1 [[TMP9]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]
+; CHECK:       pred.store.if:
+; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_DEST]], i64 0, i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP11:%.*]] = extractelement <2 x i32> [[REVERSE2]], i32 0
+; CHECK-NEXT:    [[TMP12:%.*]] = shl nsw i32 [[TMP11]], 2
+; CHECK-NEXT:    store i32 [[TMP12]], ptr [[TMP10]], align 4
+; CHECK-NEXT:    br label [[PRED_STORE_CONTINUE]]
+; CHECK:       pred.store.continue:
+; CHECK-NEXT:    [[TMP13:%.*]] = extractelement <2 x i1> [[TMP5]], i32 1
+; CHECK-NEXT:    br i1 [[TMP13]], label [[PRED_STORE_IF3:%.*]], label [[PRED_STORE_CONTINUE4]]
+; CHECK:       pred.store.if3:
+; CHECK-NEXT:    [[TMP14:%.*]] = add i64 [[OFFSET_IDX]], -1
+; CHECK-NEXT:    [[TMP15:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_DEST]], i64 0, i64 [[TMP14]]
+; CHECK-NEXT:    [[TMP16:%.*]] = extractelement <2 x i32> [[REVERSE2]], i32 1
+; CHECK-NEXT:    [[TMP17:%.*]] = shl nsw i32 [[TMP16]], 2
+; CHECK-NEXT:    store i32 [[TMP17]], ptr [[TMP15]], align 4
+; CHECK-NEXT:    br label [[PRED_STORE_CONTINUE4]]
+; CHECK:       pred.store.continue4:
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
+; CHECK-NEXT:    [[TMP18:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1024
+; CHECK-NEXT:    br i1 [[TMP18]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
+; CHECK:       middle.block:
+; CHECK-NEXT:    br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
+; CHECK:       scalar.ph:
+; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 1023, [[ENTRY:%.*]] ]
+; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
+; CHECK:       for.body:
+; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
+; CHECK-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_CMP]], i64 0, i64 [[INDVARS_IV]]
+; CHECK-NEXT:    [[TMP19:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
+; CHECK-NEXT:    [[CMP3_NOT:%.*]] = icmp eq i32 [[TMP19]], 3
+; CHECK-NEXT:    br i1 [[CMP3_NOT]], label [[FOR_INC]], label [[IF_THEN:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    [[ARRAYIDX5:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_SRC]], i64 0, i64 [[INDVARS_IV]]
+; CHECK-NEXT:    [[TMP20:%.*]] = load i32, ptr [[ARRAYIDX5]], align 4
+; CHECK-NEXT:    [[MUL:%.*]] = shl nsw i32 [[TMP20]], 2
+; CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_DEST]], i64 0, i64 [[INDVARS_IV]]
+; CHECK-NEXT:    store i32 [[MUL]], ptr [[ARRAYIDX7]], align 4
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       for.inc:
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1
+; CHECK-NEXT:    [[CMP2_NOT:%.*]] = icmp eq i64 [[INDVARS_IV]], 0
+; CHECK-NEXT:    br i1 [[CMP2_NOT]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
+; CHECK:       for.cond.cleanup:
+; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(1024) [[DEST:%.*]], ptr noundef nonnull align 4 dereferenceable(1024) [[LOCAL_DEST]], i64 1024, i1 false)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %local_dest = alloca [1024 x i32], align 4
+  %local_src = alloca [1024 x i32], align 4
+  %local_cmp = alloca [1024 x i32], align 4
+  call void @init(ptr %local_src)
+  call void @init(ptr %local_cmp)
+  br label %for.body
+
+for.body:
+  %indvars.iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %for.inc ]
+  %arrayidx = getelementptr inbounds [1024 x i32], ptr %local_cmp, i64 0, i64 %indvars.iv
+  %0 = load i32, ptr %arrayidx, align 4
+  %cmp3.not = icmp eq i32 %0, 3
+  br i1 %cmp3.not, label %for.inc, label %if.then
+
+if.then:
+  %arrayidx5 = getelementptr inbounds [1024 x i32], ptr %local_src, i64 0, i64 %indvars.iv
+  %1 = load i32, ptr %arrayidx5, align 4
+  %mul = shl nsw i32 %1, 2
+  %arrayidx7 = getelementptr inbounds [1024 x i32], ptr %local_dest, i64 0, i64 %indvars.iv
+  store i32 %mul, ptr %arrayidx7, align 4
+  br label %for.inc
+
+for.inc:
+  %indvars.iv.next = add nsw i64 %indvars.iv, -1
+  %cmp2.not = icmp eq i64 %indvars.iv, 0
+  br i1 %cmp2.not, label %for.cond.cleanup, label %for.body
+
+for.cond.cleanup:
+  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(1024) %dest, ptr noundef nonnull align 4 dereferenceable(1024) %local_dest, i64 1024, i1 false)
+  ret void
+}
+
+
+; Test reverse loops where we *cannot* prove loads in predicated blocks are safe
+; to load unconditionally.
+define void @test_rev_loops_non_deref_loads(ptr nocapture noundef writeonly %dest) {
+; CHECK-LABEL: @test_rev_loops_non_deref_loads(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[LOCAL_DEST:%.*]] = alloca [1024 x i32], align 4
+; CHECK-NEXT:    [[LOCAL_SRC:%.*]] = alloca [1024 x i32], align 4
+; CHECK-NEXT:    [[LOCAL_CMP:%.*]] = alloca [1024 x i32], align 4
+; CHECK-NEXT:    call void @init(ptr [[LOCAL_SRC]])
+; CHECK-NEXT:    call void @init(ptr [[LOCAL_CMP]])
+; CHECK-NEXT:    br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
+; CHECK:       vector.ph:
+; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
+; CHECK:       vector.body:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[PRED_STORE_CONTINUE2:%.*]] ]
+; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <2 x i64> [ <i64 1023, i64 1022>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[PRED_STORE_CONTINUE2]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = add <2 x i64> [[VEC_IND]], <i64 -1, i64 -1>
+; CHECK-NEXT:    [[TMP1:%.*]] = extractelement <2 x i64> [[TMP0]], i32 0
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_CMP]], i64 0, i64 [[TMP1]]
+; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[TMP2]], i32 0
+; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i32, ptr [[TMP3]], i32 -1
+; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP4]], align 4
+; CHECK-NEXT:    [[REVERSE:%.*]] = shufflevector <2 x i32> [[WIDE_LOAD]], <2 x i32> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq <2 x i32> [[REVERSE]], <i32 3, i32 3>
+; CHECK-NEXT:    [[TMP6:%.*]] = xor <2 x i1> [[TMP5]], <i1 true, i1 true>
+; CHECK-NEXT:    [[TMP7:%.*]] = extractelement <2 x i1> [[TMP6]], i32 0
+; CHECK-NEXT:    br i1 [[TMP7]], label [[PRED_STORE_IF:%.*]], label [[PRED_STORE_CONTINUE:%.*]]
+; CHECK:       pred.store.if:
+; CHECK-NEXT:    [[TMP8:%.*]] = extractelement <2 x i64> [[TMP0]], i32 0
+; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_SRC]], i64 0, i64 [[TMP8]]
+; CHECK-NEXT:    [[TMP10:%.*]] = load i32, ptr [[TMP9]], align 4
+; CHECK-NEXT:    [[TMP11:%.*]] = extractelement <2 x i64> [[TMP0]], i32 0
+; CHECK-NEXT:    [[TMP12:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_DEST]], i64 0, i64 [[TMP11]]
+; CHECK-NEXT:    [[TMP13:%.*]] = shl nsw i32 [[TMP10]], 2
+; CHECK-NEXT:    store i32 [[TMP13]], ptr [[TMP12]], align 4
+; CHECK-NEXT:    br label [[PRED_STORE_CONTINUE]]
+; CHECK:       pred.store.continue:
+; CHECK-NEXT:    [[TMP14:%.*]] = phi i32 [ poison, [[VECTOR_BODY]] ], [ [[TMP10]], [[PRED_STORE_IF]] ]
+; CHECK-NEXT:    [[TMP15:%.*]] = extractelement <2 x i1> [[TMP6]], i32 1
+; CHECK-NEXT:    br i1 [[TMP15]], label [[PRED_STORE_IF1:%.*]], label [[PRED_STORE_CONTINUE2]]
+; CHECK:       pred.store.if1:
+; CHECK-NEXT:    [[TMP16:%.*]] = extractelement <2 x i64> [[TMP0]], i32 1
+; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_SRC]], i64 0, i64 [[TMP16]]
+; CHECK-NEXT:    [[TMP18:%.*]] = load i32, ptr [[TMP17]], align 4
+; CHECK-NEXT:    [[TMP19:%.*]] = extractelement <2 x i64> [[TMP0]], i32 1
+; CHECK-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_DEST]], i64 0, i64 [[TMP19]]
+; CHECK-NEXT:    [[TMP21:%.*]] = shl nsw i32 [[TMP18]], 2
+; CHECK-NEXT:    store i32 [[TMP21]], ptr [[TMP20]], align 4
+; CHECK-NEXT:    br label [[PRED_STORE_CONTINUE2]]
+; CHECK:       pred.store.continue2:
+; CHECK-NEXT:    [[TMP22:%.*]] = phi i32 [ poison, [[PRED_STORE_CONTINUE]] ], [ [[TMP18]], [[PRED_STORE_IF1]] ]
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
+; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <2 x i64> [[VEC_IND]], <i64 -2, i64 -2>
+; CHECK-NEXT:    [[TMP23:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1024
+; CHECK-NEXT:    br i1 [[TMP23]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP8:![0-9]+]]
+; CHECK:       middle.block:
+; CHECK-NEXT:    br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
+; CHECK:       scalar.ph:
+; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i64 [ -1, [[MIDDLE_BLOCK]] ], [ 1023, [[ENTRY:%.*]] ]
+; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
+; CHECK:       for.body:
+; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_INC:%.*]] ]
+; CHECK-NEXT:    [[OFF:%.*]] = add i64 [[INDVARS_IV]], -1
+; CHECK-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds [1024 x i32], ptr [[LOCAL_CMP]], i64 0, i64 [[OFF]]
+; CHECK-NEXT:    [[TMP24:%.*...
[truncated]

// We can safely use the new base because the decrementing pointer is
// always guaranteed to be >= new base. The total access size needs to
// take into account the start offset and the loaded element size.
AccessSize = Offset->getAPInt() + EltSize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need an overflow check and bail out in case we have a narrow IV where incrementing by EltSize wraps? Alternatively we could probably also perform the computation in always in something like i64?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I've added the overflow check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update! Do you think it would be possible to add a test case that covers the overflow case (e.g. with an i8 induction that wraps when adding the element size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried doing that with an i8, but the offset return by SCEV is always 64-bit. So in practice I'd need a test with an alloca just less than UINT64_MAX bytes in size and an offset just less than UINT64_MAX, in order to exercise the overflow case. I wasn't sure if this was fragile, i.e. are allocas of such a large size even legal given that I don't think GEPs even support an offset greater than SINT64_MAX (since offsets are signed)? Having said that, I'm open to ideas about how to write a sensible test for this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like the below work? I might be missing something, but it looks like the overflow check should trigger there?


target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"

declare void @init(ptr nocapture nofree)


define void @test(ptr nocapture noundef writeonly %dest) {

entry:
  %local_dest = alloca [128 x i8], align 4
  %local_src = alloca [128 x i8], align 4
  %local_cmp = alloca [128 x i8], align 4
  call void @init(ptr %local_src)
  call void @init(ptr %local_cmp)
  br label %for.body

for.body:
  %iv = phi i8 [ 255, %entry ], [ %iv.next, %for.inc ]
  %arrayidx = getelementptr inbounds [1024 x i8], ptr %local_cmp, i8 0, i8 %iv
  %0 = load i8, ptr %arrayidx, align 1
  %cmp3.not = icmp eq i8 %0, 3
  br i1 %cmp3.not, label %for.inc, label %if.then

if.then:
  %arrayidx5 = getelementptr inbounds [1024 x i8], ptr %local_src, i8 0, i8 %iv
  %1 = load i8, ptr %arrayidx5, align 1
  %mul = shl nsw i8 %1, 2
  %arrayidx7 = getelementptr inbounds [1024 x i8], ptr %local_dest, i8 0, i8 %iv
  store i8 %mul, ptr %arrayidx7, align 1
  br label %for.inc

for.inc:
  %iv.next = add nsw i8 %iv, -1
  %cmp2.not = icmp eq i8 %iv, 200
  br i1 %cmp2.not, label %for.cond.cleanup, label %for.body

for.cond.cleanup:
  call void @llvm.memcpy.p0.p0.i8(ptr noundef nonnull align 4 dereferenceable(1024) %dest, ptr noundef nonnull align 4 dereferenceable(1024) %local_dest, i8 1024, i1 false)
  ret void
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps bail out entirely if the offset is negative?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might makes sense for now to bail out for negative values - it doesn't cause any existing tests to fail. It would also make it almost impossible to write an overflow test because you'd need something like an offset of 0x7FFFFFFFFFFFFFFF and an element size of 0x7FFFFFFFFFFFFFFF

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to restrict this to non-negative values, thanks! It would probably make sense to split off the patch with the extra bail-out separately. Also adding @preames who added some of that code originally I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I originally thought about splitting off the change in a separate patch, but I couldn't see any way of testing it by itself because the overflow checks catch this by accident. If you want I can split it off into a separate PR - there just won't be any tests for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #99490 for the negative offset check.

@david-arm
Copy link
Contributor Author

  • Rebase + fix tests after rebase.

@fhahn fhahn requested a review from preames July 16, 2024 13:32
@preames
Copy link
Collaborator

preames commented Jul 16, 2024

An alternate approach here would be to compute the AccessSize as a SCEV (which should just be umax(End,Start)-umin(Start,End) + ElementSize, and then check to see if it's a SCEVConstant. End should be evaluateAtIteration on the AddRec. The base would then be umin(End, Start) as a SCEV, and then see if it's an understandable one.

In general, doing the math manually seems like a recipe for error here.

@david-arm
Copy link
Contributor Author

An alternate approach here would be to compute the AccessSize as a SCEV (which should just be umax(End,Start)-umin(Start,End) + ElementSize, and then check to see if it's a SCEVConstant. End should be evaluateAtIteration on the AddRec. The base would then be umin(End, Start) as a SCEV, and then see if it's an understandable one.

In general, doing the math manually seems like a recipe for error here.

Thanks for this suggestion @preames! It was a bit fiddly to get it working without using SCEVExpander, but I do think it's now less prone to error. As part of this patch I've also added some tests for loops with strided accesses for positive and negative steps too as we didn't seem to be testing that.

@david-arm
Copy link
Contributor Author

  • I've rebased the patch after landing the negative offset test. I also had to fix up a new test added recently.

@david-arm
Copy link
Contributor Author

@preames @fhahn After rebasing my early exit auto-vectorisation patch on top of this one, I realised some of the tests no longer vectorise because isDereferenceAndAlignedInLoop now returns false for some cases. In particular, if a function argument has the attribute dereferenceable(100) then the SCEV min/max code fails to return a SCEVConstant. This is because it is unable to determine that a calculation such as (ptr + offset) will not wrap. I don't think there is a fundamental issue blocking this - it's just that the function ScalarEvolution::getRangeRef knows about allocas, but doesn't know about the dereferenceable attribute.

So the question is - are we happy with accepting these minor regressions when using SCEV for now? Or is it worth creating a new patch to teach ScalarEvolution::getRangeRef new tricks first?

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@preames @fhahn After rebasing my early exit auto-vectorisation patch on top of this one, I realised some of the tests no longer vectorise because isDereferenceAndAlignedInLoop now returns false for some cases. In particular, if a function argument has the attribute dereferenceable(100) then the SCEV min/max code fails to return a SCEVConstant. This is because it is unable to determine that a calculation such as (ptr + offset) will not wrap. I don't think there is a fundamental issue blocking this - it's just that the function ScalarEvolution::getRangeRef knows about allocas, but doesn't know about the dereferenceable attribute.

So the question is - are we happy with accepting these minor regressions when using SCEV for now? Or is it worth creating a new patch to teach ScalarEvolution::getRangeRef new tricks first?

That's interesting! Is it possible that the current tests don't catch this issue because load-deref-pred-align.ll doesn't have a variant with dereferenceable()?

const auto *Offset = dyn_cast<SCEVConstant>(StartS->getOperand(0));
const auto *NewBase = dyn_cast<SCEVUnknown>(StartS->getOperand(1));
if (StartS->getNumOperands() == 2 && Offset && NewBase) {
const SCEV *End = AddRec->evaluateAtIteration(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's possible, but it's not a drop-in replacement and is quite inefficient in the current context. For example, getStartAndEndForAccess recalculates the trip count and element size which we've already calculated, although I could rip out a lot of the code in isDereferenceableAndAlignedInLoop so that I avoid doing this. The function also requires passing in a PointerBounds DenseMap, which we don't care about. I could probably refactor getStartAndEndForAccess by adding a new variant that doesn't require a map. Ultimately I will still need to extract an actual base Value* pointer from a SCEV, but perhaps this would reduce the overall complexity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've done some refactoring and taught isDereferenceableAndAlignedInLoop to call getStartAndEndForAccess. This at least reduces some of the complexity and is actually an improvement on the existing code.

return false;

// Now calculate the total access size, which is (max - min) + element_size.
const SCEV *Diff = SE.getMinusSCEV(Max, Min);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the separate handling for SCEVUnknown and SCEVAddExpr still needed when expanding the bounds here? I'd expect that computing start/end of the add rec would work for both cases without the need for a distinction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think so. That's because the Min value is a SCEV* whereas we need a Value* and you can only do that by analysing the SCEV type and handling it on a case-by-case basis. We cannot use the SCEVExpander here because that would create new IR. This is the downside to using SCEV here.

if (!ASC)
return false;

if (const SCEVUnknown *NewBase = dyn_cast<SCEVUnknown>(Min)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retrieving the new base and access size here seems to unfortunately add a bit of complexity.

What could make this much simpler would be if we would get the base pointer (underlying object) and the # of known dereferenceable bits. Then construct SCEVs for the start and end of the object and check if the Start/End of the accessed pointer is >= and <= the bounds of the object.

https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/Loads.cpp#L97 would highlight a way to get the number of dereferenceable bytes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do certainly, but then we're effectively doing some of the work that's already done in isDereferenceableAndAlignedPointer. It's almost like we want to call isDereferenceableAndAlignedPointer on both the min and max addressed pointers in the loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I still think we end up with a similar problem in that we need to discover the base Value* in order to call getPointerDereferenceableBytes. I understand what you're suggesting, but not sure it will reduce the complexity in practice. I'll have a think about it some more.

@david-arm
Copy link
Contributor Author

That's interesting! Is it possible that the current tests don't catch this issue because load-deref-pred-align.ll doesn't have a variant with dereferenceable()?

Yeah that's right, whereas my early exit autovec tests have a test for dereferenceable pointers. It turns out there is a relatively simple fix by teaching ScalarEvolution::getRangeRef to use Value::getPointerDereferenceableBytes. If I get time before I go away on holiday I'll get a patch up for this!

david-arm added a commit to david-arm/llvm-project that referenced this pull request Aug 19, 2024
…ble objects

Whilst dealing with review comments on

llvm#96752

I discovered that SCEV does not know about the dereferenceable
attribute on function arguments so I have updated getRangeRef
to make use of it by calling getPointerDereferenceableBytes.
@david-arm
Copy link
Contributor Author

Rebased.

@@ -3,7 +3,7 @@

target datalayout = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"

@src = external global [8 x i32], align 4
declare void @init()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was for an earlier version of the patch, but it's no longer required. I've reverted it now!


const SCEV *ScStart;
const SCEV *ScEnd;

if (SE->isLoopInvariant(PtrExpr, Lp)) {
ScStart = ScEnd = PtrExpr;
} else if (auto *AR = dyn_cast<SCEVAddRecExpr>(PtrExpr)) {
const SCEV *Ex = PSE.getSymbolicMaxBackedgeTakenCount();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as PSE caches the max symbolic BTC (I think) it's probably better to avoid the extra parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra parameter is required in order to reuse this function in Loads.cpp, because isDereferenceableAndAlignedInLoop only has access to ScalarEvolution. In the places where getStartAndEndForAccess is called below we still call PSE.getSymbolicMaxBackedgeTakenCount() anyway so it will be cached.

The alternative would be to have two versions of getStartAndEndForAccess - one with a ScalarEvolution parameter and the other with a PredicatedScalarEvolution - but that's not great for code reuse.

@@ -839,6 +839,12 @@ bool sortPtrAccesses(ArrayRef<Value *> VL, Type *ElemTy, const DataLayout &DL,
bool isConsecutiveAccess(Value *A, Value *B, const DataLayout &DL,
ScalarEvolution &SE, bool CheckType = true);

std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the comment from llvm/lib/Analysis/LoopAccessAnalysis.cpp, as it is more precise?

I think from the current description, it sounds like the returned value is the last address accessed in the loop, but unless I am missing something it is actually (address of last access)+access size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

static std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
const Loop *Lp, const SCEV *PtrExpr, Type *AccessTy,
PredicatedScalarEvolution &PSE,
std::pair<const SCEV *, const SCEV *> llvm::getStartAndEndForAccess(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to add an overload without the extra EltSize argument for users here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

br label %for.body

for.body:
%indvars.iv = phi i64 [ 511, %entry ], [ %indvars.iv.next, %for.inc ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
%indvars.iv = phi i64 [ 511, %entry ], [ %indvars.iv.next, %for.inc ]
%iv = phi i64 [ 511, %entry ], [ %indvars.iv.next, %for.inc ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

br i1 %cmp2.not, label %for.cond.cleanup, label %for.body

for.cond.cleanup:
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(1024) %dest, ptr noundef nonnull align 4 dereferenceable(1024) %local_dest, i64 1024, i1 false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(1024) %dest, ptr noundef nonnull align 4 dereferenceable(1024) %local_dest, i64 1024, i1 false)
call void @llvm.memcpy.p0.p0.i64(ptr %dest, ptr %local_dest, i64 1024, i1 false)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

br label %for.body

for.body:
%indvars.iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %for.inc ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
%indvars.iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %for.inc ]
%iv = phi i64 [ 1023, %entry ], [ %indvars.iv.next, %for.inc ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

br label %for.inc

for.inc:
%indvars.iv.next = add nsw i64 %indvars.iv, -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
%indvars.iv.next = add nsw i64 %indvars.iv, -1
%iv.next = add nsw i64 %indvars.iv, -1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 305 to 308
if (isa<SCEVConstant>(PtrDiff))
MaxPtrDiff = cast<SCEVConstant>(PtrDiff)->getAPInt();
else
MaxPtrDiff = SE.getUnsignedRangeMax(PtrDiff);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sufficient to ?

Suggested change
if (isa<SCEVConstant>(PtrDiff))
MaxPtrDiff = cast<SCEVConstant>(PtrDiff)->getAPInt();
else
MaxPtrDiff = SE.getUnsignedRangeMax(PtrDiff);
MaxPtrDiff = SE.getUnsignedRangeMax(PtrDiff);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return false;

const SCEV *EltSizeSCEV;
std::pair<const SCEV *, const SCEV *> Range = getStartAndEndForAccess(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::pair<const SCEV *, const SCEV *> Range = getStartAndEndForAccess(
const auto &[AccessStart, AccessEnd] = getStartAndEndForAccess(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

david-arm added a commit that referenced this pull request Aug 22, 2024
…ble objects (#104778)

Whilst dealing with review comments on

#96752

I discovered that SCEV does not know about the dereferenceable attribute
on function arguments so I have updated getRangeRef to make use of it
by calling getPointerDereferenceableBytes.
cjdb pushed a commit to cjdb/llvm-project that referenced this pull request Aug 23, 2024
…ble objects (llvm#104778)

Whilst dealing with review comments on

llvm#96752

I discovered that SCEV does not know about the dereferenceable attribute
on function arguments so I have updated getRangeRef to make use of it
by calling getPointerDereferenceableBytes.
@david-arm
Copy link
Contributor Author

Rebase and some minor refactoring to hopefully make it look a bit neater.

@david-arm
Copy link
Contributor Author

Rebase.

Comment on lines 306 to 315
const SCEV *MaxBECount =
SE.getPredicatedSymbolicMaxBackedgeTakenCount(L, *Predicates);
if (isa<SCEVCouldNotCompute>(MaxBECount))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extending to support predicated BTCs instead of constant trip counts should probably be done separately if possible?

It should probably also be possible add a test for predicated BTCs unless I am missing anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sorry, you're right. It shouldn't be part of this change. I think I just did it whilst addressing one of your review comments about reusing getStartAndEndForAccess. It should be fine to pass in SE.getSmallConstantMaxTripCount for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I can't pass in the result of SE.getSmallConstantMaxTripCount anymore because it returns an unsigned value. Instead I have changed it to call SE.getPredicatedConstantMaxBackedgeTakenCount, which is what getSmallConstantMaxTripCount does anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks!

fhahn added a commit that referenced this pull request Jan 5, 2025
This adds missing test coverage for isDereferenceableAndAlignedInLoop,
related to #96752.

// If given a uniform (i.e. non-varying) address, see if we can prove the
// access is safe within the loop w/o needing predication.
if (L->isLoopInvariant(Ptr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this code path was previously not tested and dropping it may cause a regression. I added 747f7f3 which should cover this codepath.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. Thanks for adding the test! I've re-added the code now.

Comment on lines 306 to 315
const SCEV *MaxBECount =
SE.getPredicatedSymbolicMaxBackedgeTakenCount(L, *Predicates);
if (isa<SCEVCouldNotCompute>(MaxBECount))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks!

github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…d assumption.

This adds missing test coverage for isDereferenceableAndAlignedInLoop,
related to llvm/llvm-project#96752.
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Just one comment remaining re moving the existing comment for getStartAndEndForAccess

@@ -839,6 +839,12 @@ bool sortPtrAccesses(ArrayRef<Value *> VL, Type *ElemTy, const DataLayout &DL,
bool isConsecutiveAccess(Value *A, Value *B, const DataLayout &DL,
ScalarEvolution &SE, bool CheckType = true);

std::pair<const SCEV *, const SCEV *> getStartAndEndForAccess(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the comment from llvm/lib/Analysis/LoopAccessAnalysis.cpp, as it is more precise?

I think from the current description, it sounds like the returned value is the last address accessed in the loop, but unless I am missing something it is actually (address of last access)+access size.

…AlignedInLoop

Currently when we encounter a negative step in the induction
variable isDereferenceableAndAlignedInLoop bails out because
the element size is signed greater than the step. This patch
adds support for negative steps in cases where we detect the
start address for the load is of the form base + offset. In
this case the address decrements in each iteration so we need
to calculate the access size differently. I have done this by
caling getStartAndEndForAccess from LoopAccessAnalysis.cpp.

The changed test in LoopVectorize/X86/load-deref-pred.ll now
passes because previously we were calculating the total access
size incorrectly, whereas now it is 412 bytes and fits
perfectly into the alloca.
@david-arm
Copy link
Contributor Author

Rebase + addressed review comment. I ran "make check-all" downstream after rebasing and the tests look good.

@david-arm david-arm merged commit bfedf64 into llvm:main Jan 15, 2025
8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 15, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux running on sanitizer-buildbot7 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/51/builds/9122

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
[182/186] Generating MSAN_INST_TEST_OBJECTS.msan_test.cpp.aarch64-with-call.o
[183/186] Generating Msan-aarch64-with-call-Test
[184/186] Generating MSAN_INST_TEST_OBJECTS.msan_test.cpp.aarch64.o
[185/186] Generating Msan-aarch64-Test
[185/186] Running compiler_rt regression tests
llvm-lit: /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/discovery.py:276: warning: input '/home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/interception/Unit' contained no tests
llvm-lit: /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/discovery.py:276: warning: input '/home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/Unit' contained no tests
llvm-lit: /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 6190 tests, 72 workers --
Testing:  0.. 10.. 20.. 30..
FAIL: ORC-aarch64-linux :: TestCases/Generic/lazy-link.ll (2431 of 6190)
******************** TEST 'ORC-aarch64-linux :: TestCases/Generic/lazy-link.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 6: rm -rf /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp && mkdir -p /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp
+ rm -rf /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp
+ mkdir -p /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp
RUN: at line 7: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/foo-ret-42.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/foo-ret-42.ll
warning: overriding the module target triple with aarch64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
RUN: at line 8: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/var-x-42.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/var-x-42.ll
warning: overriding the module target triple with aarch64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
RUN: at line 9: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
warning: overriding the module target triple with aarch64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
RUN: at line 10: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/llvm-jitlink -orc-runtime=/home/b/sanitizer-aarch64-linux/build/build_default/./lib/../lib/clang/20/lib/aarch64-unknown-linux-gnu/liborc_rt.a -noexec -show-linked-files /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o      -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o | FileCheck /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
+ FileCheck /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/llvm-jitlink -orc-runtime=/home/b/sanitizer-aarch64-linux/build/build_default/./lib/../lib/clang/20/lib/aarch64-unknown-linux-gnu/liborc_rt.a -noexec -show-linked-files /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o
/home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll:18:14: error: CHECK-DAG: expected string not found in input
; CHECK-DAG: Linking {{.*}}x.o
             ^
<stdin>:2:172: note: scanning from here
Linking /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o
                                                                                                                                                                           ^
<stdin>:3:1: note: possible intended match here
Linking __orc_reentry_graph_#1
^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll

-dump-input=help explains the following input dump.

Step 14 (test compiler-rt default) failure: test compiler-rt default (failure)
...
[182/186] Generating MSAN_INST_TEST_OBJECTS.msan_test.cpp.aarch64-with-call.o
[183/186] Generating Msan-aarch64-with-call-Test
[184/186] Generating MSAN_INST_TEST_OBJECTS.msan_test.cpp.aarch64.o
[185/186] Generating Msan-aarch64-Test
[185/186] Running compiler_rt regression tests
llvm-lit: /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/discovery.py:276: warning: input '/home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/interception/Unit' contained no tests
llvm-lit: /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/discovery.py:276: warning: input '/home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/Unit' contained no tests
llvm-lit: /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 6190 tests, 72 workers --
Testing:  0.. 10.. 20.. 30..
FAIL: ORC-aarch64-linux :: TestCases/Generic/lazy-link.ll (2431 of 6190)
******************** TEST 'ORC-aarch64-linux :: TestCases/Generic/lazy-link.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 6: rm -rf /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp && mkdir -p /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp
+ rm -rf /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp
+ mkdir -p /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp
RUN: at line 7: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/foo-ret-42.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/foo-ret-42.ll
warning: overriding the module target triple with aarch64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
RUN: at line 8: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/var-x-42.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/Inputs/var-x-42.ll
warning: overriding the module target triple with aarch64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
RUN: at line 9: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -c -o /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
warning: overriding the module target triple with aarch64-unknown-linux-gnu [-Woverride-module]
1 warning generated.
RUN: at line 10: /home/b/sanitizer-aarch64-linux/build/build_default/./bin/llvm-jitlink -orc-runtime=/home/b/sanitizer-aarch64-linux/build/build_default/./lib/../lib/clang/20/lib/aarch64-unknown-linux-gnu/liborc_rt.a -noexec -show-linked-files /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o      -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o | FileCheck /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
+ FileCheck /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll
+ /home/b/sanitizer-aarch64-linux/build/build_default/./bin/llvm-jitlink -orc-runtime=/home/b/sanitizer-aarch64-linux/build/build_default/./lib/../lib/clang/20/lib/aarch64-unknown-linux-gnu/liborc_rt.a -noexec -show-linked-files /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/foo.o -lazy /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/x.o
/home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll:18:14: error: CHECK-DAG: expected string not found in input
; CHECK-DAG: Linking {{.*}}x.o
             ^
<stdin>:2:172: note: scanning from here
Linking /home/b/sanitizer-aarch64-linux/build/build_default/runtimes/runtimes-bins/compiler-rt/test/orc/AARCH64LinuxConfig/TestCases/Generic/Output/lazy-link.ll.tmp/main.o
                                                                                                                                                                           ^
<stdin>:3:1: note: possible intended match here
Linking __orc_reentry_graph_#1
^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux/build/llvm-project/compiler-rt/test/orc/TestCases/Generic/lazy-link.ll

-dump-input=help explains the following input dump.


@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 15, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-fast running on sanitizer-buildbot3 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/7413

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 88315 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM-Unit :: Analysis/./AnalysisTests/81/171 (74075 of 88315)
******************** TEST 'LLVM-Unit :: Analysis/./AnalysisTests/81/171' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/./AnalysisTests-LLVM-Unit-3662756-81-171.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=171 GTEST_SHARD_INDEX=81 /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/./AnalysisTests
--

Note: This is test shard 82 of 171.
[==========] Running 4 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from DDGTest
[ RUN      ] DDGTest.getDependencies
[       OK ] DDGTest.getDependencies (23 ms)
[----------] 1 test from DDGTest (24 ms total)

[----------] 1 test from LoadsTest
[ RUN      ] LoadsTest.IsDerefReadOnlyLoop
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56: runtime error: reference binding to null pointer of type 'SmallVectorImpl<const SCEVPredicate *>'
    #0 0x5bd81dd55483 in llvm::isDereferenceableAndAlignedInLoop(llvm::LoadInst*, llvm::Loop*, llvm::ScalarEvolution&, llvm::DominatorTree&, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:10
    #1 0x5bd81dd5a26c in llvm::isDereferenceableReadOnlyLoop(llvm::Loop*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:819:14
    #2 0x5bd81d6f4c8a in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody()::$_0::operator()(llvm::Function*) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:198:12
    #3 0x5bd81d6f421e in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:201:3
    #4 0x5bd820ebb9bc in testing::Test::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2687:5
    #5 0x5bd820ebe454 in testing::TestInfo::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2836:11
    #6 0x5bd820ec01ef in testing::TestSuite::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:3015:30
    #7 0x5bd820ee27db in testing::internal::UnitTestImpl::RunAllTests() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5920:44
    #8 0x5bd820ee0d20 in testing::UnitTest::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5484:10
    #9 0x5bd820e81dbe in RUN_ALL_TESTS /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:2317:73
    #10 0x5bd820e81dbe in main /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/UnitTestMain/TestMain.cpp:55:10
    #11 0x74616b42a3b7  (/lib/x86_64-linux-gnu/libc.so.6+0x2a3b7) (BuildId: 5f3f024b472f38389da3a2f567b3d0eaa8835ca2)
    #12 0x74616b42a47a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a47a) (BuildId: 5f3f024b472f38389da3a2f567b3d0eaa8835ca2)
    #13 0x5bd81d3664e4 in _start (/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/AnalysisTests+0x3b4d4e4)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56 

--
exit: 1
--
shard JSON output does not exist: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/./AnalysisTests-LLVM-Unit-3662756-81-171.json
Step 10 (stage2/asan_ubsan check) failure: stage2/asan_ubsan check (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 88315 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM-Unit :: Analysis/./AnalysisTests/81/171 (74075 of 88315)
******************** TEST 'LLVM-Unit :: Analysis/./AnalysisTests/81/171' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/./AnalysisTests-LLVM-Unit-3662756-81-171.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=171 GTEST_SHARD_INDEX=81 /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/./AnalysisTests
--

Note: This is test shard 82 of 171.
[==========] Running 4 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from DDGTest
[ RUN      ] DDGTest.getDependencies
[       OK ] DDGTest.getDependencies (23 ms)
[----------] 1 test from DDGTest (24 ms total)

[----------] 1 test from LoadsTest
[ RUN      ] LoadsTest.IsDerefReadOnlyLoop
/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56: runtime error: reference binding to null pointer of type 'SmallVectorImpl<const SCEVPredicate *>'
    #0 0x5bd81dd55483 in llvm::isDereferenceableAndAlignedInLoop(llvm::LoadInst*, llvm::Loop*, llvm::ScalarEvolution&, llvm::DominatorTree&, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:10
    #1 0x5bd81dd5a26c in llvm::isDereferenceableReadOnlyLoop(llvm::Loop*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:819:14
    #2 0x5bd81d6f4c8a in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody()::$_0::operator()(llvm::Function*) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:198:12
    #3 0x5bd81d6f421e in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:201:3
    #4 0x5bd820ebb9bc in testing::Test::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2687:5
    #5 0x5bd820ebe454 in testing::TestInfo::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2836:11
    #6 0x5bd820ec01ef in testing::TestSuite::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:3015:30
    #7 0x5bd820ee27db in testing::internal::UnitTestImpl::RunAllTests() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5920:44
    #8 0x5bd820ee0d20 in testing::UnitTest::Run() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5484:10
    #9 0x5bd820e81dbe in RUN_ALL_TESTS /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:2317:73
    #10 0x5bd820e81dbe in main /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/third-party/unittest/UnitTestMain/TestMain.cpp:55:10
    #11 0x74616b42a3b7  (/lib/x86_64-linux-gnu/libc.so.6+0x2a3b7) (BuildId: 5f3f024b472f38389da3a2f567b3d0eaa8835ca2)
    #12 0x74616b42a47a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a47a) (BuildId: 5f3f024b472f38389da3a2f567b3d0eaa8835ca2)
    #13 0x5bd81d3664e4 in _start (/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/AnalysisTests+0x3b4d4e4)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56 

--
exit: 1
--
shard JSON output does not exist: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/unittests/Analysis/./AnalysisTests-LLVM-Unit-3662756-81-171.json

@david-arm
Copy link
Contributor Author

Looks like this is causing an assert failure for build https://lab.llvm.org/buildbot/#/builders/169/builds/7413. I'll revert asap.

david-arm added a commit to david-arm/llvm-project that referenced this pull request Jan 15, 2025
david-arm added a commit that referenced this pull request Jan 15, 2025
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 15, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-bootstrap-ubsan running on sanitizer-buildbot9 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/85/builds/4376

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 85839 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 
FAIL: LLVM-Unit :: Analysis/./AnalysisTests/80/86 (77441 of 85839)
******************** TEST 'LLVM-Unit :: Analysis/./AnalysisTests/80/86' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/unittests/Analysis/./AnalysisTests-LLVM-Unit-1835745-80-86.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=86 GTEST_SHARD_INDEX=80 /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/unittests/Analysis/./AnalysisTests
--

Note: This is test shard 81 of 86.
[==========] Running 7 tests from 7 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CtxProfAnalysisTest
[ RUN      ] CtxProfAnalysisTest.GetBBIDTest
[       OK ] CtxProfAnalysisTest.GetBBIDTest (0 ms)
[----------] 1 test from CtxProfAnalysisTest (0 ms total)

[----------] 1 test from IRInstructionMapper
[ RUN      ] IRInstructionMapper.ExceptionHandlingExceptionCodeIllegal
[       OK ] IRInstructionMapper.ExceptionHandlingExceptionCodeIllegal (0 ms)
[----------] 1 test from IRInstructionMapper (0 ms total)

[----------] 1 test from LoadsTest
[ RUN      ] LoadsTest.IsDerefReadOnlyLoop
/home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56: runtime error: reference binding to null pointer of type 'SmallVectorImpl<const SCEVPredicate *>'
    #0 0xc996fa29a9a8 in llvm::isDereferenceableAndAlignedInLoop(llvm::LoadInst*, llvm::Loop*, llvm::ScalarEvolution&, llvm::DominatorTree&, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:10
    #1 0xc996fa29ce00 in llvm::isDereferenceableReadOnlyLoop(llvm::Loop*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:819:14
    #2 0xc996f9f77d30 in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody()::$_0::operator()(llvm::Function*) const /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:198:12
    #3 0xc996f9f7793c in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:201:3
    #4 0xc996fba8e444 in testing::Test::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2687:5
    #5 0xc996fba8fc3c in testing::TestInfo::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2836:11
    #6 0xc996fba90f90 in testing::TestSuite::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:3015:30
    #7 0xc996fbaa1aa0 in testing::internal::UnitTestImpl::RunAllTests() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5920:44
    #8 0xc996fbaa1058 in testing::UnitTest::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5484:10
    #9 0xc996fba7304c in RUN_ALL_TESTS /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:2317:73
    #10 0xc996fba7304c in main /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/UnitTestMain/TestMain.cpp:55:10
    #11 0xe701662d84c0  (/lib/aarch64-linux-gnu/libc.so.6+0x284c0) (BuildId: 32fa4d6f3a8d5f430bdb7af2eb779470cd5ec7c2)
    #12 0xe701662d8594 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x28594) (BuildId: 32fa4d6f3a8d5f430bdb7af2eb779470cd5ec7c2)
    #13 0xc996f9e2a62c in _start (/home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/unittests/Analysis/AnalysisTests+0x2aba62c)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56 
Step 11 (stage2/ubsan check) failure: stage2/ubsan check (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 85839 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 
FAIL: LLVM-Unit :: Analysis/./AnalysisTests/80/86 (77441 of 85839)
******************** TEST 'LLVM-Unit :: Analysis/./AnalysisTests/80/86' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/unittests/Analysis/./AnalysisTests-LLVM-Unit-1835745-80-86.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=86 GTEST_SHARD_INDEX=80 /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/unittests/Analysis/./AnalysisTests
--

Note: This is test shard 81 of 86.
[==========] Running 7 tests from 7 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CtxProfAnalysisTest
[ RUN      ] CtxProfAnalysisTest.GetBBIDTest
[       OK ] CtxProfAnalysisTest.GetBBIDTest (0 ms)
[----------] 1 test from CtxProfAnalysisTest (0 ms total)

[----------] 1 test from IRInstructionMapper
[ RUN      ] IRInstructionMapper.ExceptionHandlingExceptionCodeIllegal
[       OK ] IRInstructionMapper.ExceptionHandlingExceptionCodeIllegal (0 ms)
[----------] 1 test from IRInstructionMapper (0 ms total)

[----------] 1 test from LoadsTest
[ RUN      ] LoadsTest.IsDerefReadOnlyLoop
/home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56: runtime error: reference binding to null pointer of type 'SmallVectorImpl<const SCEVPredicate *>'
    #0 0xc996fa29a9a8 in llvm::isDereferenceableAndAlignedInLoop(llvm::LoadInst*, llvm::Loop*, llvm::ScalarEvolution&, llvm::DominatorTree&, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:10
    #1 0xc996fa29ce00 in llvm::isDereferenceableReadOnlyLoop(llvm::Loop*, llvm::ScalarEvolution*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:819:14
    #2 0xc996f9f77d30 in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody()::$_0::operator()(llvm::Function*) const /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:198:12
    #3 0xc996f9f7793c in LoadsTest_IsDerefReadOnlyLoop_Test::TestBody() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/unittests/Analysis/LoadsTest.cpp:201:3
    #4 0xc996fba8e444 in testing::Test::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2687:5
    #5 0xc996fba8fc3c in testing::TestInfo::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:2836:11
    #6 0xc996fba90f90 in testing::TestSuite::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:3015:30
    #7 0xc996fbaa1aa0 in testing::internal::UnitTestImpl::RunAllTests() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5920:44
    #8 0xc996fbaa1058 in testing::UnitTest::Run() /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc:5484:10
    #9 0xc996fba7304c in RUN_ALL_TESTS /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/googletest/include/gtest/gtest.h:2317:73
    #10 0xc996fba7304c in main /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/third-party/unittest/UnitTestMain/TestMain.cpp:55:10
    #11 0xe701662d84c0  (/lib/aarch64-linux-gnu/libc.so.6+0x284c0) (BuildId: 32fa4d6f3a8d5f430bdb7af2eb779470cd5ec7c2)
    #12 0xe701662d8594 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x28594) (BuildId: 32fa4d6f3a8d5f430bdb7af2eb779470cd5ec7c2)
    #13 0xc996f9e2a62c in _start (/home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm_build_ubsan/unittests/Analysis/AnalysisTests+0x2aba62c)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/b/sanitizer-aarch64-linux-bootstrap-ubsan/build/llvm-project/llvm/lib/Analysis/Loads.cpp:314:56 

david-arm added a commit to david-arm/llvm-project that referenced this pull request Jan 20, 2025
…eableAndAlignedInLoop llvm#96752"

The last attempt failed a sanitiser build because we were
creating a reference to a null Predicates pointer in
isDereferenceableAndAlignedInLoop. This was exposed by
the unit test IsDerefReadOnlyLoop in
unittests/Analysis/LoadsTest.cpp. I fixed this by falling
back on getConstantMaxBackedgeTakenCount if Predicates is
null.
david-arm added a commit to david-arm/llvm-project that referenced this pull request Jan 27, 2025
…eableAndAlignedInLoop llvm#96752"

The last attempt failed a sanitiser build because we were
creating a reference to a null Predicates pointer in
isDereferenceableAndAlignedInLoop. This was exposed by
the unit test IsDerefReadOnlyLoop in
unittests/Analysis/LoadsTest.cpp. I fixed this by falling
back on getConstantMaxBackedgeTakenCount if Predicates is
null.
david-arm added a commit that referenced this pull request Jan 27, 2025
…eableAndAlignedInLoop #96752" (#123616)

The last attempt failed a sanitiser build because we were
creating a reference to a null Predicates pointer in
isDereferenceableAndAlignedInLoop. This was exposed by
the unit test IsDerefReadOnlyLoop in
unittests/Analysis/LoadsTest.cpp. I fixed this by falling
back on getConstantMaxBackedgeTakenCount if Predicates is
null - see line 316 in llvm/lib/Analysis/Loads.cpp. There
are no other changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants