Skip to content

[VPlan] Set debug location for VPReduction/VPWidenIntrinsicRecipe. #120054

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ElvisWang123
Copy link
Contributor

@ElvisWang123 ElvisWang123 commented Dec 16, 2024

This patch add missing debug location for VPReduction/VPWidenIntrinsicRecipe.

@llvmbot
Copy link
Member

llvmbot commented Dec 16, 2024

@llvm/pr-subscribers-vectorizers

Author: Elvis Wang (ElvisWang123)

Changes

This patch add missing debug location for VPReduction/VPWidenIntrinsicRecipe and align the debug location for
VPWidenPHIRecipe to its scalar counterpart.

Depends on #120053.


Full diff: https://github.com/llvm/llvm-project/pull/120054.diff

5 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+5-3)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+2)
  • (added) llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll (+36)
  • (modified) llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll (+2-2)
  • (added) llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll (+33)
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 8794517b777f3b..6ebf2ec36e3789 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1662,7 +1662,7 @@ class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags {
   VPWidenIntrinsicRecipe(Intrinsic::ID VectorIntrinsicID,
                          ArrayRef<VPValue *> CallArguments, Type *Ty,
                          DebugLoc DL = {})
-      : VPRecipeWithIRFlags(VPDef::VPWidenIntrinsicSC, CallArguments),
+      : VPRecipeWithIRFlags(VPDef::VPWidenIntrinsicSC, CallArguments, DL),
         VectorIntrinsicID(VectorIntrinsicID), ResultTy(Ty) {
     LLVMContext &Ctx = Ty->getContext();
     AttributeList Attrs = Intrinsic::getAttributes(Ctx, VectorIntrinsicID);
@@ -2288,7 +2288,8 @@ class VPWidenPHIRecipe : public VPSingleDefRecipe {
 public:
   /// Create a new VPWidenPHIRecipe for \p Phi with start value \p Start.
   VPWidenPHIRecipe(PHINode *Phi, VPValue *Start = nullptr)
-      : VPSingleDefRecipe(VPDef::VPWidenPHISC, ArrayRef<VPValue *>(), Phi) {
+      : VPSingleDefRecipe(VPDef::VPWidenPHISC, ArrayRef<VPValue *>(), Phi,
+                          Phi->getDebugLoc()) {
     if (Start)
       addOperand(Start);
   }
@@ -2590,7 +2591,8 @@ class VPReductionRecipe : public VPSingleDefRecipe {
   VPReductionRecipe(const unsigned char SC, const RecurrenceDescriptor &R,
                     Instruction *I, ArrayRef<VPValue *> Operands,
                     VPValue *CondOp, bool IsOrdered)
-      : VPSingleDefRecipe(SC, Operands, I), RdxDesc(R), IsOrdered(IsOrdered) {
+      : VPSingleDefRecipe(SC, Operands, I, I->getDebugLoc()), RdxDesc(R),
+        IsOrdered(IsOrdered) {
     if (CondOp) {
       IsConditional = true;
       addOperand(CondOp);
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index e882368544e815..4179a9eebab3b7 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -2108,6 +2108,7 @@ void VPReductionRecipe::execute(VPTransformState &State) {
   // Propagate the fast-math flags carried by the underlying instruction.
   IRBuilderBase::FastMathFlagGuard FMFGuard(State.Builder);
   State.Builder.setFastMathFlags(RdxDesc.getFastMathFlags());
+  State.setDebugLocFrom(getDebugLoc());
   Value *NewVecOp = State.get(getVecOp());
   if (VPValue *Cond = getCondOp()) {
     Value *NewCond = State.get(Cond, State.VF.isScalar());
@@ -3424,6 +3425,7 @@ void VPWidenPHIRecipe::execute(VPTransformState &State) {
   assert(EnableVPlanNativePath &&
          "Non-native vplans are not expected to have VPWidenPHIRecipes.");
 
+  State.setDebugLocFrom(getDebugLoc());
   Value *Op0 = State.get(getOperand(0));
   Type *VecTy = Op0->getType();
   Value *VecPhi = State.Builder.CreatePHI(VecTy, 2, "vec.phi");
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll b/llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll
new file mode 100644
index 00000000000000..9998c44e6e3907
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll
@@ -0,0 +1,36 @@
+; RUN: opt -passes=debugify,loop-vectorize \
+; RUN: -force-tail-folding-style=data-with-evl \
+; RUN: -prefer-predicate-over-epilogue=predicate-dont-vectorize \
+; RUN: -mtriple=riscv64 -mattr=+v -riscv-v-vector-bits-max=128 -S < %s 2>&1 | FileCheck --check-prefix=DEBUGLOC %s
+
+; Testing the debug locations of the generated vector intstruction are same as
+; their scalar instruction.
+
+; DEBUGLOC-LABEL: define void @vp_select(
+define void @vp_select(ptr noalias %a, ptr noalias %b, ptr noalias %c, i64 %N) {
+; DEBUGLOC: vector.body:
+; DEBUGLOC:   %[[VPSel:[0-9]+]] = call <vscale x 4 x i32> @llvm.vp.select.nxv4i32(<vscale x 4 x i1> %15, <vscale x 4 x i32> %vp.op.load1, <vscale x 4 x i32> %vp.op, i32 %9), !dbg ![[SelLoc:[0-9]+]]
+; DEBUGLOC: for.body:
+; DEBUGLOC:   %cond.p = select i1 %cmp4, i32 %22, i32 %23, !dbg ![[SelLoc]]
+ entry:
+   br label %for.body
+
+ for.body:
+   %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
+   %arrayidx = getelementptr inbounds i32, ptr %b, i64 %indvars.iv
+   %0 = load i32, ptr %arrayidx, align 4
+   %arrayidx3 = getelementptr inbounds i32, ptr %c, i64 %indvars.iv
+   %1 = load i32, ptr %arrayidx3, align 4
+   %cmp4 = icmp sgt i32 %0, %1
+   %2 = sub i32 0, %1
+   %cond.p = select i1 %cmp4, i32 %1, i32 %2
+   %cond = add i32 %cond.p, %0
+   %arrayidx15 = getelementptr inbounds i32, ptr %a, i64 %indvars.iv
+   store i32 %cond, ptr %arrayidx15, align 4
+   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+   %exitcond.not = icmp eq i64 %indvars.iv.next, %N
+   br i1 %exitcond.not, label %exit, label %for.body
+
+ exit:
+   ret void
+ }
diff --git a/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll b/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll
index 66aceab9fb27c8..44afa34100c299 100644
--- a/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll
+++ b/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll
@@ -15,8 +15,8 @@ define void @foo(ptr %h) !dbg !4 {
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[FOR_COND_CLEANUP32:%.*]] ]
 ; CHECK-NEXT:    br label [[FOR_COND5_PREHEADER1:%.*]], !dbg [[DBG21]]
 ; CHECK:       for.cond5.preheader1:
-; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <4 x i64> [ zeroinitializer, [[VECTOR_BODY]] ], [ [[TMP4:%.*]], [[FOR_COND5_PREHEADER1]] ], !dbg [[DBG21]]
-; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr i32, ptr [[H]], <4 x i64> [[VEC_PHI]], !dbg [[DBG21]]
+; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <4 x i64> [ zeroinitializer, [[VECTOR_BODY]] ], [ [[TMP4:%.*]], [[FOR_COND5_PREHEADER1]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr i32, ptr [[H]], <4 x i64> [[VEC_PHI]]
 ; CHECK-NEXT:    call void @llvm.masked.scatter.v4i32.v4p0(<4 x i32> zeroinitializer, <4 x ptr> [[TMP0]], i32 4, <4 x i1> splat (i1 true)), !dbg [[DBG22:![0-9]+]]
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr i32, <4 x ptr> [[TMP0]], i64 1, !dbg [[DBG22]]
 ; CHECK-NEXT:    call void @llvm.masked.scatter.v4i32.v4p0(<4 x i32> splat (i32 1), <4 x ptr> [[TMP1]], i32 4, <4 x i1> splat (i1 true)), !dbg [[DBG22]]
diff --git a/llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll b/llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll
new file mode 100644
index 00000000000000..12e8b92523d31f
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll
@@ -0,0 +1,33 @@
+; RUN: opt < %s -passes=debugify,loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -prefer-inloop-reductions -S | FileCheck %s -check-prefix DEBUGLOC
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+
+; Testing the debug locations of the generated vector intstruction are same as
+; their scalar instruction.
+
+; DEBUGLOC-LABEL: define i32 @reduction_sum(
+define i32 @reduction_sum(ptr noalias nocapture %A, ptr noalias nocapture %B) {
+; DEBUGLOC: vector.body:
+; DEBUGLOC:   %[[VecLoad:.*]] = load <4 x i32>, ptr %2, align 4, !dbg ![[LoadLoc0:[0-9]+]]
+; DEBUGLOC:   %[[VecRed:.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %wide.load), !dbg ![[RedLoc0:[0-9]+]]
+; DEBUGLOC: .lr.ph:
+; DEBUGLOC:   %l3 = load i32, ptr %l2, align 4, !dbg ![[LoadLoc0]]
+; DEBUGLOC:   %l7 = add i32 %sum.02, %l3, !dbg ![[RedLoc0]]
+entry:
+  br label %.lr.ph
+
+.lr.ph:                                           ; preds = %entry, %.lr.ph
+  %indvars.iv = phi i64 [ %indvars.iv.next, %.lr.ph ], [ 0, %entry ]
+  %sum.02 = phi i32 [ %l7, %.lr.ph ], [ 0, %entry ]
+  %l2 = getelementptr inbounds i32, ptr %A, i64 %indvars.iv
+  %l3 = load i32, ptr %l2, align 4
+  %l6 = trunc i64 %indvars.iv to i32
+  %l7 = add i32 %sum.02, %l3
+  %indvars.iv.next = add i64 %indvars.iv, 1
+  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
+  %exitcond = icmp eq i32 %lftr.wideiv, 256
+  br i1 %exitcond, label %._crit_edge, label %.lr.ph
+
+._crit_edge:                                      ; preds = %.lr.ph
+  %sum.0.lcssa = phi i32 [ %l7, %.lr.ph ]
+  ret i32 %sum.0.lcssa
+}

@llvmbot
Copy link
Member

llvmbot commented Dec 16, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Elvis Wang (ElvisWang123)

Changes

This patch add missing debug location for VPReduction/VPWidenIntrinsicRecipe and align the debug location for
VPWidenPHIRecipe to its scalar counterpart.

Depends on #120053.


Full diff: https://github.com/llvm/llvm-project/pull/120054.diff

5 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+5-3)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+2)
  • (added) llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll (+36)
  • (modified) llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll (+2-2)
  • (added) llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll (+33)
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 8794517b777f3b..6ebf2ec36e3789 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1662,7 +1662,7 @@ class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags {
   VPWidenIntrinsicRecipe(Intrinsic::ID VectorIntrinsicID,
                          ArrayRef<VPValue *> CallArguments, Type *Ty,
                          DebugLoc DL = {})
-      : VPRecipeWithIRFlags(VPDef::VPWidenIntrinsicSC, CallArguments),
+      : VPRecipeWithIRFlags(VPDef::VPWidenIntrinsicSC, CallArguments, DL),
         VectorIntrinsicID(VectorIntrinsicID), ResultTy(Ty) {
     LLVMContext &Ctx = Ty->getContext();
     AttributeList Attrs = Intrinsic::getAttributes(Ctx, VectorIntrinsicID);
@@ -2288,7 +2288,8 @@ class VPWidenPHIRecipe : public VPSingleDefRecipe {
 public:
   /// Create a new VPWidenPHIRecipe for \p Phi with start value \p Start.
   VPWidenPHIRecipe(PHINode *Phi, VPValue *Start = nullptr)
-      : VPSingleDefRecipe(VPDef::VPWidenPHISC, ArrayRef<VPValue *>(), Phi) {
+      : VPSingleDefRecipe(VPDef::VPWidenPHISC, ArrayRef<VPValue *>(), Phi,
+                          Phi->getDebugLoc()) {
     if (Start)
       addOperand(Start);
   }
@@ -2590,7 +2591,8 @@ class VPReductionRecipe : public VPSingleDefRecipe {
   VPReductionRecipe(const unsigned char SC, const RecurrenceDescriptor &R,
                     Instruction *I, ArrayRef<VPValue *> Operands,
                     VPValue *CondOp, bool IsOrdered)
-      : VPSingleDefRecipe(SC, Operands, I), RdxDesc(R), IsOrdered(IsOrdered) {
+      : VPSingleDefRecipe(SC, Operands, I, I->getDebugLoc()), RdxDesc(R),
+        IsOrdered(IsOrdered) {
     if (CondOp) {
       IsConditional = true;
       addOperand(CondOp);
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index e882368544e815..4179a9eebab3b7 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -2108,6 +2108,7 @@ void VPReductionRecipe::execute(VPTransformState &State) {
   // Propagate the fast-math flags carried by the underlying instruction.
   IRBuilderBase::FastMathFlagGuard FMFGuard(State.Builder);
   State.Builder.setFastMathFlags(RdxDesc.getFastMathFlags());
+  State.setDebugLocFrom(getDebugLoc());
   Value *NewVecOp = State.get(getVecOp());
   if (VPValue *Cond = getCondOp()) {
     Value *NewCond = State.get(Cond, State.VF.isScalar());
@@ -3424,6 +3425,7 @@ void VPWidenPHIRecipe::execute(VPTransformState &State) {
   assert(EnableVPlanNativePath &&
          "Non-native vplans are not expected to have VPWidenPHIRecipes.");
 
+  State.setDebugLocFrom(getDebugLoc());
   Value *Op0 = State.get(getOperand(0));
   Type *VecTy = Op0->getType();
   Value *VecPhi = State.Builder.CreatePHI(VecTy, 2, "vec.phi");
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll b/llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll
new file mode 100644
index 00000000000000..9998c44e6e3907
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/preserve-dbg-loc.ll
@@ -0,0 +1,36 @@
+; RUN: opt -passes=debugify,loop-vectorize \
+; RUN: -force-tail-folding-style=data-with-evl \
+; RUN: -prefer-predicate-over-epilogue=predicate-dont-vectorize \
+; RUN: -mtriple=riscv64 -mattr=+v -riscv-v-vector-bits-max=128 -S < %s 2>&1 | FileCheck --check-prefix=DEBUGLOC %s
+
+; Testing the debug locations of the generated vector intstruction are same as
+; their scalar instruction.
+
+; DEBUGLOC-LABEL: define void @vp_select(
+define void @vp_select(ptr noalias %a, ptr noalias %b, ptr noalias %c, i64 %N) {
+; DEBUGLOC: vector.body:
+; DEBUGLOC:   %[[VPSel:[0-9]+]] = call <vscale x 4 x i32> @llvm.vp.select.nxv4i32(<vscale x 4 x i1> %15, <vscale x 4 x i32> %vp.op.load1, <vscale x 4 x i32> %vp.op, i32 %9), !dbg ![[SelLoc:[0-9]+]]
+; DEBUGLOC: for.body:
+; DEBUGLOC:   %cond.p = select i1 %cmp4, i32 %22, i32 %23, !dbg ![[SelLoc]]
+ entry:
+   br label %for.body
+
+ for.body:
+   %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
+   %arrayidx = getelementptr inbounds i32, ptr %b, i64 %indvars.iv
+   %0 = load i32, ptr %arrayidx, align 4
+   %arrayidx3 = getelementptr inbounds i32, ptr %c, i64 %indvars.iv
+   %1 = load i32, ptr %arrayidx3, align 4
+   %cmp4 = icmp sgt i32 %0, %1
+   %2 = sub i32 0, %1
+   %cond.p = select i1 %cmp4, i32 %1, i32 %2
+   %cond = add i32 %cond.p, %0
+   %arrayidx15 = getelementptr inbounds i32, ptr %a, i64 %indvars.iv
+   store i32 %cond, ptr %arrayidx15, align 4
+   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+   %exitcond.not = icmp eq i64 %indvars.iv.next, %N
+   br i1 %exitcond.not, label %exit, label %for.body
+
+ exit:
+   ret void
+ }
diff --git a/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll b/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll
index 66aceab9fb27c8..44afa34100c299 100644
--- a/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll
+++ b/llvm/test/Transforms/LoopVectorize/dbg-outer-loop-vect.ll
@@ -15,8 +15,8 @@ define void @foo(ptr %h) !dbg !4 {
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[FOR_COND_CLEANUP32:%.*]] ]
 ; CHECK-NEXT:    br label [[FOR_COND5_PREHEADER1:%.*]], !dbg [[DBG21]]
 ; CHECK:       for.cond5.preheader1:
-; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <4 x i64> [ zeroinitializer, [[VECTOR_BODY]] ], [ [[TMP4:%.*]], [[FOR_COND5_PREHEADER1]] ], !dbg [[DBG21]]
-; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr i32, ptr [[H]], <4 x i64> [[VEC_PHI]], !dbg [[DBG21]]
+; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <4 x i64> [ zeroinitializer, [[VECTOR_BODY]] ], [ [[TMP4:%.*]], [[FOR_COND5_PREHEADER1]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr i32, ptr [[H]], <4 x i64> [[VEC_PHI]]
 ; CHECK-NEXT:    call void @llvm.masked.scatter.v4i32.v4p0(<4 x i32> zeroinitializer, <4 x ptr> [[TMP0]], i32 4, <4 x i1> splat (i1 true)), !dbg [[DBG22:![0-9]+]]
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr i32, <4 x ptr> [[TMP0]], i64 1, !dbg [[DBG22]]
 ; CHECK-NEXT:    call void @llvm.masked.scatter.v4i32.v4p0(<4 x i32> splat (i32 1), <4 x ptr> [[TMP1]], i32 4, <4 x i1> splat (i1 true)), !dbg [[DBG22]]
diff --git a/llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll b/llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll
new file mode 100644
index 00000000000000..12e8b92523d31f
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/preserve-dbg-loc.ll
@@ -0,0 +1,33 @@
+; RUN: opt < %s -passes=debugify,loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -prefer-inloop-reductions -S | FileCheck %s -check-prefix DEBUGLOC
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+
+; Testing the debug locations of the generated vector intstruction are same as
+; their scalar instruction.
+
+; DEBUGLOC-LABEL: define i32 @reduction_sum(
+define i32 @reduction_sum(ptr noalias nocapture %A, ptr noalias nocapture %B) {
+; DEBUGLOC: vector.body:
+; DEBUGLOC:   %[[VecLoad:.*]] = load <4 x i32>, ptr %2, align 4, !dbg ![[LoadLoc0:[0-9]+]]
+; DEBUGLOC:   %[[VecRed:.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %wide.load), !dbg ![[RedLoc0:[0-9]+]]
+; DEBUGLOC: .lr.ph:
+; DEBUGLOC:   %l3 = load i32, ptr %l2, align 4, !dbg ![[LoadLoc0]]
+; DEBUGLOC:   %l7 = add i32 %sum.02, %l3, !dbg ![[RedLoc0]]
+entry:
+  br label %.lr.ph
+
+.lr.ph:                                           ; preds = %entry, %.lr.ph
+  %indvars.iv = phi i64 [ %indvars.iv.next, %.lr.ph ], [ 0, %entry ]
+  %sum.02 = phi i32 [ %l7, %.lr.ph ], [ 0, %entry ]
+  %l2 = getelementptr inbounds i32, ptr %A, i64 %indvars.iv
+  %l3 = load i32, ptr %l2, align 4
+  %l6 = trunc i64 %indvars.iv to i32
+  %l7 = add i32 %sum.02, %l3
+  %indvars.iv.next = add i64 %indvars.iv, 1
+  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
+  %exitcond = icmp eq i32 %lftr.wideiv, 256
+  br i1 %exitcond, label %._crit_edge, label %.lr.ph
+
+._crit_edge:                                      ; preds = %.lr.ph
+  %sum.0.lcssa = phi i32 [ %l7, %.lr.ph ]
+  ret i32 %sum.0.lcssa
+}

@ElvisWang123 ElvisWang123 force-pushed the add-dbg-loc-Reduc-WidenPHI-WidenIntrtinsic branch from 0296fc3 to 2f14e17 Compare December 16, 2024 11:24
Copy link
Contributor

@Mel-Chen Mel-Chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me of my unfinished patch: #113887
@fhahn Does VPSingleDefRecipe also need a constructor similar to VPRecipeWithIRFlags?

  template <typename IterT>
  VPRecipeWithIRFlags(const unsigned char SC, IterT Operands, Instruction &I)
      : VPSingleDefRecipe(SC, Operands, &I, I.getDebugLoc()) {

@ElvisWang123 ElvisWang123 force-pushed the add-dbg-loc-Reduc-WidenPHI-WidenIntrtinsic branch 2 times, most recently from e46fe8c to 24077c7 Compare December 16, 2024 12:08
@@ -15,8 +15,8 @@ define void @foo(ptr %h) !dbg !4 {
; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[FOR_COND_CLEANUP32:%.*]] ]
; CHECK-NEXT: br label [[FOR_COND5_PREHEADER1:%.*]], !dbg [[DBG21]]
; CHECK: for.cond5.preheader1:
; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i64> [ zeroinitializer, [[VECTOR_BODY]] ], [ [[TMP4:%.*]], [[FOR_COND5_PREHEADER1]] ], !dbg [[DBG21]]
; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i32, ptr [[H]], <4 x i64> [[VEC_PHI]], !dbg [[DBG21]]
; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i64> [ zeroinitializer, [[VECTOR_BODY]] ], [ [[TMP4:%.*]], [[FOR_COND5_PREHEADER1]] ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the debug loc dropped here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow there is no debug location information in the original scalar phi and following gep.
The previous debug information of the vector phi seems from the previous br lable [[FOR_COND5_PREHEADER1:%.*]], !dbg [[DBG21]], which might be incorrect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, please split off the changes to VPWidenPHIRecipe then. Technically VPWidenIntrinsic/VPReduction changes are also independent, might be good to keep in mind for future PRs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be great to get split off, part of those changes are also tied up in #118662

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split to #120338. Thanks.

@ElvisWang123 ElvisWang123 force-pushed the add-dbg-loc-Reduc-WidenPHI-WidenIntrtinsic branch from 963cc5f to 3680566 Compare December 17, 2024 23:51
@ElvisWang123 ElvisWang123 changed the title [VPlan] Set debug location for VPReduction/VPWidenPHI/VPWidenIntrinsicRecipe. [VPlan] Set debug location for VPReduction/VPWidenIntrinsicRecipe. Dec 18, 2024
@ElvisWang123 ElvisWang123 force-pushed the add-dbg-loc-Reduc-WidenPHI-WidenIntrtinsic branch from 3680566 to b6bc76b Compare December 18, 2024 00:28
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

One more comment to try to make the checks in one of the tests slightly more robust

; DEBUGLOC: = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %{{.+}}), !dbg ![[REDLOC:[0-9]+]]
; DEBUGLOC: loop:
; DEBUGLOC: = load i32, ptr %{{.+}}, align 4, !dbg ![[LOADLOC]]
; DEBUGLOC: = add i32 %{{.+}}, %{{.+}}, !dbg ![[REDLOC]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
; DEBUGLOC: = add i32 %{{.+}}, %{{.+}}, !dbg ![[REDLOC]]
; DEBUGLOC-NEXT: = add i32 %{{.+}}, [[LOAD]], !dbg ![[REDLOC]]

to make sure we match the correct add?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with [[LOAD]]. We cannot use DEBUGLOC-NEXT since there are some debug information between instructions.

@ElvisWang123 ElvisWang123 force-pushed the add-dbg-loc-Reduc-WidenPHI-WidenIntrtinsic branch from 55089cb to 1654932 Compare December 27, 2024 01:35
@ElvisWang123 ElvisWang123 merged commit 47e1c87 into llvm:main Dec 27, 2024
8 checks passed
@ElvisWang123 ElvisWang123 deleted the add-dbg-loc-Reduc-WidenPHI-WidenIntrtinsic branch December 27, 2024 02:37
ElvisWang123 added a commit that referenced this pull request Jan 24, 2025
)

This patch align the debug location of the widen-phi to the debug
location of original phi.

Split from: #120054
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants