Skip to content

[DebugInfo] Use DW_op_bit_piece for structured bindings of bitfields #85665

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

john-brawn-arm
Copy link
Collaborator

Currently we use DW_OP_plus_uconst to handle the bitfield offset and handle the bitfield size by choosing a type size that matches, but this doesn't work if either offset or size aren't byte-aligned. Using DW_op_bit_piece means we can handle any offset and size.

Currently we use DW_OP_plus_uconst to handle the bitfield offset and
handle the bitfield size by choosing a type size that matches, but
this doesn't work if either offset or size aren't byte-aligned. Using
DW_op_bit_piece means we can handle any offset and size.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. debuginfo llvm:ir labels Mar 18, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 18, 2024

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-clang

Author: John Brawn (john-brawn-arm)

Changes

Currently we use DW_OP_plus_uconst to handle the bitfield offset and handle the bitfield size by choosing a type size that matches, but this doesn't work if either offset or size aren't byte-aligned. Using DW_op_bit_piece means we can handle any offset and size.


Patch is 23.89 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85665.diff

5 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+9-43)
  • (modified) clang/lib/CodeGen/CGDebugInfo.h (-3)
  • (modified) clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp (+56-62)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfExpression.cpp (+3)
  • (modified) llvm/lib/IR/DebugInfoMetadata.cpp (+2)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index c2c01439f2dc99..bf142bdbbe0f99 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -4777,40 +4777,6 @@ llvm::DILocalVariable *CGDebugInfo::EmitDeclare(const VarDecl *VD,
   return D;
 }
 
-llvm::DIType *CGDebugInfo::CreateBindingDeclType(const BindingDecl *BD) {
-  llvm::DIFile *Unit = getOrCreateFile(BD->getLocation());
-
-  // If the declaration is bound to a bitfield struct field, its type may have a
-  // size that is different from its deduced declaration type's.
-  if (const MemberExpr *ME = dyn_cast<MemberExpr>(BD->getBinding())) {
-    if (const FieldDecl *FD = dyn_cast<FieldDecl>(ME->getMemberDecl())) {
-      if (FD->isBitField()) {
-        ASTContext &Context = CGM.getContext();
-        const CGRecordLayout &RL =
-            CGM.getTypes().getCGRecordLayout(FD->getParent());
-        const CGBitFieldInfo &Info = RL.getBitFieldInfo(FD);
-
-        // Find an integer type with the same bitwidth as the bitfield size. If
-        // no suitable type is present in the target, give up on producing debug
-        // information as it would be wrong. It is certainly possible to produce
-        // correct debug info, but the logic isn't currently implemented.
-        uint64_t BitfieldSizeInBits = Info.Size;
-        QualType IntTy =
-            Context.getIntTypeForBitwidth(BitfieldSizeInBits, Info.IsSigned);
-        if (IntTy.isNull())
-          return nullptr;
-        Qualifiers Quals = BD->getType().getQualifiers();
-        QualType FinalTy = Context.getQualifiedType(IntTy, Quals);
-        llvm::DIType *Ty = getOrCreateType(FinalTy, Unit);
-        assert(Ty);
-        return Ty;
-      }
-    }
-  }
-
-  return getOrCreateType(BD->getType(), Unit);
-}
-
 llvm::DILocalVariable *CGDebugInfo::EmitDeclare(const BindingDecl *BD,
                                                 llvm::Value *Storage,
                                                 std::optional<unsigned> ArgNo,
@@ -4825,7 +4791,8 @@ llvm::DILocalVariable *CGDebugInfo::EmitDeclare(const BindingDecl *BD,
   if (isa<DeclRefExpr>(BD->getBinding()))
     return nullptr;
 
-  llvm::DIType *Ty = CreateBindingDeclType(BD);
+  llvm::DIFile *Unit = getOrCreateFile(BD->getLocation());
+  llvm::DIType *Ty = getOrCreateType(BD->getType(), Unit);
 
   // If there is no debug info for this type then do not emit debug info
   // for this variable.
@@ -4851,7 +4818,6 @@ llvm::DILocalVariable *CGDebugInfo::EmitDeclare(const BindingDecl *BD,
   unsigned Column = getColumnNumber(BD->getLocation());
   StringRef Name = BD->getName();
   auto *Scope = cast<llvm::DIScope>(LexicalBlockStack.back());
-  llvm::DIFile *Unit = getOrCreateFile(BD->getLocation());
   // Create the descriptor for the variable.
   llvm::DILocalVariable *D = DBuilder.createAutoVariable(
       Scope, Name, Unit, Line, Ty, CGM.getLangOpts().Optimize,
@@ -4865,13 +4831,13 @@ llvm::DILocalVariable *CGDebugInfo::EmitDeclare(const BindingDecl *BD,
       const ASTRecordLayout &layout =
           CGM.getContext().getASTRecordLayout(parent);
       const uint64_t fieldOffset = layout.getFieldOffset(fieldIndex);
-
-      if (fieldOffset != 0) {
-        // Currently if the field offset is not a multiple of byte, the produced
-        // location would not be accurate. Therefore give up.
-        if (fieldOffset % CGM.getContext().getCharWidth() != 0)
-          return nullptr;
-
+      if (FD->isBitField()) {
+        Expr.push_back(llvm::dwarf::DW_OP_bit_piece);
+        Expr.push_back(FD->getBitWidthValue(CGM.getContext()));
+        Expr.push_back(fieldOffset);
+      } else if (fieldOffset != 0) {
+        assert(fieldOffset % CGM.getContext().getCharWidth() == 0 &&
+               "Unexpected non-bitfield with non-byte-aligned offset");
         Expr.push_back(llvm::dwarf::DW_OP_plus_uconst);
         Expr.push_back(
             CGM.getContext().toCharUnitsFromBits(fieldOffset).getQuantity());
diff --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index 7b60e94555d060..af58ccdb5b35d2 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -337,9 +337,6 @@ class CGDebugInfo {
                                           llvm::DIScope *RecordTy,
                                           const RecordDecl *RD);
 
-  /// Create type for binding declarations.
-  llvm::DIType *CreateBindingDeclType(const BindingDecl *BD);
-
   /// Create an anonnymous zero-size separator for bit-field-decl if needed on
   /// the target.
   llvm::DIDerivedType *createBitFieldSeparatorIfNeeded(
diff --git a/clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp b/clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp
index d9f5e3eacac37d..0cecf74c4bc6f9 100644
--- a/clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp
+++ b/clang/test/CodeGenCXX/debug-info-structured-binding-bitfield.cpp
@@ -8,8 +8,8 @@ struct S0 {
 // CHECK-LABEL: define dso_local void @_Z3fS0v
 // CHECK:                        alloca %struct.S0, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S0, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 16))
 //
 void fS0() {
   S0 s0;
@@ -24,8 +24,8 @@ struct S1 {
 // CHECK-LABEL: define dso_local void @_Z3fS1v
 // CHECK:                        alloca %struct.S1, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S1, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S1_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 16))
 //
 void fS1() {
   S1 s1;
@@ -40,8 +40,8 @@ struct S2 {
 // CHECK-LABEL: define dso_local void @_Z3fS2v
 // CHECK:                        alloca %struct.S2, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S2, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S2_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S2_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S2_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S2_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 8))
 //
 void fS2() {
   S2 s2;
@@ -56,8 +56,8 @@ struct S3 {
 // CHECK-LABEL: define dso_local void @_Z3fS3v
 // CHECK:                        alloca %struct.S3, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S3, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S3_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 8))
 //
 void fS3() {
   S3 s3;
@@ -72,8 +72,8 @@ struct S4 {
 // CHECK-LABEL: define dso_local void @_Z3fS4v
 // CHECK:                        alloca %struct.S4, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S4, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S4_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 8))
 //
 void fS4() {
   S4 s4;
@@ -88,8 +88,8 @@ struct S5 {
 // CHECK-LABEL: define dso_local void @_Z3fS5v
 // CHECK:                        alloca %struct.S5, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S5, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S5_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 8))
 //
 void fS5() {
   S5 s5;
@@ -104,8 +104,8 @@ struct S6 {
 // CHECK-LABEL: define dso_local void @_Z3fS6v
 // CHECK:                        alloca %struct.S6, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S6, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S6_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 16))
 //
 void fS6() {
   S6 s6;
@@ -120,8 +120,8 @@ struct S7 {
 // CHECK-LABEL: define dso_local void @_Z3fS7v
 // CHECK:                        alloca %struct.S7, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S7, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S7_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S7_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S7_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S7_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 16))
 //
 void fS7() {
   S7 s7;
@@ -136,8 +136,8 @@ struct S8 {
 // CHECK-LABEL: define dso_local void @_Z3fS8v
 // CHECK:                        alloca %struct.S8, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S8, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S8_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S8_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S8_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S8_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 16))
 //
 void fS8() {
   S8 s8;
@@ -152,8 +152,8 @@ struct S9 {
 // CHECK-LABEL: define dso_local void @_Z3fS9v
 // CHECK:                        alloca %struct.S9, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S9, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S9_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S9_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 4))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S9_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S9_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 32, 32))
 //
 void fS9() {
   S9 s9;
@@ -167,8 +167,8 @@ struct S10 {
 // CHECK-LABEL: define dso_local void @_Z4fS10v
 // CHECK:                        alloca %struct.S10, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S10, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S10_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S10_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 1))
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S10_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S10_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 8, 8))
 //
   S10() : x(0), y(0) {}
 };
@@ -178,9 +178,6 @@ void fS10() {
   auto [a, b] = s10;
 }
 
-// It's currently not possible to produce complete debug information for the following cases.
-// Confirm that no wrong debug info is output.
-// Once this is implemented, these tests should be amended.
 struct S11 {
   unsigned int x : 15;
   unsigned int y : 16;
@@ -189,7 +186,8 @@ struct S11 {
 // CHECK-LABEL: define dso_local void @_Z4fS11v
 // CHECK:                        alloca %struct.S11, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S11, align 4
-// CHECK-NOT:     call void @llvm.dbg.declare(metadata ptr [[TMP0]]
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S11_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 15, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S11_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 15))
 //
 void fS11() {
   S11 s11;
@@ -204,8 +202,8 @@ struct S12 {
 // CHECK-LABEL: define dso_local void @_Z4fS12v
 // CHECK:                        alloca %struct.S12, align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S12, align 4
-// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S12_A:![0-9]+]], metadata !DIExpression())
-// CHECK-NOT:     call void @llvm.dbg.declare(metadata ptr [[TMP0]]
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S12_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S12_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 17, 32))
 //
 void fS12() {
   S12 s12;
@@ -220,7 +218,8 @@ struct __attribute__((packed)) S13 {
 // CHECK-LABEL: define dso_local void @_Z4fS13v
 // CHECK:                        alloca %struct.S13, align 1
 // CHECK-NEXT:    [[TMP0:%.*]] = alloca %struct.S13, align 1
-// CHECK-NOT:     call void @llvm.dbg.declare(metadata ptr [[TMP0]]
+// CHECK:         call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S13_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 15, 0))
+// CHECK-NEXT:    call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S13_B:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 15))
 //
 void fS13() {
   S13 s13;
@@ -228,55 +227,50 @@ void fS13() {
 }
 
 // CHECK: [[UINT_TY:![0-9]+]] = !DIBasicType(name: "unsigned int", size: 32, encoding: DW_ATE_unsigned)
-// CHECK: [[S0_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[USHORT_TY:![0-9]+]])
-// CHECK: [[USHORT_TY]] = !DIBasicType(name: "unsigned short", size: 16, encoding: DW_ATE_unsigned)
-// CHECK: [[S0_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[USHORT_TY]])
+// CHECK: [[S0_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
+// CHECK: [[S0_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
 
 // CHECK: [[VOLATILE_UINT_TY:![0-9]+]] = !DIDerivedType(tag: DW_TAG_volatile_type, baseType: [[UINT_TY]])
-// CHECK: [[S1_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_USHORT_TY:![0-9]+]])
-// CHECK: [[VOLATILE_USHORT_TY]] = !DIDerivedType(tag: DW_TAG_volatile_type, baseType: [[USHORT_TY]])
-// CHECK: [[S1_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_USHORT_TY]])
+// CHECK: [[S1_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY:![0-9]+]])
+// CHECK: [[S1_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY]])
 
-// CHECK: [[S2_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UCHAR_TY:![0-9]+]])
-// CHECK: [[UCHAR_TY]] = !DIBasicType(name: "unsigned char", size: 8, encoding: DW_ATE_unsigned_char)
-// CHECK: [[S2_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UCHAR_TY]])
+// CHECK: [[S2_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
+// CHECK: [[S2_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
 
-// CHECK: [[S3_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UCHAR_TY:![0-9]+]])
-// CHECK: [[VOLATILE_UCHAR_TY]] = !DIDerivedType(tag: DW_TAG_volatile_type, baseType: [[UCHAR_TY]])
-// CHECK: [[S3_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UCHAR_TY]])
+// CHECK: [[S3_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY]])
+// CHECK: [[S3_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY]])
 
-// CHECK: [[S4_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UCHAR_TY]])
-// CHECK: [[S4_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[USHORT_TY]])
+// CHECK: [[S4_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
+// CHECK: [[S4_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
 
-// CHECK: [[S5_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UCHAR_TY]])
-// CHECK: [[S5_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_USHORT_TY]])
+// CHECK: [[S5_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY]])
+// CHECK: [[S5_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY]])
 
-// CHECK: [[S6_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[USHORT_TY]])
-// CHECK: [[S6_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UCHAR_TY]])
+// CHECK: [[S6_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
+// CHECK: [[S6_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[UINT_TY]])
 
-// CHECK: [[S7_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_USHORT_TY]])
-// CHECK: [[S7_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UCHAR_TY]])
+// CHECK: [[S7_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY]])
+// CHECK: [[S7_B]] = !DILocalVariable(name: "b", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[VOLATILE_UINT_TY]])
 
-// CHECK: [[S8_A]] = !DILocalVariable(name: "a", scope: {{.*}}, file: {{.*}}, line: {{.*}}, type: [[USHOR...
[truncated]

@@ -8,8 +8,8 @@ struct S0 {
// CHECK-LABEL: define dso_local void @_Z3fS0v
// CHECK: alloca %struct.S0, align 4
// CHECK-NEXT: [[TMP0:%.*]] = alloca %struct.S0, align 4
// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_A:![0-9]+]], metadata !DIExpression())
// CHECK-NEXT: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_B:![0-9]+]], metadata !DIExpression(DW_OP_plus_uconst, 2))
// CHECK: call void @llvm.dbg.declare(metadata ptr [[TMP0]], metadata [[S0_A:![0-9]+]], metadata !DIExpression(DW_OP_bit_piece, 16, 0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally don't use DW_OP_bit_piece directly in LLVM IR. The correct thing to do is to use DW_OP_LLVM_fragment instead, which then gets lowered to a bit_piece in AsmPrinter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LLVM langref says "Note that contrary to DW_OP_bit_piece, the offset is describing the location within the described source variable". What I think this means is that while DW_OP_bit_piece means "you get the source variable by extracting these bits of this location", DW_OP_LLVM_fragment means "the value of this location is these bits of the source variable".

Trying out DW_OP_LLVM_fragment in

struct {
  unsigned int a : 8;
  unsigned int b : 8;
} X = { 1, 2 };

int main() {
  auto [a, b] = X;
  return a + b;
}

the resulting dwarf info looks pretty strange

 <2><7f>: Abbrev Number: 7 (DW_TAG_variable)
    <80>   DW_AT_location    : 4 byte block: 93 1 91 78         (DW_OP_piece: 1; DW_OP_fbreg: -8)
    <85>   DW_AT_name        : (indirect string, offset: 0x9c): a
    <89>   DW_AT_decl_file   : 1
    <8a>   DW_AT_decl_line   : 7
    <8b>   DW_AT_type        : <0x5f>
 <2><8f>: Abbrev Number: 7 (DW_TAG_variable)
    <90>   DW_AT_location    : 6 byte block: 93 1 91 78 93 1    (DW_OP_piece: 1; DW_OP_fbreg: -8; DW_OP_piece: 1)
    <97>   DW_AT_name        : (indirect string, offset: 0xab): b
    <9b>   DW_AT_decl_file   : 1
    <9c>   DW_AT_decl_line   : 7
    <9d>   DW_AT_type        : <0x5f>

lldb says (at line 8)

(lldb) p a
(unsigned int) 513
(lldb) p b
(unsigned int) 256

gdb appears to not like this dwarf info and just says

(gdb) p a
$1 = <optimized out>
(gdb) p b
$2 = <optimized out>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://dwarfstd.org/issues/191025.1.html
https://dwarfstd.org/issues/161206.2.html

I think you are right about the semantics of DW_OP_bit_piece.
I also think I misunderstood the problem the first time around.

Here a and b are not fragments of a large struct, but we want to express the opposite. They point into the same alloca. Here's incorrect ToT output:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  %0 = alloca %struct.anon, align 4
  store i32 0, ptr %retval, align 4
  tail call void @llvm.dbg.declare(metadata ptr %0, metadata !23, metadata !DIExpression()), !dbg !25
  tail call void @llvm.dbg.declare(metadata ptr %0, metadata !26, metadata !DIExpression(DW_OP_plus_uconst, 1)), !dbg !27

...
!23 = !DILocalVariable(name: "a", scope: !18, file: !5, line: 2, type: !24)
!26 = !DILocalVariable(name: "b", scope: !18, file: !5, line: 2, type: !24)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we could add DW_OP_bit_piece to LLVM, or we could emit

!DIExpression()
!DIExpression(DW_OP_shr 8)

Which, I believe, would be supported by LLVM today without any modifications?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of which variant we choose, we probably need to add explicit support for it to SROA.cpp to make sure it gets translated correctly when the alloca gets split up again.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SROA support doesn't need to hold up this patch of course, I'm just mentioning it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we could add DW_OP_bit_piece to LLVM, or we could emit

!DIExpression()
!DIExpression(DW_OP_shr 8)

Which, I believe, would be supported by LLVM today without any modifications?

That would handle the bitfield offset, but wouldn't get the bitfield size right (we could use the existing CreateBindingDeclType but that can only handle cases where the bitfield size exactly matches an integer type size).

Regardless of which variant we choose, we probably need to add explicit support for it to SROA.cpp to make sure it gets translated correctly when the alloca gets split up again.

I'm currently working on this, should have a patch for it soon (though it looks like we only need changes in DIExpression::createFragmentExpression).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/john-brawn-arm/llvm-project/tree/sroa_bit_piece makes sroa work with bit_piece, I'll create a pull requeset once this on is merged.

@john-brawn-arm
Copy link
Collaborator Author

Ping.

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LLVM changes don't have a test. I expect there are existing expression-to-final-DWARF tests that you can modify to throw in a few bit-piece operators and show they come out correctly.

Offhand it seems quite reasonable to use DW_OP_bit_piece for bitfields.

@jryans
Copy link
Member

jryans commented May 8, 2024

It would be great to add a few notes to document this additional op in the DIExpression list of ops. It would be especially good for those added docs to clarify the difference between this and DW_OP_LLVM_fragment. (I do understand that this DW_OP_bit_piece op is part of standard DWARF, and there are already some standard DWARF ops we use that are missing from those docs already, but even with that in mind ... it would be great to have new docs added for this additional op.)

found in the sequence of bits at the given size and offset (``8`` and ``16``
here, respectively) within the top of the expression stack. Note that the
offset and size are the opposite way around to ``DW_OP_LLVM_fragment``, and the
offset is within the LLVM variable (if that's at the top of the stack).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that is missing right now is discussion whether DW_OP_bit_piece can be used inside of a DW_OP_LLVM_fragment expression.

  • If a DW_OP_bit_piece generates a fragment on its own, then combining the two should be made illegal in the verifier, and then it needs to be handled everywhere we handle DW_OP_LLVM_fragment.
  • If the two can be combined then DWARFExpression needs handle that case, and we probably need a few more testcases for all the edge cases.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we need more testcases for the edge cases regardless of which way we decide.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already a verifier check that DW_OP_LLVM_fragment is the last expression in the list, so I've made DW_OP_bit_piece do the same.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that effectively implements option (1) that I outlined above. I think this is a reasonable choice and consistent with how DWARF uses them. So if we do this (and I think that's fine) we need make sure that the bit pieces are correctly reported by DIExpression::getFragmentInfo() or else they will get incorrectly treated by every pass that needs to determine overlapping fragments.
At the very least there should be tests that have LLVM IR that mix & matches DW_OP_bit_piece and DW_OP_LLVM_fragment for the same variable and confirm that this gets lowered into valid DWARF.
Alternatively, if you want to make quick progress, you can also say that we don't support this for now and actively reject mixing the two within the same variable in the Verifyer. The you'd need to make sure that passes like SROA that introduce new fragments deal with this gracefully by creating undef locations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that effectively implements option (1) that I outlined above. I think this is a reasonable choice and consistent with how DWARF uses them. So if we do this (and I think that's fine) we need make sure that the bit pieces are correctly reported by DIExpression::getFragmentInfo() or else they will get incorrectly treated by every pass that needs to determine overlapping fragments.

I'm handling this in the next part, which is SROA support (https://github.com/john-brawn-arm/llvm-project/tree/sroa_bit_piece), though it turns out in most of the places that getFragmentInfo is used it actually is specific to DW_OP_LLVM_fragment and does the wrong thing for DW_OP_bit_piece, so I think adding a separate getBitPieceInfo is simpler.

At the very least there should be tests that have LLVM IR that mix & matches DW_OP_bit_piece and DW_OP_LLVM_fragment for the same variable and confirm that this gets lowered into valid DWARF. Alternatively, if you want to make quick progress, you can also say that we don't support this for now and actively reject mixing the two within the same variable in the Verifyer. The you'd need to make sure that passes like SROA that introduce new fragments deal with this gracefully by creating undef locations.

Currently if the same variable has a DW_OP_LLVM_fragment and anything else this causes assertion failures in DebugLocEntry::finalize in llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp.

Copy link
Collaborator

@adrian-prantl adrian-prantl May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it turns out in most of the places that getFragmentInfo is used it actually is specific to DW_OP_LLVM_fragment and does the wrong thing for DW_OP_bit_piece, so I think adding a separate getBitPieceInfo is simpler.

I am mostly worried about the places that want to answer questions like DIExpression::fragmentsOverlap() in order to determine whether a fragment (or bit_piece) should replace or augment another dbg.value or DBG_VALUE for the same variable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DW_OP_bit_piece is like every other standard dwarf expression: if we see a new location for a variable then the variable is now at that location. The current behaviour for DIExpression::fragmentsOverlap() is that it returns true when the expression isn't a fragment, and that's what we want for DW_OP_bit_piece.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DW_OP_bit_piece is like every other standard dwarf expression

Not it's not. It's a composite location description delimiter and cannot appear inside a DWARF expression. (See DWARF 5 section 2.6.1.2). That's why it's important that all LLVM algorithms that need to reason about fragments (the LLVM equivalent of composite location descriptions) are aware of them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, I hadn't noticed that (I'd just read the description of DW_OP_bit_piece and hadn't read the general description of composite location descriptions). I had been understanding something like "DW_OP_reg0 RAX, DW_OP_bit_piece 0x3 0x0" to mean "this variable can be found in the 3 bits at offset 0 in RAX", but it actually means something closer to "the bottom 3 bits of this variable can be found in the 3 bits at offset 0 in RAX".

I think this doesn't make much difference though if we restrict DW_OP_bit_piece in the llvm.dbg intrinsics to use it only for its ability to describe a value stored in part of a register, with the other bits not being described, as I think that's what the behaviour I've currently implemented is. Though maybe it would be better to instead use DW_OP_and and DW_OP_shr to extract the bits from the register. I'll think about this some more.

@john-brawn-arm
Copy link
Collaborator Author

After doing some more testing it turns out that DW_OP_bit_piece actually doesn't work when we have a signed bitfield, as both gdb and lldb treat the upper bits as zero. I think instead we have to do a sequence of operations to extract the relevant bits from the value. I've created #93990 for adding a DW_OP for doing this, with patches for using it in clang and making optimisations handle it on the way.

@john-brawn-arm john-brawn-arm deleted the bitfield_debug_info branch July 16, 2024 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category debuginfo llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants