Skip to content

[HLSL] Codegen for cbuffer declarations without embedded arrays or structs #119755

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

hekota
Copy link
Member

@hekota hekota commented Dec 12, 2024

Introduces translations of cbuffer to LLVM target type target("dx.CBuffer", ...) . At this point only cbuffers containing scalars and vectors are supported as we are still working on the target type design for the arrays and structs.

The current target type includes a type specifying the shape of the cbuffer declaration, the size, and list of offsets for each constant in the constant buffer layout. Note that this format might change as we work on adding support for arrays and structs.

Fixes #118531
Fixes #106596

@@ -164,18 +164,18 @@ Decl *SemaHLSL::ActOnStartBuffer(Scope *BufferScope, bool CBuffer,
return Result;
}

// Calculate the size of a legacy cbuffer type based on
// Calculate the size of a legacy cbuffer type in bytes based on
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cbuffer & packoffset layout calculations have been using 3 different units - bits, bytes, and 4-byte chunks. I am changing all to use bytes.

@@ -54,69 +54,110 @@ void addDxilValVersion(StringRef ValVersionStr, llvm::Module &M) {
auto *DXILValMD = M.getOrInsertNamedMetadata(DXILValKey);
DXILValMD->addOperand(Val);
}

void addDisableOptimizations(llvm::Module &M) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional white space change to separate the functions.

}

// CHECK: @[[TB:.+]] = external constant { float, double }
tbuffer A : register(t2, space1) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing tbuffer codegen tests as it is not supported yet (although it will probably be very similar to cbuffer).

@hekota hekota marked this pull request as ready for review December 12, 2024 20:38
@hekota hekota requested a review from bogner December 12, 2024 20:38
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. HLSL HLSL Language Support labels Dec 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 12, 2024

@llvm/pr-subscribers-backend-directx
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-clang

Author: Helena Kotas (hekota)

Changes

Introduces translations of cbuffer to LLVM target type target("dx.CBuffer", ...) . At this point only cbuffers containing scalars and vectors are supported as we are still working on the target type design for the arrays and structs.

The current target type includes a type specifying the shape of the cbuffer declaration, the size, and list of offsets for each constant in the constant buffer layout. Note that this format might change as we work on adding support for arrays and structs.

Fixes #118531
Fixes #106596


Patch is 36.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119755.diff

15 Files Affected:

  • (modified) clang/include/clang/Basic/Attr.td (+3-3)
  • (modified) clang/lib/CodeGen/CGDeclCXX.cpp (+4-8)
  • (modified) clang/lib/CodeGen/CGHLSLRuntime.cpp (+155-94)
  • (modified) clang/lib/CodeGen/CGHLSLRuntime.h (+34-11)
  • (modified) clang/lib/Sema/SemaHLSL.cpp (+8-7)
  • (modified) clang/test/CodeGenHLSL/builtins/ByteAddressBuffers-constructors.hlsl (+4-4)
  • (modified) clang/test/CodeGenHLSL/builtins/RWBuffer-constructor.hlsl (+5-5)
  • (modified) clang/test/CodeGenHLSL/builtins/StructuredBuffers-constructors.hlsl (+4-4)
  • (removed) clang/test/CodeGenHLSL/cbuf.hlsl (-26)
  • (removed) clang/test/CodeGenHLSL/cbuf_in_namespace.hlsl (-23)
  • (added) clang/test/CodeGenHLSL/cbuffer.hlsl (+67)
  • (added) clang/test/CodeGenHLSL/cbuffer_and_namespaces.hlsl (+47)
  • (added) clang/test/CodeGenHLSL/cbuffer_with_packoffset.hlsl (+33)
  • (added) clang/test/CodeGenHLSL/cbuffer_with_static_global_and_function.hlsl (+28)
  • (removed) clang/test/CodeGenHLSL/static_global_and_function_in_cb.hlsl (-17)
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 90d2a2056fe1ba..f615f45a9524d8 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -4711,9 +4711,9 @@ def HLSLPackOffset: HLSLAnnotationAttr {
   let Args = [IntArgument<"Subcomponent">, IntArgument<"Component">];
   let Documentation = [HLSLPackOffsetDocs];
   let AdditionalMembers = [{
-      unsigned getOffset() {
-        return subcomponent * 4 + component;
-      }
+  unsigned getOffsetInBytes() {
+    return subcomponent * 16 + component * 4;
+  }
   }];
 }
 
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index 2c3054605ee754..2869150a5e6648 100644
--- a/clang/lib/CodeGen/CGDeclCXX.cpp
+++ b/clang/lib/CodeGen/CGDeclCXX.cpp
@@ -886,6 +886,10 @@ CodeGenModule::EmitCXXGlobalInitFunc() {
       ModuleInits.push_back(Fn);
     }
 
+  if (getLangOpts().HLSL && getHLSLRuntime().needsResourceBindingInitFn()) {
+    CXXGlobalInits.push_back(getHLSLRuntime().createResourceBindingInitFn());
+  }
+
   if (ModuleInits.empty() && CXXGlobalInits.empty() &&
       PrioritizedCXXGlobalInits.empty())
     return;
@@ -1127,14 +1131,6 @@ CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
       if (Decls[i])
         EmitRuntimeCall(Decls[i]);
 
-    if (getLangOpts().HLSL) {
-      CGHLSLRuntime &CGHLSL = CGM.getHLSLRuntime();
-      if (CGHLSL.needsResourceBindingInitFn()) {
-        llvm::Function *ResInitFn = CGHLSL.createResourceBindingInitFn();
-        Builder.CreateCall(llvm::FunctionCallee(ResInitFn), {});
-      }
-    }
-
     Scope.ForceCleanup();
 
     if (ExitBlock) {
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index fb15b1993e74ad..cfa77ccd1054d6 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -54,69 +54,110 @@ void addDxilValVersion(StringRef ValVersionStr, llvm::Module &M) {
   auto *DXILValMD = M.getOrInsertNamedMetadata(DXILValKey);
   DXILValMD->addOperand(Val);
 }
+
 void addDisableOptimizations(llvm::Module &M) {
   StringRef Key = "dx.disable_optimizations";
   M.addModuleFlag(llvm::Module::ModFlagBehavior::Override, Key, 1);
 }
-// cbuffer will be translated into global variable in special address space.
-// If translate into C,
-// cbuffer A {
-//   float a;
-//   float b;
-// }
-// float foo() { return a + b; }
-//
-// will be translated into
-//
-// struct A {
-//   float a;
-//   float b;
-// } cbuffer_A __attribute__((address_space(4)));
-// float foo() { return cbuffer_A.a + cbuffer_A.b; }
-//
-// layoutBuffer will create the struct A type.
-// replaceBuffer will replace use of global variable a and b with cbuffer_A.a
-// and cbuffer_A.b.
-//
-void layoutBuffer(CGHLSLRuntime::Buffer &Buf, const DataLayout &DL) {
-  if (Buf.Constants.empty())
-    return;
+
+// Creates the LLVM struct type representing the shape of the constant buffer,
+// which will be included in the LLVM target type, and calculates the memory
+// layout and constant buffer layout offsets of each constant.
+static void layoutBuffer(CGHLSLRuntime::Buffer &Buf, const DataLayout &DL) {
+  assert(!Buf.Constants.empty() &&
+         "empty constant buffer should not be created");
 
   std::vector<llvm::Type *> EltTys;
-  for (auto &Const : Buf.Constants) {
-    GlobalVariable *GV = Const.first;
-    Const.second = EltTys.size();
+  unsigned MemOffset = 0, CBufOffset = 0, Size = 0;
+
+  for (auto &C : Buf.Constants) {
+    GlobalVariable *GV = C.GlobalVar;
     llvm::Type *Ty = GV->getValueType();
+
+    assert(!Ty->isArrayTy() && !Ty->isStructTy() &&
+           "arrays and structs in cbuffer are not yet implemened");
+
+    // scalar type, vector or matrix
     EltTys.emplace_back(Ty);
+    unsigned FieldSize = Ty->getScalarSizeInBits() / 8;
+    if (Ty->isVectorTy())
+      FieldSize *= cast<FixedVectorType>(Ty)->getNumElements();
+    assert(FieldSize <= 16 && "field side larger than constant buffer row");
+
+    // set memory layout offset (no padding)
+    C.MemOffset = MemOffset;
+    MemOffset += FieldSize;
+
+    // calculate cbuffer layout offset or update total cbuffer size from
+    // packoffset annotations
+    if (Buf.HasPackoffset) {
+      assert(C.CBufferOffset != UINT_MAX &&
+             "cbuffer offset should have been set from packoffset attribute");
+      unsigned OffsetAfterField = C.CBufferOffset + FieldSize;
+      if (Size < OffsetAfterField)
+        Size = OffsetAfterField;
+    } else {
+      // allign to the size of the field
+      CBufOffset = llvm::alignTo(CBufOffset, FieldSize);
+      C.CBufferOffset = CBufOffset;
+      CBufOffset += FieldSize;
+      Size = CBufOffset;
+    }
   }
   Buf.LayoutStruct = llvm::StructType::get(EltTys[0]->getContext(), EltTys);
+  Buf.Size = Size;
 }
 
-GlobalVariable *replaceBuffer(CGHLSLRuntime::Buffer &Buf) {
-  // Create global variable for CB.
-  GlobalVariable *CBGV = new GlobalVariable(
-      Buf.LayoutStruct, /*isConstant*/ true,
-      GlobalValue::LinkageTypes::ExternalLinkage, nullptr,
-      llvm::formatv("{0}{1}", Buf.Name, Buf.IsCBuffer ? ".cb." : ".tb."),
-      GlobalValue::NotThreadLocal);
+// Creates LLVM target type target("dx.CBuffer",..) for the constant buffer.
+// The target type includes the LLVM struct type representing the shape
+// of the constant buffer, size, and a list of offsets for each fields
+// in cbuffer layout.
+static llvm::Type *getBufferTargetType(LLVMContext &Ctx,
+                                       CGHLSLRuntime::Buffer &Buf) {
+  assert(Buf.LayoutStruct != nullptr && Buf.Size != UINT_MAX &&
+         "the buffer layout has not been calculated yet");
+  llvm::SmallVector<unsigned> SizeAndOffsets;
+  SizeAndOffsets.reserve(Buf.Constants.size() + 1);
+  SizeAndOffsets.push_back(Buf.Size);
+  for (CGHLSLRuntime::BufferConstant &C : Buf.Constants) {
+    SizeAndOffsets.push_back(C.CBufferOffset);
+  }
+  return llvm::TargetExtType::get(Ctx, "dx.CBuffer", {Buf.LayoutStruct},
+                                  SizeAndOffsets);
+}
 
-  IRBuilder<> B(CBGV->getContext());
-  Value *ZeroIdx = B.getInt32(0);
-  // Replace Const use with CB use.
-  for (auto &[GV, Offset] : Buf.Constants) {
-    Value *GEP =
-        B.CreateGEP(Buf.LayoutStruct, CBGV, {ZeroIdx, B.getInt32(Offset)});
-
-    assert(Buf.LayoutStruct->getElementType(Offset) == GV->getValueType() &&
-           "constant type mismatch");
-
-    // Replace.
-    GV->replaceAllUsesWith(GEP);
-    // Erase GV.
-    GV->removeDeadConstantUsers();
-    GV->eraseFromParent();
+// Replaces all uses of the temporary constant buffer global variables with
+// buffer access intrinsic resource.getpointer.
+static void replaceBufferGlobals(CodeGenModule &CGM,
+                                 CGHLSLRuntime::Buffer &Buf) {
+  assert(Buf.IsCBuffer && "tbuffer codegen is not yet supported");
+
+  GlobalVariable *BufGV = Buf.GlobalVar;
+  for (auto &Constant : Buf.Constants) {
+    GlobalVariable *ConstGV = Constant.GlobalVar;
+
+    // TODO: Map to an hlsl_device address space.
+    llvm::Type *RetTy = ConstGV->getType();
+    llvm::Type *TargetTy = BufGV->getValueType();
+
+    // Replace all uses of GV with CBuffer access
+    while (ConstGV->use_begin() != ConstGV->use_end()) {
+      Use &U = *ConstGV->use_begin();
+      if (Instruction *UserInstr = dyn_cast<Instruction>(U.getUser())) {
+        IRBuilder<> Builder(UserInstr);
+        Value *Handle = Builder.CreateLoad(TargetTy, BufGV);
+        Value *ResGetPointer = Builder.CreateIntrinsic(
+            RetTy, Intrinsic::dx_resource_getpointer,
+            ArrayRef<llvm::Value *>{Handle,
+                                    Builder.getInt32(Constant.MemOffset)});
+        U.set(ResGetPointer);
+      } else {
+        llvm_unreachable("unexpected use of constant value");
+      }
+    }
+    ConstGV->removeDeadConstantUsers();
+    ConstGV->eraseFromParent();
   }
-  return CBGV;
 }
 
 } // namespace
@@ -143,6 +184,9 @@ void CGHLSLRuntime::addConstant(VarDecl *D, Buffer &CB) {
     return;
   }
 
+  assert(!D->getType()->isArrayType() && !D->getType()->isStructureType() &&
+         "codegen for arrays and structs in cbuffer is not yet supported");
+
   auto *GV = cast<GlobalVariable>(CGM.GetAddrOfGlobalVar(D));
   // Add debug info for constVal.
   if (CGDebugInfo *DI = CGM.getModuleDebugInfo())
@@ -150,13 +194,12 @@ void CGHLSLRuntime::addConstant(VarDecl *D, Buffer &CB) {
         codegenoptions::DebugInfoKind::LimitedDebugInfo)
       DI->EmitGlobalVariable(cast<GlobalVariable>(GV), D);
 
-  // FIXME: support packoffset.
-  // See https://github.com/llvm/llvm-project/issues/57914.
-  uint32_t Offset = 0;
-  bool HasUserOffset = false;
+  CB.Constants.emplace_back(GV);
 
-  unsigned LowerBound = HasUserOffset ? Offset : UINT_MAX;
-  CB.Constants.emplace_back(std::make_pair(GV, LowerBound));
+  if (HLSLPackOffsetAttr *PO = D->getAttr<HLSLPackOffsetAttr>()) {
+    CB.HasPackoffset = true;
+    CB.Constants.back().CBufferOffset = PO->getOffsetInBytes();
+  }
 }
 
 void CGHLSLRuntime::addBufferDecls(const DeclContext *DC, Buffer &CB) {
@@ -173,9 +216,37 @@ void CGHLSLRuntime::addBufferDecls(const DeclContext *DC, Buffer &CB) {
   }
 }
 
+// Creates temporary global variables for all declarations within the constant
+// buffer context, calculates the buffer layouts, and then creates a global
+// variable for the constant buffer and adds it to the module.
+// All uses of the temporary constant globals will be replaced with buffer
+// access intrinsic resource.getpointer in CGHLSLRuntime::finishCodeGen.
+// Later on in DXILResourceAccess pass these will be transtaled
+// to dx.op.cbufferLoadLegacy instructions.
 void CGHLSLRuntime::addBuffer(const HLSLBufferDecl *D) {
-  Buffers.emplace_back(Buffer(D));
-  addBufferDecls(D, Buffers.back());
+  llvm::Module &M = CGM.getModule();
+  const DataLayout &DL = M.getDataLayout();
+
+  assert(D->isCBuffer() && "tbuffer codegen is not supported yet");
+
+  Buffer &Buf = Buffers.emplace_back(D);
+  addBufferDecls(D, Buf);
+  if (Buf.Constants.empty()) {
+    // empty constant buffer - do not add to globals
+    Buffers.pop_back();
+    return;
+  }
+  layoutBuffer(Buf, DL);
+
+  // Create global variable for CB.
+  llvm::Type *TargetTy = getBufferTargetType(CGM.getLLVMContext(), Buf);
+  Buf.GlobalVar = new GlobalVariable(
+      TargetTy, /*isConstant*/ true, GlobalValue::LinkageTypes::ExternalLinkage,
+      nullptr, llvm::formatv("{0}{1}", Buf.Name, Buf.IsCBuffer ? ".cb" : ".tb"),
+      GlobalValue::NotThreadLocal);
+
+  M.insertGlobalVariable(Buf.GlobalVar);
+  ResourcesToBind.emplace_back(Buf.Decl, Buf.GlobalVar);
 }
 
 void CGHLSLRuntime::finishCodeGen() {
@@ -189,26 +260,14 @@ void CGHLSLRuntime::finishCodeGen() {
   if (CGM.getCodeGenOpts().OptimizationLevel == 0)
     addDisableOptimizations(M);
 
-  const DataLayout &DL = M.getDataLayout();
-
   for (auto &Buf : Buffers) {
-    layoutBuffer(Buf, DL);
-    GlobalVariable *GV = replaceBuffer(Buf);
-    M.insertGlobalVariable(GV);
-    llvm::hlsl::ResourceClass RC = Buf.IsCBuffer
-                                       ? llvm::hlsl::ResourceClass::CBuffer
-                                       : llvm::hlsl::ResourceClass::SRV;
-    llvm::hlsl::ResourceKind RK = Buf.IsCBuffer
-                                      ? llvm::hlsl::ResourceKind::CBuffer
-                                      : llvm::hlsl::ResourceKind::TBuffer;
-    addBufferResourceAnnotation(GV, RC, RK, /*IsROV=*/false,
-                                llvm::hlsl::ElementType::Invalid, Buf.Binding);
+    replaceBufferGlobals(CGM, Buf);
   }
 }
 
 CGHLSLRuntime::Buffer::Buffer(const HLSLBufferDecl *D)
-    : Name(D->getName()), IsCBuffer(D->isCBuffer()),
-      Binding(D->getAttr<HLSLResourceBindingAttr>()) {}
+    : Name(D->getName()), IsCBuffer(D->isCBuffer()), HasPackoffset(false),
+      LayoutStruct(nullptr), Decl(D), GlobalVar(nullptr) {}
 
 void CGHLSLRuntime::addBufferResourceAnnotation(llvm::GlobalVariable *GV,
                                                 llvm::hlsl::ResourceClass RC,
@@ -237,7 +296,7 @@ void CGHLSLRuntime::addBufferResourceAnnotation(llvm::GlobalVariable *GV,
          "ResourceMD must have been set by the switch above.");
 
   llvm::hlsl::FrontendResource Res(
-      GV, RK, ET, IsROV, Binding.Reg.value_or(UINT_MAX), Binding.Space);
+      GV, RK, ET, IsROV, Binding.Slot.value_or(UINT_MAX), Binding.Space);
   ResourceMD->addOperand(Res.getMetadata());
 }
 
@@ -328,12 +387,8 @@ void CGHLSLRuntime::annotateHLSLResource(const VarDecl *D, GlobalVariable *GV) {
 CGHLSLRuntime::BufferResBinding::BufferResBinding(
     HLSLResourceBindingAttr *Binding) {
   if (Binding) {
-    llvm::APInt RegInt(64, 0);
-    Binding->getSlot().substr(1).getAsInteger(10, RegInt);
-    Reg = RegInt.getLimitedValue();
-    llvm::APInt SpaceInt(64, 0);
-    Binding->getSpace().substr(5).getAsInteger(10, SpaceInt);
-    Space = SpaceInt.getLimitedValue();
+    Slot = Binding->getSlotNumber();
+    Space = Binding->getSpaceNumber();
   } else {
     Space = 0;
   }
@@ -572,24 +627,30 @@ llvm::Function *CGHLSLRuntime::createResourceBindingInitFn() {
   const DataLayout &DL = CGM.getModule().getDataLayout();
   Builder.SetInsertPoint(EntryBB);
 
-  for (const auto &[VD, GV] : ResourcesToBind) {
-    for (Attr *A : VD->getAttrs()) {
+  for (const auto &[Decl, GV] : ResourcesToBind) {
+    for (Attr *A : Decl->getAttrs()) {
       HLSLResourceBindingAttr *RBA = dyn_cast<HLSLResourceBindingAttr>(A);
       if (!RBA)
         continue;
 
-      const HLSLAttributedResourceType *AttrResType =
-          HLSLAttributedResourceType::findHandleTypeOnResource(
-              VD->getType().getTypePtr());
-
-      // FIXME: Only simple declarations of resources are supported for now.
-      // Arrays of resources or resources in user defined classes are
-      // not implemented yet.
-      assert(AttrResType != nullptr &&
-             "Resource class must have a handle of HLSLAttributedResourceType");
-
-      llvm::Type *TargetTy =
-          CGM.getTargetCodeGenInfo().getHLSLType(CGM, AttrResType);
+      llvm::Type *TargetTy = nullptr;
+      if (const VarDecl *VD = dyn_cast<VarDecl>(Decl)) {
+        const HLSLAttributedResourceType *AttrResType =
+            HLSLAttributedResourceType::findHandleTypeOnResource(
+                VD->getType().getTypePtr());
+
+        // FIXME: Only simple declarations of resources are supported for now.
+        // Arrays of resources or resources in user defined classes are
+        // not implemented yet.
+        assert(
+            AttrResType != nullptr &&
+            "Resource class must have a handle of HLSLAttributedResourceType");
+
+        TargetTy = CGM.getTargetCodeGenInfo().getHLSLType(CGM, AttrResType);
+      } else {
+        assert(isa<HLSLBufferDecl>(Decl));
+        TargetTy = GV->getValueType();
+      }
       assert(TargetTy != nullptr &&
              "Failed to convert resource handle to target type");
 
@@ -604,7 +665,7 @@ llvm::Function *CGHLSLRuntime::createResourceBindingInitFn() {
 
       llvm::Value *CreateHandle = Builder.CreateIntrinsic(
           /*ReturnType=*/TargetTy, getCreateHandleFromBindingIntrinsic(), Args,
-          nullptr, Twine(VD->getName()).concat("_h"));
+          nullptr, Twine(Decl->getName()).concat("_h"));
 
       llvm::Value *HandleRef =
           Builder.CreateStructGEP(GV->getValueType(), GV, 0);
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index f9efb1bc996412..9c3bbf5bbfe190 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -15,6 +15,7 @@
 #ifndef LLVM_CLANG_LIB_CODEGEN_CGHLSLRUNTIME_H
 #define LLVM_CLANG_LIB_CODEGEN_CGHLSLRUNTIME_H
 
+#include "llvm/IR/GlobalVariable.h"
 #include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/IntrinsicsDirectX.h"
@@ -53,6 +54,7 @@ class StructType;
 } // namespace llvm
 
 namespace clang {
+class NamedDecl;
 class VarDecl;
 class ParmVarDecl;
 class HLSLBufferDecl;
@@ -112,22 +114,43 @@ class CGHLSLRuntime {
   //===----------------------------------------------------------------------===//
 
   struct BufferResBinding {
-    // The ID like 2 in register(b2, space1).
-    std::optional<unsigned> Reg;
-    // The Space like 1 is register(b2, space1).
-    // Default value is 0.
+    // Register slot
+    std::optional<unsigned> Slot;
+    // Register space; default value is 0.
     unsigned Space;
+
     BufferResBinding(HLSLResourceBindingAttr *Attr);
   };
+
+  struct BufferConstant {
+    llvm::GlobalVariable *GlobalVar;
+    // offset in memory layout (in bytes)
+    unsigned MemOffset;
+    // offset in cbuffer layout (in bytes)
+    unsigned CBufferOffset;
+
+    BufferConstant(llvm::GlobalVariable *GV)
+        : GlobalVar(GV), MemOffset(UINT_MAX), CBufferOffset(UINT_MAX) {}
+  };
+
   struct Buffer {
-    Buffer(const HLSLBufferDecl *D);
     llvm::StringRef Name;
-    // IsCBuffer - Whether the buffer is a cbuffer (and not a tbuffer).
+    // Whether the buffer is a cbuffer (and not a tbuffer).
     bool IsCBuffer;
-    BufferResBinding Binding;
-    // Global variable and offset for each constant.
-    std::vector<std::pair<llvm::GlobalVariable *, unsigned>> Constants;
-    llvm::StructType *LayoutStruct = nullptr;
+    // Whether the buffer has packoffset annotations
+    bool HasPackoffset;
+    // List of global constants with memory and cbuffer layout ofsets
+    std::vector<BufferConstant> Constants;
+    // LLVM layout type
+    llvm::StructType *LayoutStruct;
+    // size of the buffer in cbuffer layout
+    unsigned Size;
+    // reference to the AST Decl node with resource binding attribute
+    const HLSLBufferDecl *Decl;
+    // global variable for the constant buffer
+    llvm::GlobalVariable *GlobalVar;
+
+    Buffer(const HLSLBufferDecl *D);
   };
 
 protected:
@@ -169,7 +192,7 @@ class CGHLSLRuntime {
   llvm::Triple::ArchType getArch();
   llvm::SmallVector<Buffer> Buffers;
 
-  llvm::SmallVector<std::pair<const VarDecl *, llvm::GlobalVariable *>>
+  llvm::SmallVector<std::pair<const NamedDecl *, llvm::GlobalVariable *>>
       ResourcesToBind;
 };
 
diff --git a/clang/lib/Sema/SemaHLSL.cpp b/clang/lib/Sema/SemaHLSL.cpp
index 600c800029fd05..9b96806c35f3da 100644
--- a/clang/lib/Sema/SemaHLSL.cpp
+++ b/clang/lib/Sema/SemaHLSL.cpp
@@ -164,18 +164,18 @@ Decl *SemaHLSL::ActOnStartBuffer(Scope *BufferScope, bool CBuffer,
   return Result;
 }
 
-// Calculate the size of a legacy cbuffer type based on
+// Calculate the size of a legacy cbuffer type in bytes based on
 // https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-packing-rules
 static unsigned calculateLegacyCbufferSize(const ASTContext &Context,
                                            QualType T) {
   unsigned Size = 0;
-  constexpr unsigned CBufferAlign = 128;
+  constexpr unsigned CBufferAlign = 16;
   if (const RecordType *RT = T->getAs<RecordType>()) {
     const RecordDecl *RD = RT->getDecl();
     for (const FieldDecl *Field : RD->fields()) {
       QualType Ty = Field->getType();
       unsigned FieldSize = calculateLegacyCbufferSize(Context, Ty);
-      unsigned FieldAlign = 32;
+      unsigned FieldAlign = 4;
       if (Ty->isAggregateType())
         FieldAlign = CBufferAlign;
       Size = llvm::alignTo(Size, FieldAlign);
@@ -194,7 +194,7 @@ static unsigned calculateLegacyCbufferSize(const ASTContext &Context,
         calculateLegacyCbufferSize(Context, VT->getElementType());
     Size = ElementSize * ElementCount;
   } else {
-    Size = Context.getTypeSize(T);
+    Size = Context.getTypeSize(T) / 8;
   }
   return Size;
 }
@@ -229,16 +229,17 @@ void SemaHLSL::ActOnFinishBuffer(Decl *Dcl, SourceLocation RBrace) {
     std::sort(PackOffsetVec.begin(), PackOffsetVec.end(),
               [](const std::pair<VarDecl *, HLSLPackOffsetAttr *> &LHS,
                  const std::pair<VarDecl *, HLSLPackOffsetAttr *> &RHS) {
-                return LHS.second->getOffset() < RHS.second->getOffset();
+                return LHS.second->getOffsetInBytes() <
+                       RHS.second->getOffsetInBytes();
               });
 
     for (unsigned i = 0; i < PackOffsetVec.size() - 1; i++) {
       VarDecl *Var = PackOffsetVec[i].first;
       HLSLPackOffsetAttr *Attr = PackOffsetVec[i].second;
       unsigned Size = calculateLegacyCbufferS...
[truncated]

@hekota hekota changed the title [HLSL] Codegen for cbuffer blocks without embedded arrays or structs [HLSL] Codegen for cbuffer declarations without embedded arrays or structs Dec 12, 2024
if (const RecordType *RT = T->getAs<RecordType>()) {
const RecordDecl *RD = RT->getDecl();
for (const FieldDecl *Field : RD->fields()) {
QualType Ty = Field->getType();
unsigned FieldSize = calculateLegacyCbufferSize(Context, Ty);
unsigned FieldAlign = 32;
unsigned FieldAlign = 4;
if (Ty->isAggregateType())
Copy link
Member Author

@hekota hekota Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the correct alignment, it does not work for 16-bit and 64-bit types.
I have filed a bug here: #119641

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put a comment there and reference the issue just to make it easier to keep track?

@hekota hekota marked this pull request as draft December 18, 2024 00:09
@hekota hekota marked this pull request as ready for review December 19, 2024 23:32
if (const RecordType *RT = T->getAs<RecordType>()) {
const RecordDecl *RD = RT->getDecl();
for (const FieldDecl *Field : RD->fields()) {
QualType Ty = Field->getType();
unsigned FieldSize = calculateLegacyCbufferSize(Context, Ty);
unsigned FieldAlign = 32;
unsigned FieldAlign = 4;
if (Ty->isAggregateType())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put a comment there and reference the issue just to make it easier to keep track?

Comment on lines +893 to +895
if (getLangOpts().HLSL && getHLSLRuntime().needsResourceBindingInitFn()) {
CXXGlobalInits.push_back(getHLSLRuntime().createResourceBindingInitFn());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably haven't seen this, but I filed an issue yesterday about how we were connecting initializers in the wrong place (#120636).

This is definitely better. I do wonder if we should be generating one binding function for all bindings (which would require collecting them and generating them late), or if we should be generating an initializer per binding and appending them to the CXXGlobalInits list as we process each binding. While the later produces more functions that need to be inlined, I'm inclined to prefer it because it is simpler and aligns more with a traditional RAII-model for resource initialization.

The issue I filed is probably redundant since you're still reworking all of this.

CXXRecordDecl *StructDecl = CXXRecordDecl::Create(
BD->getASTContext(), TagDecl::TagKind::Class, BD->getDeclContext(),
BD->getLocation(), BD->getLocation(), BD->getIdentifier());
StructDecl->startDefinition();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This concerns me. We really shouldn't be generating AST types during CodeGen.

We should generate the struct decl during Sema and attach it to the buffer decl so that we can dump and inspect the AST.

++ConstIt, ++Index) {
GlobalVariable *ConstGV = *ConstIt;

// TODO: Map to an hlsl_device address space.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: Map to an hlsl_device address space.
// TODO: Map to an hlsl_device address space (#109877).

GV->eraseFromParent();
// Replaces all uses of the temporary constant buffer global variables with
// buffer access intrinsic resource.getpointer and GEP.
static void replaceBufferGlobals(CodeGenModule &CGM,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big no. This is falling into the mess that was CGHLSLMSFinishCodeGen in DXC.

We should not be manipulating IR in Clang CodeGen. We should be able to inspect and dump the post-codegen IR, and test every IR transformation independently.

@hekota hekota marked this pull request as draft December 20, 2024 22:36
@hekota hekota closed this Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:DirectX clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category HLSL HLSL Language Support
Projects
Archived in project
4 participants