Skip to content

[DataLayout] Introduce DataLayout::getPointerAddressSize(AS) #137412

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 21 additions & 8 deletions llvm/docs/LangRef.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3147,14 +3147,27 @@ as follows:
``A<address space>``
Specifies the address space of objects created by '``alloca``'.
Defaults to the default address space of 0.
``p[n]:<size>:<abi>[:<pref>][:<idx>]``
This specifies the *size* of a pointer and its ``<abi>`` and
``<pref>``\erred alignments for address space ``n``.
The fourth parameter ``<idx>`` is the size of the
index that used for address calculation, which must be less than or equal
to the pointer size. If not
specified, the default index size is equal to the pointer size. All sizes
are in bits. The address space, ``n``, is optional, and if not specified,
``p[n]:<size>:<abi>[:<pref>[:<idx>[:<addr>]]]``
This specifies the properties of a pointer in address space ``n``.
The ``<size>`` parameter specifies the size of the bitwise representation.
For :ref:`non-integral pointers <nointptrtype>` the representation size may
be larger than the address width of the underlying address space (e.g. to
accommodate additional metadata).
The alignment requirements are specified via the ``<abi>`` and
``<pref>``\erred alignments parameters.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to get too bikeshed-y on this, but I find the pointer size vs address size terminology confusing at a glance, and I worry that we'll end up with a lot of pieces in the code base inadvertently using the wrong one.

If you feel strongly that this is the best terminology, I won't hold it up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this could be confusing but I haven't been able to come up with a better name. Maybe pointer size should be "representation size"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally used pointer range in CHERI LLVM, but I wasn't happy with it because it's actually log2(pointer range). Address size is more correct, at least.

The fourth parameter ``<idx>`` is the size of the index that used for
address calculations such as :ref:`getelementptr <i_getelementptr>`.
It must be less than or equal to the pointer size. If not specified, the
default index size is equal to the pointer size.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be useful to explicitly document and enforce a constraint of the form size >= addr >= idx.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - If we ever end up supporting a target where that does not hold we can always adjust it again. E.g. some hypothetical target where addresses are 64 bits but the pointer representation is a 32-bit offset relative to some base address.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Isn't that the AMDGPU stack pointer in some of its implementations, even though you're not supposed to think of it like that? (Opinions have varied over the generations on if it's "32-bit offset from buffer resource the compiler creates" to "32-bit offset from implicit scratch base" and so on)? Though I'd need more consensus from AMDGPU forks on if we ever want to expose the lie

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine that within the stack address space you have 32-bit range and the 32-bit offset from the fixed base uniquely identifies the underlying address?
In that case I don't think you'd need to expose the underlying "real address" for ptrtoaddr. I'd view that as being similar to running a 32-bit program on 64-bit kernel where the virtual memory system translates those 32-bit addresses to a final 64 (well likely less) bit physical address.

I'm not sure if there is any advantage in exposing the 48-bit (?) final address, but if you think there is we can definitely relax this restriction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's much virtue to it, I'm just asking annoying questions

The fifth parameter ``<addr>`` specifies the width of addresses in this
address space. If not specified, the default address size is equal to the
index size. The address size may be wider than the index size as it could be
calculated relative to a base address. For example AMDGPU buffer fat
pointers use a 48-bit address range, but only allow for 32 bits of indexing.
All sizes are in bits.
The following relationship is assumed to be true:
``representation size >= address size >= index size``.
The address space, ``n``, is optional, and if not specified,
denotes the default address space 0. The value of ``n`` must be
in the range [1,2^24).
``i<size>:<abi>[:<pref>]``
Expand Down
29 changes: 26 additions & 3 deletions llvm/include/llvm/IR/DataLayout.h
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,7 @@ class DataLayout {
Align ABIAlign;
Align PrefAlign;
uint32_t IndexBitWidth;
uint32_t AddressBitWidth;
/// Pointers in this address space don't have a well-defined bitwise
/// representation (e.g. may be relocated by a copying garbage collector).
/// Additionally, they may also be non-integral (i.e. containing additional
Expand Down Expand Up @@ -148,7 +149,7 @@ class DataLayout {
/// Sets or updates the specification for pointer in the given address space.
void setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth, Align ABIAlign,
Align PrefAlign, uint32_t IndexBitWidth,
bool IsNonIntegral);
uint32_t AddressBitWidth, bool IsNonIntegral);

/// Internal helper to get alignment for integer of given bitwidth.
Align getIntegerAlignment(uint32_t BitWidth, bool abi_or_pref) const;
Expand Down Expand Up @@ -324,12 +325,26 @@ class DataLayout {
/// the backends/clients are updated.
Align getPointerPrefAlignment(unsigned AS = 0) const;

/// Layout pointer size in bytes, rounded up to a whole
/// number of bytes.
/// Layout pointer size in bytes, rounded up to a whole number of bytes. The
/// difference between this function and getPointerAddressSize() is this one
/// returns the size of the entire pointer type (this includes metadata bits
/// for fat pointers) and the latter only returns the number of address bits.
/// \sa DataLayout::getPointerAddressSizeInBits
/// FIXME: The defaults need to be removed once all of
/// the backends/clients are updated.
unsigned getPointerSize(unsigned AS = 0) const;

/// Returns the integral size of a pointer in a given address space in bytes.
/// For targets that store bits in pointers that are not part of the address,
/// this returns the number of bits that can be manipulated using operations
/// that change the address (e.g. addition/subtraction).
/// For example, a 64-bit CHERI-enabled target has 128-bit pointers of which
/// only 64 are used to represent the address and the remaining ones are used
/// for metadata such as bounds and access permissions. In this case
/// getPointerSize() returns 16, but getPointerAddressSize() returns 8.
/// \sa DataLayout::getPointerSize
unsigned getPointerAddressSize(unsigned AS) const;

// Index size in bytes used for address calculation,
/// rounded up to a whole number of bytes.
unsigned getIndexSize(unsigned AS) const;
Expand Down Expand Up @@ -365,6 +380,14 @@ class DataLayout {
return getPointerSpec(AS).BitWidth;
}

/// The size of an address in for the given AS. This is usually the same size
/// as the index width but in same cases (e.g. AMDGPU buffer fat pointers with
/// 48-bit addresses and 32-bit offsets), the address size can be larger than
/// the valid range of indexing.
unsigned getPointerAddressSizeInBits(unsigned AS) const {
return getPointerSpec(AS).AddressBitWidth;
}

/// Size in bits of index used for address calculation in getelementptr.
unsigned getIndexSizeInBits(unsigned AS) const {
return getPointerSpec(AS).IndexBitWidth;
Expand Down
36 changes: 29 additions & 7 deletions llvm/lib/IR/DataLayout.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -152,6 +152,7 @@ bool DataLayout::PointerSpec::operator==(const PointerSpec &Other) const {
return AddrSpace == Other.AddrSpace && BitWidth == Other.BitWidth &&
ABIAlign == Other.ABIAlign && PrefAlign == Other.PrefAlign &&
IndexBitWidth == Other.IndexBitWidth &&
AddressBitWidth == Other.AddressBitWidth &&
IsNonIntegral == Other.IsNonIntegral;
}

Expand Down Expand Up @@ -208,7 +209,7 @@ constexpr DataLayout::PrimitiveSpec DefaultVectorSpecs[] = {
// Default pointer type specifications.
constexpr DataLayout::PointerSpec DefaultPointerSpecs[] = {
// p0:64:64:64:64
{0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, false},
{0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, 64, false},
};

DataLayout::DataLayout()
Expand Down Expand Up @@ -414,8 +415,9 @@ Error DataLayout::parsePointerSpec(StringRef Spec) {
assert(Spec.front() == 'p');
Spec.drop_front().split(Components, ':');

if (Components.size() < 3 || Components.size() > 5)
return createSpecFormatError("p[<n>]:<size>:<abi>[:<pref>[:<idx>]]");
if (Components.size() < 3 || Components.size() > 6)
return createSpecFormatError(
"p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<addr>]]]");

// Address space. Optional, defaults to 0.
unsigned AddrSpace = 0;
Expand Down Expand Up @@ -454,8 +456,21 @@ Error DataLayout::parsePointerSpec(StringRef Spec) {
return createStringError(
"index size cannot be larger than the pointer size");

unsigned AddressBitWidth = IndexBitWidth;
if (Components.size() > 5)
if (Error Err = parseSize(Components[5], AddressBitWidth, "address size"))
return Err;

if (AddressBitWidth > BitWidth)
return createStringError(
"address size cannot be larger than the pointer size");

if (IndexBitWidth > AddressBitWidth)
return createStringError(
"index size cannot be larger than the address size");

setPointerSpec(AddrSpace, BitWidth, ABIAlign, PrefAlign, IndexBitWidth,
false);
AddressBitWidth, false);
return Error::success();
}

Expand Down Expand Up @@ -631,7 +646,7 @@ Error DataLayout::parseLayoutString(StringRef LayoutString) {
// the spec for AS0, and we then update that to mark it non-integral.
const PointerSpec &PS = getPointerSpec(AS);
setPointerSpec(AS, PS.BitWidth, PS.ABIAlign, PS.PrefAlign, PS.IndexBitWidth,
true);
PS.AddressBitWidth, true);
}

return Error::success();
Expand Down Expand Up @@ -679,16 +694,19 @@ DataLayout::getPointerSpec(uint32_t AddrSpace) const {

void DataLayout::setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth,
Align ABIAlign, Align PrefAlign,
uint32_t IndexBitWidth, bool IsNonIntegral) {
uint32_t IndexBitWidth,
uint32_t AddressBitWidth, bool IsNonIntegral) {
auto I = lower_bound(PointerSpecs, AddrSpace, LessPointerAddrSpace());
if (I == PointerSpecs.end() || I->AddrSpace != AddrSpace) {
PointerSpecs.insert(I, PointerSpec{AddrSpace, BitWidth, ABIAlign, PrefAlign,
IndexBitWidth, IsNonIntegral});
IndexBitWidth, AddressBitWidth,
IsNonIntegral});
} else {
I->BitWidth = BitWidth;
I->ABIAlign = ABIAlign;
I->PrefAlign = PrefAlign;
I->IndexBitWidth = IndexBitWidth;
I->AddressBitWidth = AddressBitWidth;
I->IsNonIntegral = IsNonIntegral;
}
}
Expand Down Expand Up @@ -728,6 +746,10 @@ const StructLayout *DataLayout::getStructLayout(StructType *Ty) const {
return L;
}

unsigned DataLayout::getPointerAddressSize(unsigned AS) const {
return divideCeil(getPointerAddressSizeInBits(AS), 8);
}

Align DataLayout::getPointerABIAlignment(unsigned AS) const {
return getPointerSpec(AS).ABIAlign;
}
Expand Down
65 changes: 62 additions & 3 deletions llvm/unittests/IR/DataLayoutTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -312,12 +312,12 @@ TEST(DataLayout, ParsePointerSpec) {
"p16777215:32:32:64:8", "p16777215:16777215:32768:32768:16777215"})
EXPECT_THAT_EXPECTED(DataLayout::parse(Str), Succeeded());

for (StringRef Str :
{"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32", "p0:32:32:32:32:32"})
for (StringRef Str : {"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32:32",
"p0:32:32:32:32:32:32"})
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
FailedWithMessage("malformed specification, must be of the form "
"\"p[<n>]:<size>:<abi>[:<pref>[:<idx>]]\""));
"\"p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<addr>]]]\""));

// address space
for (StringRef Str : {"p0x0:32:32", "px:32:32:32", "p16777216:32:32:32:32"})
Expand Down Expand Up @@ -401,6 +401,31 @@ TEST(DataLayout, ParsePointerSpec) {
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
FailedWithMessage("index size cannot be larger than the pointer size"));

EXPECT_THAT_EXPECTED(
DataLayout::parse("p:64:64:64:48:32"),
FailedWithMessage("index size cannot be larger than the address size"));

// address size
for (StringRef Str : {"p:64:32:32:64:", "p0:64:32:32:64:"})
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
FailedWithMessage("address size component cannot be empty"));

// Note: in the future we might allow 0 for address size to indicate pointers
// that do not have a meaning full address (e.g. relocatable GC pointers).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// that do not have a meaning full address (e.g. relocatable GC pointers).
// that do not have a meaningful address (e.g. relocatable GC pointers).

for (StringRef Str :
{"p:32:32:32:32:0", "p0:32:32:32:32:0x20", "p42:32:32:32:32:16777216"})
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
FailedWithMessage("address size must be a non-zero 24-bit integer"));

for (StringRef Str :
{"p:16:16:16:16:17", "p0:32:64:64:32:64", "p42:16:64:64:16:32"})
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
FailedWithMessage(
"address size cannot be larger than the pointer size"));
}

TEST(DataLayoutTest, ParseNativeIntegersSpec) {
Expand Down Expand Up @@ -523,6 +548,40 @@ TEST(DataLayout, GetIndexSize) {
}
}

TEST(DataLayout, GetAddressSizeInBits) {
// Address size defaults to index size
std::tuple<StringRef, unsigned, unsigned, unsigned> Cases[] = {
{"", 64, 64, 64},
{"p:16:32", 16, 16, 16},
{"p0:32:64", 32, 32, 32},
{"p1:16:32:32:10", 64, 10, 64},
{"p1:31:32:64:10:20-p2:17:16:16:15:15", 64, 20, 15},
};
for (auto [Layout, V0, V1, V2] : Cases) {
DataLayout DL = cantFail(DataLayout::parse(Layout));
EXPECT_EQ(DL.getPointerAddressSizeInBits(0), V0) << Layout;
EXPECT_EQ(DL.getPointerAddressSizeInBits(1), V1) << Layout;
EXPECT_EQ(DL.getPointerAddressSizeInBits(2), V2) << Layout;
}
}

TEST(DataLayout, GetAddressSize) {
// Address size defaults to index size
std::tuple<StringRef, unsigned, unsigned, unsigned> Cases[] = {
{"", 8, 8, 8},
{"p:16:32", 2, 2, 2},
{"p0:27:64", 4, 4, 4},
{"p1:19:32:64:2:5", 8, 1, 8},
{"p1:33:32:64:10:23-p2:21:8:16:10:13", 8, 3, 2},
};
for (auto [Layout, V0, V1, V2] : Cases) {
DataLayout DL = cantFail(DataLayout::parse(Layout));
EXPECT_EQ(DL.getPointerAddressSize(0), V0) << Layout;
EXPECT_EQ(DL.getPointerAddressSize(1), V1) << Layout;
EXPECT_EQ(DL.getPointerAddressSize(2), V2) << Layout;
}
}

TEST(DataLayout, GetPointerABIAlignment) {
std::tuple<StringRef, unsigned, unsigned, unsigned> Cases[] = {
{"", 8, 8, 8},
Expand Down
2 changes: 1 addition & 1 deletion mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
Original file line number Diff line number Diff line change
Expand Up @@ -133,7 +133,7 @@ Type getVectorType(Type elementType, const llvm::ElementCount &numElements);
llvm::TypeSize getPrimitiveTypeSizeInBits(Type type);

/// The positions of different values in the data layout entry for pointers.
enum class PtrDLEntryPos { Size = 0, Abi = 1, Preferred = 2, Index = 3 };
enum class PtrDLEntryPos { Size = 0, Abi, Preferred, Index, Addr };

/// Returns the value that corresponds to named position `pos` from the
/// data layout entry `attr` assuming it's a dense integer elements attribute.
Expand Down
22 changes: 13 additions & 9 deletions mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -293,20 +293,24 @@ getPointerDataLayoutEntry(DataLayoutEntryListRef params, LLVMPointerType type,
if (currentEntry) {
std::optional<uint64_t> value = extractPointerSpecValue(currentEntry, pos);
// If the optional `PtrDLEntryPos::Index` entry is not available, use the
// pointer size as the index bitwidth.
// pointer size as the index bitwidth. Similarly, address width defaults to
// index width
if (!value && pos == PtrDLEntryPos::Index)
value = extractPointerSpecValue(currentEntry, PtrDLEntryPos::Size);
bool isSizeOrIndex =
pos == PtrDLEntryPos::Size || pos == PtrDLEntryPos::Index;
return *value / (isSizeOrIndex ? 1 : kBitsInByte);
else if (!value && pos == PtrDLEntryPos::Addr)
value = extractPointerSpecValue(currentEntry, PtrDLEntryPos::Index);
// Convert alignment values to bytes instead of bits
if (pos == PtrDLEntryPos::Abi || pos == PtrDLEntryPos::Preferred)
return *value / kBitsInByte;
return *value;
}

// If not found, and this is the pointer to the default memory space, assume
// 64-bit pointers.
if (type.getAddressSpace() == 0) {
bool isSizeOrIndex =
pos == PtrDLEntryPos::Size || pos == PtrDLEntryPos::Index;
return isSizeOrIndex ? kDefaultPointerSizeBits : kDefaultPointerAlignment;
bool isAlignment =
pos == PtrDLEntryPos::Abi || pos == PtrDLEntryPos::Preferred;
return isAlignment ? kDefaultPointerAlignment : kDefaultPointerSizeBits;
}

return std::nullopt;
Expand Down Expand Up @@ -401,10 +405,10 @@ LogicalResult LLVMPointerType::verifyEntries(DataLayoutEntryListRef entries,
continue;
auto key = llvm::cast<Type>(entry.getKey());
auto values = llvm::dyn_cast<DenseIntElementsAttr>(entry.getValue());
if (!values || (values.size() != 3 && values.size() != 4)) {
if (!values || (values.size() < 3 && values.size() > 5)) {
return emitError(loc)
<< "expected layout attribute for " << key
<< " to be a dense integer elements attribute with 3 or 4 "
<< " to be a dense integer elements attribute with 3 to 5 "
"elements";
}
if (!values.getElementType().isInteger(64))
Expand Down
19 changes: 11 additions & 8 deletions mlir/lib/Target/LLVMIR/DataLayoutImporter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -101,22 +101,25 @@ DataLayoutImporter::tryToParsePointerAlignment(StringRef token) const {
FailureOr<SmallVector<uint64_t>> alignment = tryToParseIntList(token);
if (failed(alignment))
return failure();
if (alignment->size() < 2 || alignment->size() > 4)
if (alignment->size() < 2 || alignment->size() > 5)
return failure();

// Pointer alignment specifications (such as 64:32:64:32 or 32:32) are of
// the form <size>:<abi>[:<pref>][:<idx>], where size is the pointer size, abi
// specifies the minimal alignment, pref the optional preferred alignment, and
// idx the optional index computation bit width. The preferred alignment is
// set to the minimal alignment if not available and the index computation
// width is set to the pointer size if not available.
// the form <size>:<abi>[:<pref>][:<idx>][:<addr>], where size is the pointer
// size, abi specifies the minimal alignment, pref the optional preferred
// alignment, and idx the optional index computation bit width, addr is the
// width of an integer address in this address space.
// The preferred alignment is set to the minimal alignment if not available
// and the index computation width is set to the pointer size if not
// available. The address size defaults to the index size.
uint64_t size = (*alignment)[0];
uint64_t minimal = (*alignment)[1];
uint64_t preferred = alignment->size() < 3 ? minimal : (*alignment)[2];
uint64_t idx = alignment->size() < 4 ? size : (*alignment)[3];
uint64_t addr = alignment->size() < 5 ? idx : (*alignment)[4];
return DenseIntElementsAttr::get<uint64_t>(
VectorType::get({4}, IntegerType::get(context, 64)),
{size, minimal, preferred, idx});
VectorType::get({5}, IntegerType::get(context, 64)),
{size, minimal, preferred, idx, addr});
}

LogicalResult DataLayoutImporter::tryToEmplaceAlignmentEntry(Type type,
Expand Down
2 changes: 1 addition & 1 deletion mlir/test/Dialect/LLVMIR/layout.mlir
Original file line number Diff line number Diff line change
Expand Up @@ -119,7 +119,7 @@ module attributes { dlti.dl_spec = #dlti.dl_spec<

// -----

// expected-error@below {{expected layout attribute for '!llvm.ptr' to be a dense integer elements attribute with 3 or 4 elements}}
// expected-error@below {{expected layout attribute for '!llvm.ptr' to be a dense integer elements attribute with 3 to 5 elements}}
module attributes { dlti.dl_spec = #dlti.dl_spec<
#dlti.dl_entry<!llvm.ptr, dense<[64.0, 64.0, 64.0]> : vector<3xf32>>
>} {
Expand Down
Loading