Commit aec685e

[DataLayout] Introduce DataLayout::getAddressSize(AS)

This function can be used to retrieve the number of bits that can be used for arithmetic in a given address space (i.e. the range of the address space). For most in-tree targets this should not make any difference, but differentiating between the size of a pointer in bits and the address range is extremely important, e.g. for CHERI-enabled targets, where pointers carry additional metadata such as bounds and permissions and only a subset of the pointer bits is used as the address.

The address size is defined to be the same as the index size. We considered adding a separate property since targets exist where indexing and address range actually use different sizes (AMDGPU fat pointers with 160-bit representation, 48-bit address and 32-bit index), but for the purposes of LLVM semantics, differentiating them does not add much value and it introduces a lot of complexity in ensuring the correct bits are used. See the reasoning by @nikic on
https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/38
https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/49

Originally uploaded as https://reviews.llvm.org/D135158

Reviewed By: davidchisnall, krzysz00

Pull Request: #139347

1 parent e620f10 commit aec685e
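
To illustrate what "the number of bits that can be used for arithmetic" means in practice, here is a minimal standalone sketch. It does not use any LLVM API; `truncateToAddressWidth` and `addInAddressSpace` are hypothetical helpers invented for this note, and the sketch assumes that address arithmetic wraps at the address (index) width rather than at the width of the full pointer representation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): keep only the low AddrBits bits.
inline uint64_t truncateToAddressWidth(uint64_t Value, unsigned AddrBits) {
  assert(AddrBits >= 1 && AddrBits <= 64 && "sketch supports 1..64 bits");
  if (AddrBits == 64)
    return Value; // Avoid an out-of-range shift by 64.
  return Value & ((uint64_t{1} << AddrBits) - 1);
}

// Hypothetical helper: add an offset with wraparound at the address width.
// The pointer representation may be wider (e.g. 64 bits of storage for a
// 32-bit address), but only AddrBits take part in the arithmetic.
inline uint64_t addInAddressSpace(uint64_t Addr, uint64_t Offset,
                                  unsigned AddrBits) {
  return truncateToAddressWidth(Addr + Offset, AddrBits);
}
```

With a 32-bit address range, adding 1 to `0xFFFFFFFF` wraps to 0 even if the pointer is stored in 64 bits; code that used the representation size instead would compute `0x100000000`.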

2 files changed, +87 -17 lines changed

llvm/docs/LangRef.rst

Lines changed: 25 additions & 8 deletions
@@ -3150,14 +3150,21 @@ as follows:
 ``A<address space>``
     Specifies the address space of objects created by '``alloca``'.
     Defaults to the default address space of 0.
-``p[n]:<size>:<abi>[:<pref>][:<idx>]``
-    This specifies the *size* of a pointer and its ``<abi>`` and
-    ``<pref>``\erred alignments for address space ``n``.
-    The fourth parameter ``<idx>`` is the size of the
-    index that used for address calculation, which must be less than or equal
-    to the pointer size. If not
-    specified, the default index size is equal to the pointer size. All sizes
-    are in bits. The address space, ``n``, is optional, and if not specified,
+``p[n]:<size>:<abi>[:<pref>[:<idx>]]``
+    This specifies the properties of a pointer in address space ``n``.
+    The ``<size>`` parameter specifies the size of the bitwise representation.
+    For :ref:`non-integral pointers <nointptrtype>` the representation size may
+    be larger than the address width of the underlying address space (e.g. to
+    accommodate additional metadata).
+    The alignment requirements are specified via the ``<abi>`` and
+    ``<pref>``\erred alignments parameters.
+    The fourth parameter ``<idx>`` is the size of the index that used for
+    address calculations such as :ref:`getelementptr <i_getelementptr>`.
+    It must be less than or equal to the pointer size. If not specified, the
+    default index size is equal to the pointer size.
+    The index size also specifies the width of addresses in this address space.
+    All sizes are in bits.
+    The address space, ``n``, is optional, and if not specified,
     denotes the default address space 0. The value of ``n`` must be
     in the range [1,2^24).
 ``i<size>:<abi>[:<pref>]``
@@ -4269,6 +4276,16 @@ address spaces defined in the :ref:`datalayout string<langref_datalayout>`.
 the default globals address space and ``addrspace("P")`` the program address
 space.
 
+The representation of pointers can be different for each address space and does
+not necessarily need to be a plain integer address (e.g. for
+:ref:`non-integral pointers <nointptrtype>`). In addition to a representation
+bits size, pointers in each address space also have an index size which defines
+the bitwidth of indexing operations as well as the size of `integer addresses`
+in this address space. For example, CHERI capabilities are twice the size of the
+underlying addresses to accommodate for additional metadata such as bounds and
+permissions: on a 32-bit system the bitwidth of the pointer representation size
+is 64, but the underlying address width remains 32 bits.
+
 The default address space is number zero.
 
 The semantics of non-zero address spaces are target-specific. Memory
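
As a reading aid for the updated `p[n]:<size>:<abi>[:<pref>[:<idx>]]` grammar, the following is a simplified standalone sketch of how such an entry could be decoded. It is not LLVM's parser: `PointerSpec`, `parseNumber` and `parsePointerSpec` are hypothetical names, the defaulting of `<pref>` to `<abi>` is an assumption of the sketch, and the only rules it enforces are the ones stated in the text above (nested optional fields, `<idx>` no larger than `<size>`, `<idx>` defaulting to `<size>`, address space below 2^24).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of one pointer entry; all values in bits.
struct PointerSpec {
  uint32_t AddrSpace = 0;     // n, defaults to address space 0
  uint32_t BitWidth = 0;      // <size>: width of the representation
  uint32_t ABIAlign = 0;      // <abi>
  uint32_t PrefAlign = 0;     // <pref>, assumed to default to <abi>
  uint32_t IndexBitWidth = 0; // <idx>: index and address width
};

// Parses a non-empty decimal string of at most 9 digits (no overflow).
inline bool parseNumber(const std::string &S, uint32_t &Out) {
  if (S.empty() || S.size() > 9)
    return false;
  uint32_t V = 0;
  for (char C : S) {
    if (C < '0' || C > '9')
      return false;
    V = V * 10 + static_cast<uint32_t>(C - '0');
  }
  Out = V;
  return true;
}

// Returns true and fills Out if Spec is a well-formed pointer entry.
inline bool parsePointerSpec(const std::string &Spec, PointerSpec &Out) {
  if (Spec.empty() || Spec[0] != 'p')
    return false;

  // Split everything after 'p' on ':'; the first field is the optional n.
  std::vector<std::string> Fields(1);
  for (std::size_t I = 1; I < Spec.size(); ++I) {
    if (Spec[I] == ':')
      Fields.emplace_back();
    else
      Fields.back().push_back(Spec[I]);
  }
  if (Fields.size() < 3 || Fields.size() > 5)
    return false;

  PointerSpec PS;
  if (!Fields[0].empty() &&
      (!parseNumber(Fields[0], PS.AddrSpace) || PS.AddrSpace >= (1u << 24)))
    return false;
  if (!parseNumber(Fields[1], PS.BitWidth) ||
      !parseNumber(Fields[2], PS.ABIAlign))
    return false;

  PS.PrefAlign = PS.ABIAlign;     // assumed default when <pref> is omitted
  PS.IndexBitWidth = PS.BitWidth; // default index size equals pointer size
  if (Fields.size() >= 4 && !parseNumber(Fields[3], PS.PrefAlign))
    return false;
  if (Fields.size() == 5 && (!parseNumber(Fields[4], PS.IndexBitWidth) ||
                             PS.IndexBitWidth > PS.BitWidth))
    return false;

  assert(PS.IndexBitWidth <= PS.BitWidth);
  Out = PS;
  return true;
}
```

Under these assumptions, `p:64:64` decodes to a 64-bit representation with a 64-bit index, while an entry such as `p200:128:128:128:64` (the address space number is arbitrary here) decodes to a 128-bit representation whose addresses and indices are 64 bits wide.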

llvm/include/llvm/IR/DataLayout.h

Lines changed: 62 additions & 9 deletions
@@ -92,6 +92,7 @@ class DataLayout {
     /// The function pointer alignment is a multiple of the function alignment.
     MultipleOfFunctionAlign,
   };
+
 private:
   bool BigEndian = false;
 
@@ -324,16 +325,38 @@ class DataLayout {
   /// the backends/clients are updated.
   Align getPointerPrefAlignment(unsigned AS = 0) const;
 
-  /// Layout pointer size in bytes, rounded up to a whole
-  /// number of bytes.
+  /// The pointer representation size in bytes, rounded up to a whole number of
+  /// bytes. The difference between this function and getAddressSize() is that
+  /// this one returns the size of the entire pointer representation (including
+  /// metadata bits for fat pointers) and the latter only returns the number of
+  /// address bits.
+  /// \sa DataLayout::getAddressSizeInBits
   /// FIXME: The defaults need to be removed once all of
   /// the backends/clients are updated.
   unsigned getPointerSize(unsigned AS = 0) const;
 
-  // Index size in bytes used for address calculation,
-  /// rounded up to a whole number of bytes.
+  /// The index size in bytes used for address calculation, rounded up to a
+  /// whole number of bytes. This not only defines the size used in
+  /// getelementptr operations, but also the size of addresses in this \p AS.
+  /// For example, a 64-bit CHERI-enabled target has 128-bit pointers of which
+  /// only 64 are used to represent the address and the remaining ones are used
+  /// for metadata such as bounds and access permissions. In this case
+  /// getPointerSize() returns 16, but getIndexSize() returns 8.
+  /// To help with code understanding, the alias getAddressSize() can be used
+  /// instead of getIndexSize() to clarify that an address width is needed.
   unsigned getIndexSize(unsigned AS) const;
 
+  /// The integral size of a pointer in a given address space in bytes, which
+  /// is defined to be the same as getIndexSize(). This exists as a separate
+  /// function to make it clearer when reading code that the size of an address
+  /// is being requested. While targets exist where index size and the
+  /// underlying address width are not identical (e.g. AMDGPU fat pointers with
+  /// 48-bit addresses and 32-bit offsets indexing), there is currently no need
+  /// to differentiate these properties in LLVM.
+  /// \sa DataLayout::getIndexSize
+  /// \sa DataLayout::getAddressSizeInBits
+  unsigned getAddressSize(unsigned AS) const { return getIndexSize(AS); }
+
   /// Return the address spaces containing non-integral pointers. Pointers in
   /// this address space don't have a well-defined bitwise representation.
   SmallVector<unsigned, 8> getNonIntegralAddressSpaces() const {
@@ -358,29 +381,53 @@ class DataLayout {
     return PTy && isNonIntegralPointerType(PTy);
   }
 
-  /// Layout pointer size, in bits
+  /// The size in bits of the pointer representation in a given address space.
+  /// This is not necessarily the same as the integer address of a pointer (e.g.
+  /// for fat pointers).
+  /// \sa DataLayout::getAddressSizeInBits()
   /// FIXME: The defaults need to be removed once all of
   /// the backends/clients are updated.
   unsigned getPointerSizeInBits(unsigned AS = 0) const {
     return getPointerSpec(AS).BitWidth;
   }
 
-  /// Size in bits of index used for address calculation in getelementptr.
+  /// The size in bits of indices used for address calculation in getelementptr
+  /// and for addresses in the given AS. See getIndexSize() for more
+  /// information.
+  /// \sa DataLayout::getAddressSizeInBits()
   unsigned getIndexSizeInBits(unsigned AS) const {
     return getPointerSpec(AS).IndexBitWidth;
   }
 
-  /// Layout pointer size, in bits, based on the type. If this function is
+  /// The size in bits of an address in for the given AS. This is defined to
+  /// return the same value as getIndexSizeInBits() since there is currently no
+  /// target that requires these two properties to have different values. See
+  /// getIndexSize() for more information.
+  /// \sa DataLayout::getIndexSizeInBits()
+  unsigned getAddressSizeInBits(unsigned AS) const {
+    return getIndexSizeInBits(AS);
+  }
+
+  /// The pointer representation size in bits for this type. If this function is
   /// called with a pointer type, then the type size of the pointer is returned.
   /// If this function is called with a vector of pointers, then the type size
   /// of the pointer is returned. This should only be called with a pointer or
   /// vector of pointers.
   unsigned getPointerTypeSizeInBits(Type *) const;
 
-  /// Layout size of the index used in GEP calculation.
+  /// The size in bits of the index used in GEP calculation for this type.
   /// The function should be called with pointer or vector of pointers type.
+  /// This is defined to return the same value as getAddressSizeInBits(),
+  /// but separate functions exist for code clarity.
   unsigned getIndexTypeSizeInBits(Type *Ty) const;
 
+  /// The size in bits of an address for this type.
+  /// This is defined to return the same value as getIndexTypeSizeInBits(),
+  /// but separate functions exist for code clarity.
+  unsigned getAddressSizeInBits(Type *Ty) const {
+    return getIndexTypeSizeInBits(Ty);
+  }
+
   unsigned getPointerTypeSize(Type *Ty) const {
     return getPointerTypeSizeInBits(Ty) / 8;
   }
@@ -515,15 +562,21 @@ class DataLayout {
   /// are set.
   unsigned getLargestLegalIntTypeSizeInBits() const;
 
-  /// Returns the type of a GEP index in AddressSpace.
+  /// Returns the type of a GEP index in \p AddressSpace.
   /// If it was not specified explicitly, it will be the integer type of the
   /// pointer width - IntPtrType.
   IntegerType *getIndexType(LLVMContext &C, unsigned AddressSpace) const;
+  /// Returns the type of an address in \p AddressSpace
+  IntegerType *getAddressType(LLVMContext &C, unsigned AddressSpace) const {
+    return getIndexType(C, AddressSpace);
+  }
 
   /// Returns the type of a GEP index.
   /// If it was not specified explicitly, it will be the integer type of the
   /// pointer width - IntPtrType.
   Type *getIndexType(Type *PtrTy) const;
+  /// Returns the type of an address in \p AddressSpace
+  Type *getAddressType(Type *PtrTy) const { return getIndexType(PtrTy); }
 
   /// Returns the offset from the beginning of the type for the specified
   /// indices.
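
The relationship between the accessors touched in this header can be seen in isolation with a small mock. This is a sketch only: `MiniDataLayout` and `MiniPointerSpec` are hypothetical stand-ins rather than the real `llvm::DataLayout`, the fallback to address space 0 for unknown address spaces is an assumption of the mock, and the address space numbers used below are arbitrary. The method names and the "address size is an alias of index size" behaviour mirror the diff above.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the per-address-space pointer properties.
struct MiniPointerSpec {
  unsigned BitWidth;      // size of the full pointer representation, in bits
  unsigned IndexBitWidth; // size of indices and addresses, in bits
};

// Hypothetical mock; not the real llvm::DataLayout.
class MiniDataLayout {
  std::map<unsigned, MiniPointerSpec> Specs;

  // Round a bit count up to a whole number of bytes.
  static unsigned bitsToBytes(unsigned Bits) { return (Bits + 7) / 8; }

public:
  // Assumption of the mock: address space 0 starts as a plain 64-bit pointer.
  MiniDataLayout() { Specs[0] = MiniPointerSpec{64, 64}; }

  void setPointerSpec(unsigned AS, unsigned BitWidth, unsigned IndexBitWidth) {
    assert(IndexBitWidth <= BitWidth && "index must fit in the pointer");
    Specs[AS] = MiniPointerSpec{BitWidth, IndexBitWidth};
  }

  // Assumption of the mock: unknown address spaces use the AS 0 entry.
  const MiniPointerSpec &getPointerSpec(unsigned AS) const {
    auto It = Specs.find(AS);
    return It != Specs.end() ? It->second : Specs.at(0);
  }

  unsigned getPointerSizeInBits(unsigned AS = 0) const {
    return getPointerSpec(AS).BitWidth;
  }
  unsigned getIndexSizeInBits(unsigned AS) const {
    return getPointerSpec(AS).IndexBitWidth;
  }
  // Defined as an alias of the index size, as in the diff above.
  unsigned getAddressSizeInBits(unsigned AS) const {
    return getIndexSizeInBits(AS);
  }

  unsigned getPointerSize(unsigned AS = 0) const {
    return bitsToBytes(getPointerSizeInBits(AS));
  }
  unsigned getIndexSize(unsigned AS) const {
    return bitsToBytes(getIndexSizeInBits(AS));
  }
  unsigned getAddressSize(unsigned AS) const { return getIndexSize(AS); }
};
```

Registering a 128-bit pointer with a 64-bit index reproduces the CHERI numbers from the doc comment: `getPointerSize()` yields 16 while `getIndexSize()` and `getAddressSize()` yield 8. A 160-bit representation with a 32-bit index, as in the AMDGPU case from the commit message, yields 20 and 4.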
