Commit 62d9497

AMDGPU: Add description for new atomicrmw metadata (#85052)
Add a spec for yet-to-be-implemented metadata to allow the backend to fully handle atomicrmw lowering. This is the base of an alternative to #69229, which inverts the direction to be correct by default, and extends to cover the peer device case.
1 parent 51fac77 commit 62d9497

2 files changed, +80 -0 lines changed

llvm/docs/AMDGPUUsage.rst

Lines changed: 78 additions & 0 deletions
@@ -1394,6 +1394,84 @@ arguments.

  %val = load i32, ptr %in, align 4, !amdgpu.last.use !{}

.. _amdgpu_no_remote_memory_access:

'``amdgpu.no.remote.memory``' Metadata
--------------------------------------

Asserts that a memory operation does not access bytes in host memory or
in remotely connected peer device memory (the address must be device
local). This is intended for use with :ref:`atomicrmw <i_atomicrmw>`
and other atomic instructions. This is required to emit a native
hardware instruction for some :ref:`system scope
<amdgpu-memory-scopes>` atomic operations on some subtargets. For most
integer atomic operations, this is a sufficient restriction to emit a
native atomic instruction.

An :ref:`atomicrmw <i_atomicrmw>` without metadata will be treated
conservatively as required to preserve the operation behavior in all
cases. This will typically be used in conjunction with
:ref:`\!amdgpu.no.fine.grained.memory<amdgpu_no_fine_grained_memory>`.


.. code-block:: llvm

  ; Indicates the atomic does not access fine-grained memory, or
  ; remote device memory.
  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory !0

  ; Indicates the atomic does not access host memory or peer device memory.
  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.remote.memory !0

  !0 = !{}

.. _amdgpu_no_fine_grained_memory:

'``amdgpu.no.fine.grained.memory``' Metadata
-------------------------------------------

Asserts that a memory access does not access bytes in fine-grained
allocated memory. This is intended for use with
:ref:`atomicrmw <i_atomicrmw>` and other atomic instructions. This is
required to emit a native hardware instruction for some :ref:`system
scope <amdgpu-memory-scopes>` atomic operations on some subtargets. An
:ref:`atomicrmw <i_atomicrmw>` without metadata will be treated
conservatively as required to preserve the operation behavior in all
cases. This will typically be used in conjunction with
:ref:`\!amdgpu.no.remote.memory<amdgpu_no_remote_memory_access>`.

.. code-block:: llvm

  ; Indicates the atomic does not access fine-grained memory, or
  ; remote device memory.
  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory !0

  ; Indicates the atomic does not access fine-grained memory.
  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.fine.grained.memory !0

  !0 = !{}

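As an illustration of what conservative handling could mean, the sketch
below hand-writes a compare-exchange loop with the same effect as a
system scope ``atomicrmw fadd`` that carries none of this metadata. It
is an assumption made for illustration, not a description of the
expansion the backend performs; the function name
``@fadd_via_cmpxchg`` and all value names are hypothetical.

.. code-block:: llvm

  ; Hypothetical hand-written equivalent of:
  ;   %old = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, align 4
  define float @fadd_via_cmpxchg(ptr addrspace(1) %ptr, float %value) {
  entry:
    ; Initial guess for the value currently in memory.
    %init = load float, ptr addrspace(1) %ptr, align 4
    br label %loop

  loop:
    %expected = phi float [ %init, %entry ], [ %loaded, %loop ]
    %new = fadd float %expected, %value
    ; Compare and swap the integer bit patterns of the float values.
    %expected.bits = bitcast float %expected to i32
    %new.bits = bitcast float %new to i32
    %pair = cmpxchg ptr addrspace(1) %ptr, i32 %expected.bits, i32 %new.bits seq_cst seq_cst, align 4
    %loaded.bits = extractvalue { i32, i1 } %pair, 0
    %success = extractvalue { i32, i1 } %pair, 1
    %loaded = bitcast i32 %loaded.bits to float
    ; Retry with the freshly observed value if another agent changed memory.
    br i1 %success, label %done, label %loop

  done:
    ; On success the value that was in memory equals %expected.
    ret float %expected
  }

Such a loop needs only a compare-exchange from the hardware, but it may
retry under contention; asserting the metadata lets the backend consider
a single native instruction instead.
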
'``amdgpu.ignore.denormal.mode``' Metadata
------------------------------------------

For use with :ref:`atomicrmw <i_atomicrmw>` floating-point
operations. Indicates that the handling of denormal inputs and results
is insignificant and may be inconsistent with the expected
floating-point mode. This is necessary to emit a native atomic
instruction on some targets for some address spaces where float
denormals are unconditionally flushed. This is typically used in
conjunction with
:ref:`\!amdgpu.no.remote.memory<amdgpu_no_remote_memory_access>`
and
:ref:`\!amdgpu.no.fine.grained.memory<amdgpu_no_fine_grained_memory>`.


.. code-block:: llvm

  %res0 = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, align 4, !amdgpu.ignore.denormal.mode !0
  %res1 = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, align 4, !amdgpu.ignore.denormal.mode !0, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory !0

  !0 = !{}


LLVM IR Attributes
==================

llvm/docs/ReleaseNotes.rst

Lines changed: 2 additions & 0 deletions
@@ -129,6 +129,8 @@ Changes to the AMDGPU Backend
-----------------------------

* Implemented the ``llvm.get.fpenv`` and ``llvm.set.fpenv`` intrinsics.
* Added ``!amdgpu.no.fine.grained.memory`` and
  ``!amdgpu.no.remote.memory`` metadata to control atomic behavior.

* Implemented :ref:`llvm.get.rounding <int_get_rounding>` and :ref:`llvm.set.rounding <int_set_rounding>`
