@@ -1394,6 +1394,84 @@ arguments.
%val = load i32, ptr %in, align 4, !amdgpu.last.use !{}
+ '``amdgpu.no.remote.memory``' Metadata
+ ---------------------------------------------
+
+ Asserts that a memory operation does not access bytes in host memory
+ or in remotely connected peer device memory (the address must be
+ device local). This is intended for use with :ref:`atomicrmw
+ <i_atomicrmw>` and other atomic instructions. It is required to emit
+ a native hardware instruction for some :ref:`system scope
+ <amdgpu-memory-scopes>` atomic operations on some subtargets. For most
+ integer atomic operations, this restriction alone is sufficient to
+ emit a native atomic instruction.
+
+ An :ref:`atomicrmw <i_atomicrmw>` without this metadata is treated
+ conservatively, as required to preserve the operation's behavior in
+ all cases. This metadata is typically used in conjunction with
+ :ref:`\!amdgpu.no.fine.grained.memory<amdgpu_no_fine_grained_memory>`.
+
+
+ .. code-block:: llvm
+
+   ; Indicates the atomic does not access fine-grained memory or
+   ; remote device memory.
+   %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory !0
+
+   ; Indicates the atomic does not access host or peer device memory.
+   %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.remote.memory !0
+
+   !0 = !{}
+
+ .. _amdgpu_no_fine_grained_memory:
+
+ '``amdgpu.no.fine.grained.memory``' Metadata
+ -------------------------------------------------
+
+ Asserts that a memory access does not touch bytes in fine-grained
+ allocated memory. This is intended for use with
+ :ref:`atomicrmw <i_atomicrmw>` and other atomic instructions. It is
+ required to emit a native hardware instruction for some :ref:`system
+ scope <amdgpu-memory-scopes>` atomic operations on some subtargets. An
+ :ref:`atomicrmw <i_atomicrmw>` without this metadata is treated
+ conservatively, as required to preserve the operation's behavior in
+ all cases. This metadata is typically used in conjunction with
+ :ref:`\!amdgpu.no.remote.memory<amdgpu_no_remote_memory_access>`.
+
+ .. code-block:: llvm
+
+   ; Indicates the access does not touch fine-grained memory or
+   ; remote device memory.
+   %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory !0
+
+   ; Indicates the access does not touch fine-grained memory.
+   %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.fine.grained.memory !0
+
+   !0 = !{}
+
+ .. _amdgpu_no_remote_memory_access:
+
+ '``amdgpu.ignore.denormal.mode``' Metadata
+ ------------------------------------------
+
+ For use with :ref:`atomicrmw <i_atomicrmw>` floating-point
+ operations. Indicates that the handling of denormal inputs and results
+ is insignificant and may be inconsistent with the expected
+ floating-point mode. This is necessary to emit a native atomic
+ instruction on some targets for some address spaces where float
+ denormals are unconditionally flushed. This is typically used in
+ conjunction with
+ :ref:`\!amdgpu.no.remote.memory<amdgpu_no_remote_memory_access>` and
+ :ref:`\!amdgpu.no.fine.grained.memory<amdgpu_no_fine_grained_memory>`.
+
+
+ .. code-block:: llvm
+
+   %res0 = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, align 4, !amdgpu.ignore.denormal.mode !0
+   %res1 = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, align 4, !amdgpu.ignore.denormal.mode !0, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory !0
+
+   !0 = !{}
+
LLVM IR Attributes
==================