-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[IR] Introduce captures attribute #116990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
ec80b2a
158568c
7d5c93e
39a0f8f
32eade0
cc27832
0020f9d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1397,6 +1397,42 @@ Currently, only the following parameter attributes are defined: | |
function, returning a pointer to allocated storage disjoint from the | ||
storage for any other object accessible to the caller. | ||
|
||
``captures(...)`` | ||
This attributes restrict the ways in which the callee may capture the | ||
pointer. This is not a valid attribute for return values. This attribute | ||
applies only to the particular copy of the pointer passed in this argument. | ||
|
||
The arguments of ``captures`` is a list of captured pointer components, | ||
which may be ``none``, or a combination of: | ||
|
||
- ``address``: The integral address of the pointer. | ||
- ``address_is_null`` (subset of ``address``): Whether the address is null. | ||
- ``provenance``: The ability to access the pointer for both read and write | ||
after the function returns. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I'm understanding correctly, it's not possible to have something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, that's currently not supported. This is just to reduce complexity because I don't think it would buy us much in practice right now. But I could change this to be more There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I pushed a change to track locations separately, so |
||
- ``read_provenance`` (subset of ``provenance``): The ability to access the | ||
pointer only for reads after the function returns. | ||
|
||
Additionally, it is possible to specify that some components are only | ||
captured in certain locations. Currently only the return value (``ret``) | ||
and other (default) locations are supported. | ||
|
||
The `pointer capture section <pointercapture>` discusses these semantics | ||
in more detail. | ||
|
||
Some examples of how to use the attribute: | ||
|
||
- ``captures(none)``: Pointer not captured. | ||
- ``captures(address, provenance)``: Equivalent to omitting the attribute. | ||
- ``captures(address)``: Address may be captured, but not provenance. | ||
- ``captures(address_is_null)``: Only captures whether the address is null. | ||
- ``captures(address, read_provenance)``: Both address and provenance | ||
captured, but only for read-only access. | ||
- ``captures(ret: address, provenance)``: Pointer captured through return | ||
value only. | ||
- ``captures(address_is_null, ret: address, provenance)``: The whole pointer | ||
is captured through the return value, and additionally whether the pointer | ||
is null is captured in some other way. | ||
|
||
.. _nocapture: | ||
|
||
``nocapture`` | ||
|
@@ -3339,10 +3375,92 @@ Pointer Capture | |
--------------- | ||
|
||
Given a function call and a pointer that is passed as an argument or stored in | ||
the memory before the call, a pointer is *captured* by the call if it makes a | ||
copy of any part of the pointer that outlives the call. | ||
To be precise, a pointer is captured if one or more of the following conditions | ||
hold: | ||
memory before the call, the call may capture two components of the pointer: | ||
|
||
* The address of the pointer, which is its integral value. This also includes | ||
parts of the address or any information about the address, including the | ||
fact that it does not equal one specific value. We further distinguish | ||
whether only the fact that the address is/isn't null is captured. | ||
* The provenance of the pointer, which is the ability to perform memory | ||
accesses through the pointer, in the sense of the :ref:`pointer aliasing | ||
rules <pointeraliasing>`. We further distinguish whether only read acceses | ||
are allowed, or both reads and writes. | ||
|
||
For example, the following function captures the address of ``%a``, because | ||
it is compared to a pointer, leaking information about the identitiy of the | ||
pointer: | ||
|
||
.. code-block:: llvm | ||
|
||
@glb = global i8 0 | ||
|
||
define i1 @f(ptr %a) { | ||
%c = icmp eq ptr %a, @glb | ||
ret i1 %c | ||
} | ||
|
||
The function does not capture the provenance of the pointer, because the | ||
``icmp`` instruction only operates on the pointer address. The following | ||
function captures both the address and provenance of the pointer, as both | ||
may be read from ``@glb`` after the function returns: | ||
|
||
.. code-block:: llvm | ||
|
||
@glb = global ptr null | ||
|
||
define void @f(ptr %a) { | ||
store ptr %a, ptr @glb | ||
ret void | ||
} | ||
|
||
The following function captures *neither* the address nor the provenance of | ||
the pointer: | ||
|
||
.. code-block:: llvm | ||
|
||
define i32 @f(ptr %a) { | ||
%v = load i32, ptr %a | ||
ret i32 | ||
} | ||
|
||
While address capture includes uses of the address within the body of the | ||
function, provenance capture refers exclusively to the ability to perform | ||
accesses *after* the function returns. Memory accesses within the function | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this mean But I guess we avoid this by making sure to always use our local There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently LLVM considers ptrtoint to always capture provenance, but I do plan to introduce a variant of it that only captures the address. This is also needed to better model strict provenance in Rust, and should also be helpful to represent pointer differences (though those might benefit from a separate intrinsic, really). |
||
itself are not considered pointer captures. | ||
|
||
We can further say that the capture only occurs through a specific location. | ||
In the following example, the pointer (both address and provenance) is captured | ||
through the return value only: | ||
|
||
.. code-block:: llvm | ||
|
||
define ptr @f(ptr %a) { | ||
%gep = getelementptr i8, ptr %a, i64 4 | ||
ret ptr %gep | ||
} | ||
|
||
However, we always consider direct inspection of the pointer address | ||
(e.g. using ``ptrtoint``) to be location-independent. The following example | ||
is *not* considered a return-only capture, even though the ``ptrtoint`` | ||
ultimately only contribues to the return value: | ||
|
||
.. code-block:: llvm | ||
|
||
@lookup = constant [4 x i8] [i8 0, i8 1, i8 2, i8 3] | ||
|
||
define ptr @f(ptr %a) { | ||
%a.addr = ptrtoint ptr %a to i64 | ||
%mask = and i64 %a.addr, 3 | ||
%gep = getelementptr i8, ptr @lookup, i64 %mask | ||
ret ptr %gep | ||
} | ||
|
||
This definition is chosen to allow capture analysis to continue with the return | ||
value in the usual fashion. | ||
|
||
The following describes possible ways to capture a pointer in more detail, | ||
where unqualified uses of the word "capture" refer to capturing both address | ||
and provenance. | ||
|
||
1. The call stores any bit of the pointer carrying information into a place, | ||
and the stored bits can be read from the place by the caller after this call | ||
|
@@ -3381,30 +3499,30 @@ hold: | |
@lock = global i1 true | ||
|
||
define void @f(ptr %a) { | ||
store ptr %a, ptr* @glb | ||
store ptr %a, ptr @glb | ||
store atomic i1 false, ptr @lock release ; %a is captured because another thread can safely read @glb | ||
store ptr null, ptr @glb | ||
ret void | ||
} | ||
|
||
3. The call's behavior depends on any bit of the pointer carrying information. | ||
3. The call's behavior depends on any bit of the pointer carrying information | ||
(address capture only). | ||
|
||
.. code-block:: llvm | ||
|
||
@glb = global i8 0 | ||
|
||
define void @f(ptr %a) { | ||
%c = icmp eq ptr %a, @glb | ||
br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; escapes %a | ||
br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; captures address of %a only | ||
BB_EXIT: | ||
call void @exit() | ||
unreachable | ||
BB_CONTINUE: | ||
ret void | ||
} | ||
|
||
4. The pointer is used in a volatile access as its address. | ||
|
||
4. The pointer is used as the pointer operand of a volatile access. | ||
|
||
.. _volatile: | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -273,6 +273,107 @@ raw_ostream &operator<<(raw_ostream &OS, MemoryEffects RMRB); | |
// Legacy alias. | ||
using FunctionModRefBehavior = MemoryEffects; | ||
|
||
/// Components of the pointer that may be captured. | ||
enum class CaptureComponents : uint8_t { | ||
None = 0, | ||
AddressIsNull = (1 << 0), | ||
Address = (1 << 1) | AddressIsNull, | ||
ReadProvenance = (1 << 2), | ||
Provenance = (1 << 3) | ReadProvenance, | ||
All = Address | Provenance, | ||
LLVM_MARK_AS_BITMASK_ENUM(Provenance), | ||
}; | ||
|
||
inline bool capturesNothing(CaptureComponents CC) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure what the following commits look like but maybe it would it be better for IDE code completion to have a struct wrapper with these as member functions? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I haven't worked on the following commits yet, so I expect the API here to change a good bit once I get around to the inference + CaptureTracking changes. I could wrap this in a struct, but then I'd have to reimplement the functionality that BitMaskEnum provides :( |
||
return CC == CaptureComponents::None; | ||
} | ||
|
||
inline bool capturesAnything(CaptureComponents CC) { | ||
return CC != CaptureComponents::None; | ||
} | ||
|
||
inline bool capturesAddressIsNullOnly(CaptureComponents CC) { | ||
return (CC & CaptureComponents::Address) == CaptureComponents::AddressIsNull; | ||
} | ||
|
||
inline bool capturesAddress(CaptureComponents CC) { | ||
return (CC & CaptureComponents::Address) != CaptureComponents::None; | ||
} | ||
|
||
inline bool capturesReadProvenanceOnly(CaptureComponents CC) { | ||
return (CC & CaptureComponents::Provenance) == | ||
CaptureComponents::ReadProvenance; | ||
} | ||
|
||
inline bool capturesFullProvenance(CaptureComponents CC) { | ||
return (CC & CaptureComponents::Provenance) == CaptureComponents::Provenance; | ||
} | ||
|
||
raw_ostream &operator<<(raw_ostream &OS, CaptureComponents CC); | ||
|
||
/// Represents which components of the pointer may be captured in which | ||
/// location. This represents the captures(...) attribute in IR. | ||
/// | ||
/// For more information on the precise semantics see LangRef. | ||
class CaptureInfo { | ||
CaptureComponents OtherComponents; | ||
CaptureComponents RetComponents; | ||
|
||
public: | ||
CaptureInfo(CaptureComponents OtherComponents, | ||
CaptureComponents RetComponents) | ||
: OtherComponents(OtherComponents), RetComponents(RetComponents) {} | ||
|
||
CaptureInfo(CaptureComponents Components) | ||
: OtherComponents(Components), RetComponents(Components) {} | ||
|
||
/// Create CaptureInfo that may capture all components of the pointer. | ||
static CaptureInfo all() { return CaptureInfo(CaptureComponents::All); } | ||
|
||
/// Get components potentially captured by the return value. | ||
CaptureComponents getRetComponents() const { return RetComponents; } | ||
|
||
/// Get components potentially captured through locations other than the | ||
/// return value. | ||
CaptureComponents getOtherComponents() const { return OtherComponents; } | ||
|
||
/// Get the potentially captured components of the pointer (regardless of | ||
/// location). | ||
operator CaptureComponents() const { return OtherComponents | RetComponents; } | ||
|
||
bool operator==(CaptureInfo Other) const { | ||
return OtherComponents == Other.OtherComponents && | ||
RetComponents == Other.RetComponents; | ||
} | ||
|
||
bool operator!=(CaptureInfo Other) const { return !(*this == Other); } | ||
|
||
/// Compute union of CaptureInfos. | ||
CaptureInfo operator|(CaptureInfo Other) const { | ||
return CaptureInfo(OtherComponents | Other.OtherComponents, | ||
RetComponents | Other.RetComponents); | ||
} | ||
|
||
/// Compute intersection of CaptureInfos. | ||
CaptureInfo operator&(CaptureInfo Other) const { | ||
return CaptureInfo(OtherComponents & Other.OtherComponents, | ||
RetComponents & Other.RetComponents); | ||
} | ||
|
||
static CaptureInfo createFromIntValue(uint32_t Data) { | ||
return CaptureInfo(CaptureComponents(Data >> 4), | ||
CaptureComponents(Data & 0xf)); | ||
} | ||
|
||
/// Convert CaptureInfo into an encoded integer value (used by captures | ||
/// attribute). | ||
uint32_t toIntValue() const { | ||
return (uint32_t(OtherComponents) << 4) | uint32_t(RetComponents); | ||
} | ||
}; | ||
|
||
raw_ostream &operator<<(raw_ostream &OS, CaptureInfo Info); | ||
|
||
} // namespace llvm | ||
|
||
#endif |
Uh oh!
There was an error while loading. Please reload this page.