Skip to content

gh-128972: Add _Py_ALIGN_AS and revert PyASCIIObject memory layout. #133085

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

encukou
Copy link
Member

@encukou encukou commented Apr 28, 2025

Add _Py_ALIGN_AS as per C API WG vote: capi-workgroup/decisions#61
This patch only adds it to free-threaded builds; the #ifdef Py_GIL_DISABLED should be removed in the future.

Use this to revert PyASCIIObject memory layout for non-free-threaded builds. The long-term plan is to deprecate the entire struct; until that happens it's better to keep it unchanged, as courtesy to people that rely on it despite it not being stable ABI.

… layout.

Add `_Py_ALIGN_AS` as per C API WG vote: capi-workgroup/decisions#61
This patch only adds it to free-threaded builds; the `#ifdef Py_GIL_DISABLED`
can be removed in the future.

Use this to revert `PyASCIIObject` memory layout for non-free-threaded builds.
The long-term plan is to deprecate the entire struct; until that happens
it's better to keep it unchanged, as courtesy to people that rely on it despite
it not being stable ABI.
Comment on lines +103 to +105
/* Ensure 4 byte alignment for PyUnicode_DATA(), see gh-63736 on m68k.
In the non-free-threaded build, we'll use explicit padding instead */
_Py_ALIGN_AS(4)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not ensure alignment for PyUnicode_DATA(), it ensures alignment for state.

To ensure alignment for PyUnicode_DATA(), you need to add _Py_ALIGN_AS before the data field in PyUnicodeObject.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing something? This sounds like a distinction without a difference.

  • The PyUnicode_DATA() macro depends on the size of PyASCIIObject.
  • The size is a multiple of the alignment.
  • The alignment of PyASCIIObject must be at least the alignment of every member, so it must be at least 4 bytes with the _Py_ALIGN_AS(4) macro.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment of PyASCIIObject should be the same as the alignment of PyObject.

The definition of the _PyUnicode_COMPACT_DATA() macro and the code that calculates the size of the PyASCIIObject objects should be changed to guarantee the alignment of data.

Perhaps _Py_ALIGN_AS(4) should be added in PyObject_HEAD.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? Why?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly are you asking about?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should the alignment of PyASCIIObject be the same as the alignment of PyObject?
Why should the definition of _PyUnicode_COMPACT_DATA change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the PyASCIIObject layout should be compatible with the PyObject layout. It is allocated by the same function as other objects. But we expect that all dynamically allocated data is already aligned to 8 or even 16 bytes, so perhaps there is nothing to do.

_PyUnicode_COMPACT_DATA() returns a pointer to the raw data. If we want to be sure that it is well aligned when the base address is well aligned, despite the size of PyASCIIObject and PyCompactUnicodeObject, we should just round up the offset. Something like:

    if (PyUnicode_IS_ASCII(op)) {
        return ((char*)op) + _Py_SIZE_ROUND_UP(sizeof(PyASCIIObject), 4);
    }
    return ((char*)op) + _Py_SIZE_ROUND_UP(sizeof(PyCompactUnicodeObject), 4);

And similar in the allocator.

_Py_ALIGN_AS is not needed for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is normal and fine for a struct that wraps another struct to have stricter alignment requirements. In other words, it's normal and fine for PyASCIIObject (and PyCompactUnicodeObject) to have stricter alignment requirements than PyObject.

But we expect that all dynamically allocated data is already aligned to 8 or even 16 bytes, so perhaps there is nothing to do.

Yes. If we needed >8 or >16 byte alignment then we'd need special handling for allocation, but we still wouldn't need to change PyObject.

If we want to be sure that it is well aligned when the base address is well aligned, despite the size of PyASCIIObject and PyCompactUnicodeObject, we should just round up the offset.

This requires changing every use of sizeof(PyASCIIObject) to _Py_SIZE_ROUND_UP(sizeof(PyASCIIObject), 4), including when we allocate it, as you say. This seems functionally equivalent to ensuring that the size is at least a multiple of 4 (by ensuring the alignment is at least a multiple of 4). I don't really have a strong preference either way: The _Py_SIZE_ROUND_UP() approach means we don't have to mess with _Py_ALIGN_AS(). The _Py_ALIGN_AS() means we don't have to worry about incorrectly using sizeof(PyASCIIObject).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_Py_ALIGN_AS will be useful elsewhere, so I'd use that.

Comment on lines 44 to 49
# if (__cplusplus < 201103L) \
&& (defined(__GNUC__) || defined(__clang__))
# define _Py_ALIGN_AS(V) __attribute__((aligned(V)))
# else
# define _Py_ALIGN_AS(V) alignas(V)
# endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is difficult to comprehend. I would write:

Suggested change
# if (__cplusplus < 201103L) \
&& (defined(__GNUC__) || defined(__clang__))
# define _Py_ALIGN_AS(V) __attribute__((aligned(V)))
# else
# define _Py_ALIGN_AS(V) alignas(V)
# endif
# if __cplusplus >= 201103L
# define _Py_ALIGN_AS(V) alignas(V)
# elif defined(__GNUC__) || defined(__clang__)
# define _Py_ALIGN_AS(V) __attribute__((aligned(V)))
# else
# define _Py_ALIGN_AS(V) alignas(V)
# endif

And should we add a branch for pre-C++-11 MSVC?

# elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
# define _Py_ALIGN_AS(V) _Alignas(V)
# elif (defined(__GNUC__) || defined(__clang__)) \
&& defined(__STDC_VERSION__) && __STDC_VERSION__ < 201112L
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not this condition redundant? The case defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L is covered above, so the only special case for which this condition would be needed is when __STDC_VERSION__ is not defined while __GNUC__ or __clang__ are defined. Is it even possible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants