gh-128972: Add `_Py_ALIGN_AS` and revert `PyASCIIObject` memory layout. #133085

encukou · 2025-04-28T13:44:10Z

Add _Py_ALIGN_AS as per C API WG vote: capi-workgroup/decisions#61
This patch only adds it to free-threaded builds; the #ifdef Py_GIL_DISABLED should be removed in the future.

Use this to revert PyASCIIObject memory layout for non-free-threaded builds. The long-term plan is to deprecate the entire struct; until that happens it's better to keep it unchanged, as courtesy to people that rely on it despite it not being stable ABI.

Issue: Incompatible change in internal string representation (encountered with Rust bindings) (?) #128972

… layout. Add `_Py_ALIGN_AS` as per C API WG vote: capi-workgroup/decisions#61 This patch only adds it to free-threaded builds; the `#ifdef Py_GIL_DISABLED` can be removed in the future. Use this to revert `PyASCIIObject` memory layout for non-free-threaded builds. The long-term plan is to deprecate the entire struct; until that happens it's better to keep it unchanged, as courtesy to people that rely on it despite it not being stable ABI.

serhiy-storchaka · 2025-04-28T14:41:15Z

Include/cpython/unicodeobject.h

+    /* Ensure 4 byte alignment for PyUnicode_DATA(), see gh-63736 on m68k.
+       In the non-free-threaded build, we'll use explicit padding instead */
+   _Py_ALIGN_AS(4)


It does not ensure alignment for PyUnicode_DATA(), it ensures alignment for state.

To ensure alignment for PyUnicode_DATA(), you need to add _Py_ALIGN_AS before the data field in PyUnicodeObject.

Am I missing something? This sounds like a distinction without a difference.

The PyUnicode_DATA() macro depends on the size of PyASCIIObject.

The size is a multiple of the alignment.

The alignment of PyASCIIObject must be at least the alignment of every member, so it must be at least 4 bytes with the _Py_ALIGN_AS(4) macro.

The alignment of PyASCIIObject should be the same as the alignment of PyObject.

The definition of the _PyUnicode_COMPACT_DATA() macro and the code that calculates the size of the PyASCIIObject objects should be changed to guarantee the alignment of data.

Perhaps _Py_ALIGN_AS(4) should be added in PyObject_HEAD.

What exactly are you asking about?

Why should the alignment of PyASCIIObject be the same as the alignment of PyObject?
Why should the definition of _PyUnicode_COMPACT_DATA change?

Because the PyASCIIObject layout should be compatible with the PyObject layout. It is allocated by the same function as other objects. But we expect that all dynamically allocated data is already aligned to 8 or even 16 bytes, so perhaps there is nothing to do.

_PyUnicode_COMPACT_DATA() returns a pointer to the raw data. If we want to be sure that it is well aligned when the base address is well aligned, despite the size of PyASCIIObject and PyCompactUnicodeObject, we should just round up the offset. Something like:

if (PyUnicode_IS_ASCII(op)) { return ((char*)op) + _Py_SIZE_ROUND_UP(sizeof(PyASCIIObject), 4); } return ((char*)op) + _Py_SIZE_ROUND_UP(sizeof(PyCompactUnicodeObject), 4);

And similar in the allocator.

_Py_ALIGN_AS is not needed for this.

It is normal and fine for a struct that wraps another struct to have stricter alignment requirements. In other words, it's normal and fine for PyASCIIObject (and PyCompactUnicodeObject) to have stricter alignment requirements than PyObject.

But we expect that all dynamically allocated data is already aligned to 8 or even 16 bytes, so perhaps there is nothing to do.

Yes. If we needed >8 or >16 byte alignment then we'd need special handling for allocation, but we still wouldn't need to change PyObject.

If we want to be sure that it is well aligned when the base address is well aligned, despite the size of PyASCIIObject and PyCompactUnicodeObject, we should just round up the offset.

This requires changing every use of sizeof(PyASCIIObject) to _Py_SIZE_ROUND_UP(sizeof(PyASCIIObject), 4), including when we allocate it, as you say. This seems functionally equivalent to ensuring that the size is at least a multiple of 4 (by ensuring the alignment is at least a multiple of 4). I don't really have a strong preference either way: The _Py_SIZE_ROUND_UP() approach means we don't have to mess with _Py_ALIGN_AS(). The _Py_ALIGN_AS() means we don't have to worry about incorrectly using sizeof(PyASCIIObject).

_Py_ALIGN_AS will be useful elsewhere, so I'd use that.

serhiy-storchaka · 2025-04-28T14:47:14Z

Include/pymacro.h

+#           if (__cplusplus < 201103L) \
+                  && (defined(__GNUC__) || defined(__clang__))
+#               define _Py_ALIGN_AS(V) __attribute__((aligned(V)))
+#           else
+#               define _Py_ALIGN_AS(V) alignas(V)
+#           endif


This is difficult to comprehend. I would write:

Suggested change

# if (__cplusplus < 201103L) \

&& (defined(__GNUC__) || defined(__clang__))

# define _Py_ALIGN_AS(V) __attribute__((aligned(V)))

# else

# define _Py_ALIGN_AS(V) alignas(V)

# endif

# if __cplusplus >= 201103L

# define _Py_ALIGN_AS(V) alignas(V)

# elif defined(__GNUC__) || defined(__clang__)

# define _Py_ALIGN_AS(V) __attribute__((aligned(V)))

# else

# define _Py_ALIGN_AS(V) alignas(V)

# endif

And should we add a branch for pre-C++-11 MSVC?

serhiy-storchaka · 2025-04-28T14:50:15Z

Include/pymacro.h

+#       elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
+#           define _Py_ALIGN_AS(V) _Alignas(V)
+#       elif (defined(__GNUC__) || defined(__clang__)) \
+              && defined(__STDC_VERSION__) && __STDC_VERSION__ < 201112L


Is not this condition redundant? The case defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L is covered above, so the only special case for which this condition would be needed is when __STDC_VERSION__ is not defined while __GNUC__ or __clang__ are defined. Is it even possible?

Good catch, thanks!

encukou requested review from colesbury and corona10 April 28, 2025 13:44

bedevere-app bot added the awaiting core review label Apr 28, 2025

bedevere-app bot mentioned this pull request Apr 28, 2025

Incompatible change in internal string representation (encountered with Rust bindings) (?) #128972

Open

corona10 requested a review from serhiy-storchaka April 28, 2025 14:24

serhiy-storchaka reviewed Apr 28, 2025

View reviewed changes

encukou added 2 commits April 29, 2025 13:41

Simplify the __cplusplus branch

9d4d011

Remove redundant __STDC_VERSION__ check

95ffa4c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-128972: Add `_Py_ALIGN_AS` and revert `PyASCIIObject` memory layout. #133085

gh-128972: Add `_Py_ALIGN_AS` and revert `PyASCIIObject` memory layout. #133085

encukou commented Apr 28, 2025 •

edited by bedevere-app bot

Loading

serhiy-storchaka Apr 28, 2025

colesbury Apr 28, 2025

serhiy-storchaka Apr 28, 2025

colesbury Apr 28, 2025

serhiy-storchaka Apr 28, 2025

colesbury Apr 28, 2025

serhiy-storchaka Apr 28, 2025

colesbury Apr 28, 2025

encukou Apr 29, 2025

serhiy-storchaka Apr 28, 2025

serhiy-storchaka Apr 28, 2025

encukou Apr 29, 2025

encukou Apr 29, 2025

gh-128972: Add _Py_ALIGN_AS and revert PyASCIIObject memory layout. #133085

Are you sure you want to change the base?

gh-128972: Add _Py_ALIGN_AS and revert PyASCIIObject memory layout. #133085

Conversation

encukou commented Apr 28, 2025 • edited by bedevere-app bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gh-128972: Add `_Py_ALIGN_AS` and revert `PyASCIIObject` memory layout. #133085

gh-128972: Add `_Py_ALIGN_AS` and revert `PyASCIIObject` memory layout. #133085

encukou commented Apr 28, 2025 •

edited by bedevere-app bot

Loading