Description
Feature or enhancement
Proposal:
I want to refactor the different codec handlers in Python/codecs.c to use _PyUnicodeError_GetParams. Some codec handlers will be refactored as part of #126004, but others are not affected by that issue (namely, the ignore, namereplace, surrogateescape, and surrogatepass handlers do not suffer from crashes, or at least I wasn't able to make them crash easily).
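To illustrate why fetching the error parameters through one validating helper is attractive, here is a minimal, self-contained sketch. It does not use the CPython API: `mock_unicode_error` and `mock_get_params` are hypothetical stand-ins, and the clamping rule shown is an assumption chosen for illustration, not necessarily the rule _PyUnicodeError_GetParams applies.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields a UnicodeError carries.
   The real object lives in CPython; this is only a model. */
typedef struct {
    ptrdiff_t objlen;  /* length of the object being encoded/decoded */
    ptrdiff_t start;   /* user-settable, so it may be out of range */
    ptrdiff_t end;     /* user-settable, so it may be out of range */
} mock_unicode_error;

/* Fetch start/end clamped so that 0 <= start <= end <= objlen,
   and report the slice length. Returns 0 on success and -1 if
   no error object was given. (Assumed clamping rule.) */
int
mock_get_params(const mock_unicode_error *exc,
                ptrdiff_t *start, ptrdiff_t *end, ptrdiff_t *slen)
{
    if (exc == NULL) {
        return -1;
    }
    ptrdiff_t len = exc->objlen < 0 ? 0 : exc->objlen;
    ptrdiff_t s = exc->start;
    ptrdiff_t e = exc->end;
    if (s < 0) {
        s = 0;
    }
    if (s > len) {
        s = len;
    }
    if (e < s) {
        e = s;
    }
    if (e > len) {
        e = len;
    }
    *start = s;
    *end = e;
    *slen = e - s;
    return 0;
}
```

A handler written against such a helper can index the input with the returned values without repeating its own range checks, which is the kind of duplication the refactoring aims to remove.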
In addition, I also plan to split the handlers into functions instead of 2 or 3 big blocks of code handling a specific exception. For that reason, I will introduce the following helper macros:
#define _PyIsUnicodeEncodeError(EXC) \
    PyObject_TypeCheck(EXC, (PyTypeObject *)PyExc_UnicodeEncodeError)
#define _PyIsUnicodeDecodeError(EXC) \
    PyObject_TypeCheck(EXC, (PyTypeObject *)PyExc_UnicodeDecodeError)
#define _PyIsUnicodeTranslateError(EXC) \
    PyObject_TypeCheck(EXC, (PyTypeObject *)PyExc_UnicodeTranslateError)
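The intended shape of a split handler might look roughly like the sketch below. It is a self-contained model that does not include Python.h: the `mock_*` types, the `MOCK_IS_*` predicates, and the three per-exception functions are all hypothetical names invented for this illustration, with the predicates playing the role of the macros above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the exception type check. */
typedef enum {
    MOCK_ENCODE_ERROR,
    MOCK_DECODE_ERROR,
    MOCK_TRANSLATE_ERROR,
    MOCK_OTHER_ERROR
} mock_exc_kind;

typedef struct {
    mock_exc_kind kind;
} mock_exc;

/* Predicates mirroring the role of the proposed helper macros. */
#define MOCK_IS_ENCODE_ERROR(EXC)    ((EXC)->kind == MOCK_ENCODE_ERROR)
#define MOCK_IS_DECODE_ERROR(EXC)    ((EXC)->kind == MOCK_DECODE_ERROR)
#define MOCK_IS_TRANSLATE_ERROR(EXC) ((EXC)->kind == MOCK_TRANSLATE_ERROR)

/* One small function per exception type instead of one big block. */
static const char *
mock_replace_encode(const mock_exc *exc)
{
    (void)exc;
    return "?";             /* placeholder replacement for encoding */
}

static const char *
mock_replace_decode(const mock_exc *exc)
{
    (void)exc;
    return "\xEF\xBF\xBD";  /* U+FFFD encoded as UTF-8 */
}

static const char *
mock_replace_translate(const mock_exc *exc)
{
    (void)exc;
    return "\xEF\xBF\xBD";  /* U+FFFD encoded as UTF-8 */
}

/* The public handler only dispatches on the exception type. */
const char *
mock_replace_errors(const mock_exc *exc)
{
    if (MOCK_IS_ENCODE_ERROR(exc)) {
        return mock_replace_encode(exc);
    }
    if (MOCK_IS_DECODE_ERROR(exc)) {
        return mock_replace_decode(exc);
    }
    if (MOCK_IS_TRANSLATE_ERROR(exc)) {
        return mock_replace_translate(exc);
    }
    return NULL;            /* stands in for the error path */
}
```

With this layout, a fix to one exception type touches only its own function, so later diffs stay small.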
For handlers that need to be fixed, I will first fix them in place (without refactoring). Afterwards, I will refactor them and extract the relevant parts of the code into functions. That way, the diff will be easier to follow (I've observed that a diff is much harder to read when it does both, so I will revert that part in the existing PRs; EDIT: actually, there is no PR doing both the fixes and the split...).
I'm creating this issue to track the progress of the refactoring, provided no problems come up.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs
- gh-129173: Use _PyUnicodeError_GetParams in PyCodec_IgnoreErrors #129174
- gh-129173: Use _PyUnicodeError_GetParams in PyCodec_NameReplaceErrors #129135
- gh-129173: Use _PyUnicodeError_GetParams in PyCodec_SurrogatePassErrors #129134
- gh-129173: Use _PyUnicodeError_GetParams in PyCodec_SurrogateEscapeErrors #129175
- gh-129173: refactor PyCodec_ReplaceErrors into separate functions #129893
- gh-129173: simplify PyCodec_XMLCharRefReplaceErrors logic #129894
- gh-129173: refactor PyCodec_BackslashReplaceErrors into separate functions #129895