-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
PERF: Improve efficiency of BlockValuesRefs._clear_dead_references(...)
#59643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5ec8fad
to
06565aa
Compare
@@ -26,6 +30,10 @@ from pandas._libs.util cimport ( | |||
) | |||
|
|||
|
|||
cdef extern from "Python.h": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surprised that this also makes a difference - Cython doesn't translate that is None
call for a cdef object into the C-equivalent expression already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does! But PyWeakref_GetObject
returns a PyObject*
which is not comparable to a PyObject
(not-ponter) and None
is the latter type. This is the reason for this quite ugly hack. If you know a better way I would really like to hear about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did Cython give you a compilation error? I would expect None
to be a pointer to a PyObject as well and for Cython to handle this within its native syntax.
Of course I am wrong a lot! So I might have the wrong mental model for what Cython is doing, but worth double checking this can't be expressed more naturally first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I don't want to go down a rabbit hole on this - just double check real quick. If you spend anything more than 2 minutes trying to find an answer, let's just roll with what we've got
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem at all. Cython gives this error when using None
Error compiling Cython file:
------------------------------------------------------------
...
# all these objects stay alive, e.g. df.items() for wide DataFrames
# see GH#55245 and GH#55008
if force or len(self.referenced_blocks) > self.clear_counter:
self.referenced_blocks = [
ref for ref in self.referenced_blocks
if PyWeakref_GetObject(ref) is not None
^
------------------------------------------------------------
pandas/_libs/internals.pyx:914:44: Invalid types for 'is_not' (PyObject *, Python object)
Thanks @Tolker-KU |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Improves efficiency of
BlockValuesRefs._clear_dead_references(...)
by allowing for more efficient C code-gen as a function lookup and FastCall is eliminated from a tight loop.Before
After