Description
Feature or enhancement
Replace the current setprofile mechanism with the PEP 669 API for cProfile.
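For readers unfamiliar with PEP 669, here is a minimal sketch of what event-based monitoring looks like from Python, using the public sys.monitoring module (Python 3.12+). It only counts Python function entries and is not the cProfile implementation. The helper name count_python_calls and the tool name "sketch-profiler" are made up for illustration.

```python
import sys


def count_python_calls(func):
    """Run func() and return how many Python function entries were observed."""
    mon = sys.monitoring  # PEP 669 API, available on Python 3.12+
    tool = mon.PROFILER_ID  # tool id reserved for profilers
    count = 0

    def on_py_start(code, instruction_offset):
        # Called at the start of every Python function while PY_START is active.
        nonlocal count
        count += 1

    mon.use_tool_id(tool, "sketch-profiler")  # raises ValueError if the id is taken
    mon.register_callback(tool, mon.events.PY_START, on_py_start)
    mon.set_events(tool, mon.events.PY_START)
    try:
        func()
    finally:
        # Always switch events off and release the tool id again.
        mon.set_events(tool, mon.events.NO_EVENTS)
        mon.register_callback(tool, mon.events.PY_START, None)
        mon.free_tool_id(tool)
    return count


def fib(n):
    return 1 if n <= 1 else fib(n - 1) + fib(n - 2)


if hasattr(sys, "monitoring"):
    print(count_python_calls(lambda: fib(5)))
```

Unlike sys.setprofile, the tool has to claim an id and release it explicitly, which is one reason a profiler instance ends up responsible for disabling itself.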
Pitch
It's faster.
Before (each line: baseline seconds, profiled seconds, profiled / baseline):
Hanoi: 0.342869599997357, 1.4579414000036195, 4.252174587699983
fib: 0.15549560000363272, 0.7588815999988583, 4.88040561907301
Hanoi: 0.33156009999947855, 1.394542599999113, 4.2060024713507635
fib: 0.16397210000286577, 0.7469802000050549, 4.555532313070332
Hanoi: 0.3341937000004691, 1.394005099995411, 4.171248889471747
fib: 0.15704140000161715, 0.7399598000047263, 4.711877250184387
Hanoi: 0.33133889999589883, 1.3821628000005148, 4.171447421409387
fib: 0.15387790000386303, 0.754370700000436, 4.902397940064804
Hanoi: 0.3403002000050037, 1.3969628000049852, 4.105089564991276
fib: 0.15468130000226665, 0.761441399998148, 4.922646758121312
After:
Hanoi: 0.3318088000014541, 1.2147841000041808, 3.661096691826309
fib: 0.15522490000148537, 0.6360437999974238, 4.097562955372092
Hanoi: 0.33085879999998724, 1.1502422000048682, 3.476535005279934
fib: 0.15410060000431258, 0.6056611999956658, 3.9302974808580635
Hanoi: 0.35002540000277804, 1.145935599997756, 3.273864125256799
fib: 0.1540030999967712, 0.5974198000039905, 3.8792712615299036
Hanoi: 0.3338971999983187, 1.1459307999975863, 3.431986851052829
fib: 0.16891020000184653, 0.6197690999979386, 3.6692224625343126
Hanoi: 0.3318254999976489, 1.1875411000000895, 3.578812056362467
fib: 0.1544136999946204, 0.5971600999982911, 3.867274082669449
That is a 20%+ reduction in profiling overhead.
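The figure can be checked against the numbers above. The sketch below assumes that "overhead" means profiled time minus baseline time, which the issue does not state explicitly. The timings are copied from the runs above and rounded to four decimal places.

```python
from statistics import mean

# (baseline_seconds, profiled_seconds) pairs from the runs above,
# rounded to four decimal places.
BEFORE = {
    "Hanoi": [(0.3429, 1.4579), (0.3316, 1.3945), (0.3342, 1.3940),
              (0.3313, 1.3822), (0.3403, 1.3970)],
    "fib": [(0.1555, 0.7589), (0.1640, 0.7470), (0.1570, 0.7400),
            (0.1539, 0.7544), (0.1547, 0.7614)],
}
AFTER = {
    "Hanoi": [(0.3318, 1.2148), (0.3309, 1.1502), (0.3500, 1.1459),
              (0.3339, 1.1459), (0.3318, 1.1875)],
    "fib": [(0.1552, 0.6360), (0.1541, 0.6057), (0.1540, 0.5974),
            (0.1689, 0.6198), (0.1544, 0.5972)],
}


def mean_overhead(runs):
    """Average extra time spent when profiling (assumed: profiled - baseline)."""
    return mean(profiled - baseline for baseline, profiled in runs)


def overhead_reduction(name):
    """Fraction by which the mean overhead shrank between BEFORE and AFTER."""
    return 1 - mean_overhead(AFTER[name]) / mean_overhead(BEFORE[name])


for name in BEFORE:
    print(f"{name}: overhead reduced by {overhead_reduction(name):.1%}")
```

Under that assumption, both benchmarks come out a little above 20%.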
The incentive is probably not in doubt, but I ran into some issues while implementing it.
- There is a very slight backward compatibility issue, though it is more of a design decision: a profiler now has to disable its own profiling.
# This works on main, but I don't think that's intuitive.
pr = cProfile.Profile()
pr.enable()
pr = cProfile.Profile()
pr.disable()
We can make it work as before; I just think this is not the right way to do it with PEP 669. Because of this, I changed an old (15 years) test in test_cprofile.
- We need to get the actual C function from the descriptor, for which I reused the code from the current legacy tracing. However, _PyInstrumentation_MISSING is not exposed, so I had to store it locally (reading it from sys every time is a bit expensive).
- On that matter, are we going to expose some C APIs? It would be nice if I didn't have to get sys.monitoring and do everything from there. We have some defined constants, but some APIs could be handy. We may be able to reduce the overhead a bit if we have an interface like PyEval_SetProfile.
Addendum:
Benchmark Code
import timeit
hanoi_setup = """
import cProfile
def test():
    def TowerOfHanoi(n, source, destination, auxiliary):
        if n == 1:
            return
        TowerOfHanoi(n - 1, source, auxiliary, destination)
        TowerOfHanoi(n - 1, auxiliary, destination, source)
    TowerOfHanoi(16, "A", "B", "C")
pr = cProfile.Profile()
"""
fib_setup = """
import cProfile
def test():
    def fib(n):
        if n <= 1:
            return 1
        return fib(n - 1) + fib(n - 2)
    fib(21)
pr = cProfile.Profile()
"""
test_baseline = """
test()
"""
test_profile = """
pr.enable()
test()
pr.disable()
"""
baseline = timeit.timeit(test_baseline, setup=hanoi_setup, number=100)
profile = timeit.timeit(test_profile, setup=hanoi_setup, number=100)
print(f"Hanoi: {baseline}, {profile}, {profile / baseline}")
baseline = timeit.timeit(test_baseline, setup=fib_setup, number=100)
profile = timeit.timeit(test_profile, setup=fib_setup, number=100)
print(f"fib: {baseline}, {profile}, {profile / baseline}")