
gh-132917: Use /proc/self/status for mem usage info. #133544


Merged (3 commits) on May 8, 2025


nascheme (Member) commented May 6, 2025

Using smaps_rollup is slower, and we can get similar info from /proc/self/status. We don't need the extra accuracy that smaps_rollup provides.

Profiling shows that reading the smaps_rollup file takes on the order of 30 ms. Reading status is much faster. Some background on the difference:

https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/10966#note_410194443
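As a rough illustration of the approach described above, here is a minimal Python sketch that pulls the resident and swap figures out of /proc/self/status. It is not the C code from this PR; the function names `parse_proc_kb_fields` and `read_status_memory` are hypothetical, and the sketch assumes the usual `Name:   value kB` line layout of that file.

```python
def parse_proc_kb_fields(text, names):
    """Return {name: value_in_kB} for the requested 'Name:   123 kB' lines.

    Hypothetical helper for illustration; lines that do not match are skipped.
    """
    wanted = set(names)
    result = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or key not in wanted:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            result[key] = int(parts[0])
    return result


def read_status_memory(path="/proc/self/status"):
    """Read VmRSS and VmSwap (in kB), or return None if the file is unavailable."""
    try:
        with open(path, encoding="ascii", errors="replace") as f:
            text = f.read()
    except OSError:
        # Not on Linux, or /proc is not mounted.
        return None
    return parse_proc_kb_fields(text, ("VmRSS", "VmSwap"))


if __name__ == "__main__":
    print(read_status_memory())
```

Keeping the parser separate from the file read means the same helper can be pointed at smaps_rollup text (fields `Rss` and `Swap`) for a side-by-side check.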

Using smaps_rollup is quite a lot slower and we can get the similar info
from /proc/self/status.
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @nascheme for commit 5b3621d 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133544%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 6, 2025
@nascheme nascheme added performance Performance or resource usage topic-free-threading labels May 7, 2025
nascheme (Member, Author) commented May 7, 2025

A more precise run-time comparison: with the script from GH-132917 running in 10 parallel threads, the /proc/self/smaps_rollup version takes 3000 µs per call and the /proc/self/status version takes 15 µs per call.

I also tested the macOS version; it takes 4.5 µs per call. The full GC pass takes roughly 70 ms.
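For anyone who wants to get a feel for the gap on their own machine, a simple timing loop along these lines could be used. This is only a sketch: it is single-threaded and measures Python-level file reads, so it does not reproduce the 10-thread measurement of the C code quoted above, and the helper name `mean_read_time_us` is made up for this example.

```python
import time


def mean_read_time_us(path, calls=1000):
    """Return the mean time in microseconds to open and fully read `path`.

    Returns None if the file cannot be read (for example, on non-Linux systems).
    """
    try:
        start = time.perf_counter()
        for _ in range(calls):
            with open(path, "rb") as f:
                f.read()
        elapsed = time.perf_counter() - start
    except OSError:
        return None
    return elapsed / calls * 1e6


if __name__ == "__main__":
    for proc_file in ("/proc/self/smaps_rollup", "/proc/self/status"):
        result = mean_read_time_us(proc_file, calls=200)
        if result is None:
            print(f"{proc_file}: not available")
        else:
            print(f"{proc_file}: {result:.1f} us per call")
```

The absolute numbers will depend on the kernel version and on how many mappings the process has.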

nascheme (Member, Author) commented May 7, 2025

Here is a comparison of the info from /proc/self/smaps_rollup and /proc/self/status. This is on a Linux 6.1.0 kernel with 16 GB of RAM and 16 GB of swap. I ran a Python program that just allocates a bunch of memory in a loop while printing the proc info.

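| Rss (smaps_rollup, kB) | Swap (smaps_rollup, kB) | VmRSS (status, kB) | VmSwap (status, kB) |
|---:|---:|---:|---:|
| 2,361,876 | 0 | 2,361,684 | 0 |
| 4,714,092 | 0 | 4,713,924 | 0 |
| 7,066,308 | 0 | 7,066,164 | 0 |
| 9,418,588 | 0 | 9,418,404 | 0 |
| 11,770,800 | 0 | 11,770,644 | 0 |
| 14,122,420 | 0 | 14,122,288 | 0 |
| 14,846,552 | 1,625,612 | 14,845,604 | 1,625,612 |
| 15,439,252 | 3,384,872 | 15,478,708 | 3,345,056 |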
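To make the size of the disagreement concrete, the figures above can be compared row by row. The sketch below copies the eight rows verbatim and computes the gap between the combined resident-plus-swap totals from each source; the values are assumed to be in kB (the unit both proc files use), and the function names are hypothetical.

```python
# Rows copied from the comparison above: (Rss, Swap, VmRSS, VmSwap), assumed kB.
ROWS = [
    (2_361_876, 0, 2_361_684, 0),
    (4_714_092, 0, 4_713_924, 0),
    (7_066_308, 0, 7_066_164, 0),
    (9_418_588, 0, 9_418_404, 0),
    (11_770_800, 0, 11_770_644, 0),
    (14_122_420, 0, 14_122_288, 0),
    (14_846_552, 1_625_612, 14_845_604, 1_625_612),
    (15_439_252, 3_384_872, 15_478_708, 3_345_056),
]


def total_differences(rows):
    """Return (Rss + Swap) - (VmRSS + VmSwap) for each row."""
    return [
        (rss + swap) - (vm_rss + vm_swap)
        for rss, swap, vm_rss, vm_swap in rows
    ]


def max_relative_difference(rows):
    """Return the largest |difference| as a fraction of the smaps_rollup total."""
    return max(
        abs((rss + swap) - (vm_rss + vm_swap)) / (rss + swap)
        for rss, swap, vm_rss, vm_swap in rows
    )


if __name__ == "__main__":
    print(total_differences(ROWS))
    print(f"{max_relative_difference(ROWS):.6%}")
```

On these rows the combined totals from the two sources never differ by more than 948 kB, which is well under 0.1% of the memory in use.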

Yhg1s (Member) left a comment


This looks like a more efficient approach and the change is pretty simple... But unless @hugovk wants to reopen the release branch I think it should just go into beta 2. (Also I still have the same concerns I mentioned before, but they're not pressing enough to worry about for b1.)

hugovk (Member) commented May 7, 2025

Let's keep this for b2, thanks!

@nascheme nascheme enabled auto-merge (squash) May 7, 2025 16:53
@nascheme nascheme merged commit 751db4e into python:main May 8, 2025
38 checks passed
@nascheme nascheme added the needs backport to 3.14 bugs and security fixes label May 8, 2025
@miss-islington-app

Thanks @nascheme for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 8, 2025
…133544)

On Linux, use /proc/self/status for mem usage info.  Using smaps_rollup is quite a lot slower and
we can get the similar info from /proc/self/status.
(cherry picked from commit 751db4e)

Co-authored-by: Neil Schemenauer <[email protected]>
@bedevere-app

bedevere-app bot commented May 8, 2025

GH-133718 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label May 8, 2025
nascheme added a commit that referenced this pull request May 8, 2025
… (gh-133718)

On Linux, use /proc/self/status for mem usage info.  Using smaps_rollup is quite a lot slower and
we can get the similar info from /proc/self/status.
(cherry picked from commit 751db4e)

Co-authored-by: Neil Schemenauer <[email protected]>