Skip to content

branch details with gix: 2.6x faster! #4670

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Aug 21, 2024
Merged

Conversation

Byron
Copy link
Collaborator

@Byron Byron commented Aug 10, 2024

This PR is the first step towards providing branch details with gitoxide.

Tasks

  • git2 performance optimizations for 'apples-to-apples' comparison.
  • Fill in missing details in understanding
  • Add a couple of tests (validate merge-base and diff information)
  • Add benchmarks - ideally also with larger repositories
  • Use gix for commit-walk - more than twice as fast.
    Screenshot 2024-08-12 at 11 56 56, and it can certainly get even faster.
  • Use gix for tree-tree diff Screenshot 2024-08-13 at 10 24 34 - first results are promising, at 11 times faster on a single core, pure diffing performance. It will be slower if the diffed objects have to be retrieved/aren't cached, so in reality on gitlab it's just about 1.6x 1,85x of git2 (thanks to moderate multi-threading).
  • make it fast in the real world
  • improve tree-tree diff API
  • don't forget origin/478475-model-registry-markdown-support-for-version-description has 2 commit (itself) above the merge-base, but lists commit in thegix`
    • This is correct, actually, as there is only one commit and git2 for some reason counts the first parent, even though it's 'underneath' the mergebase.
  • Don't forget to re-enable all benchmarks!
  • the last performance improvement
    • 2.6x improvement overall!
    • With that, even once gix is faster at merge-bases, it will probably only make a difference in memory consumption, but not in performance.
  • Add support for CC based zlibng to flate2

Follow-Ups

Notes for the Reviewer

  • This PR also optimizes git2 for performance, which also levels the playing field as certain validations cost time, but aren't performed (yet) by gix.
  • gix is able to be 1.6x of git2 when it comes to tree-diffing, and it's noticeable. however, for now it's more like 1.5x as stock zlib has to be used until cmake is either available or flate2 can be configured to not use cmake at all.

Notes

  • When having a virtual branch named main it becomes 'entangled' with refs/heads/main. However, when refs/heads/main receives a new commit, main (the vbranch) knows nothing about it and also won't offer to rebase on top of it. It's still tied to origin/main. It's probably OK, even though I would have expected to be able to incorporate the changes somehow.
  • Eventually, we will want to make max-performance work on CI - shouldn't be too hard and would already work if docker wasn't used.
  • I have added sorting (lexicographically, increasing) to make both versions more comparable, and also because git branch sorts like that (but using the full ref name. In the end, the UI should probably sort and group like it wants, then the backend doesn't have to do that anymore. Then I realized the UI is already sorting by last-modified date.
  • For clean profiling, use gitbutler-cli branch details 478475-model-registry-markdown-support-for-version-description "pedropombeiro/454313/3-deprecate-fields" "mc/overwrite-jobs-instead-of-uniqueness-enforcement" "backfill-multiple-desired-sharding-key-small-table-compliance_framework_security_policies" "jay_mccure-master-patch-68663" "work-items-widgets-migration" "ah-dedup-records-migration" "455903-limits" "477889-implement-additional-telemetry-for-added-context-code-completion" "psi-ff-summduo" "450705-policies-mvp-combined-mode" "ia-arkose-token-verification-consitent-return" "hustewart-add-build-snippet-service" "bw-add-subscribed-filter" "adjust-new-user-signup-caps-validation" "bk/473748-backfill-owasp-top-10-null" "cwoolley-gitlab-master-patch-150b" "application-settings-pattern-poc-haml" "470051-document-backup-of-object-storage-data" "j_lar/add_redis_keys_to_monitor" "477306-glql-table-of-issues" "rlehmann1-fix-lint-issues-policies" "ban-ai-duo-settings-admin-group-form" "handle-empty-ff-merge-in-from-train-ref-strategy" "bw-run-markdown-benchmark-on-tag" "477813-follow-up-from-refactor-ai-ga-policies" "gitaly-ci-jobs-52550" "443369-update-self-managed-billing" "433082-iteration-rest-api-does-not-properly-filter-out-iterations-from-groups-with-no-access" "jkhabie-add-reachability-column-sbom" "jreporter-master-patch-d091" "vmairet-add-token-associations" "vij-remove-member-default-value" "autoflow/refactor-naming" "thutterer-danger-demo-settings" "443369-update-billed-shared-group-users" "fix_audit_event_flow" "16-11-stable-ee" "477712-health-check-repeated-click" "470701-branch-rule-editing-minimum-required-approvals-per-target-branch" "ensure_prepared_worker_automation" "janis-add-pages-to-usage-quotas" "quarantine-flaky-tests-spec-lib-gitlab-import_export-project-relation_factory_spec-rb-81" "440857-glql-integration" "472352-transition-dast-site-profiles-builds" "tachyons-pat-sharding-key-full" "368542-container-registry-metadata" "design-management-widget-migration" "ag/477091-seat-controls-fe-validation" "458831-remove-required_instance_ci_template-column" "455903-placeholder-limit" "gmh-add-approver-to-mr-graphql" "quarantine-flaky-tests-spec-lib-banzai-pipeline-plain_markdown_pipeline_spec-rb-106" "463258-create-common-solution-on-how-to-count-and-track-feature-flag-enabled-value" "mk/fix-empty-replication-details-view" "rodrigo/integrate-improved-user-mapping-in-direct-transfer" "16-9-stable-ee" "jmc-16.9-backport-shm-qa" "mattkasa/refactor-query-analyzers" "application-settings-pattern-poc" "464587-support-merge-requests-as-context-for-duo-chat-responses" "449139-fe-update-the-list-and-the-counter-after-inline-role-change-2" "jivanvl-update-permissions-container-expiration-policy" "hustewart-snippets-refactor" "477547_nullify_organization_id_migration" "slashmanov/add-release-filter-to-mr-list-app" "dagron1-vulnerability-resolution-supported-scanners" "464132-work-items-prevent-adding-more-than-10-items-to-hierarchy-widget-in-one-batch" "md-add-feature-flag-headers" "form-issue-jira-setting" "ph/mrDashboardNavCount" "mc_rocha-poc-ingest-license-sbom-security-policy""462398-refactor-pb-fe" "bk/473748-modify-finder" "471726-ignore-invalid-project-CI-with-pep-override-strategy" "autoflow/issue-created-event" "466718-add-api-endpoint-for-bulk-usage_data-sending" "vi-display-closed-status-for-branches" "docs/bprescott/20240618-dbmisc" "eduardosanz-master-patch-19280" "461761-refactor-clean-widgets-to-only-use-id-iid-and-fullpath-to-fetch-work-item-in-individual" "andrey-move-gdk-build" "reintroduce-create-jira-issue-form" "seed-placeholder-users-task" "ck3g-test-xray-job-do-not-merge" "463822-create-service-to-generate-metadata-for-group-instance-level"

stderr with cache-efficiency-debug enabled, showing decent object cache performance, but bad pack-cache performance. The object cache is the most important.

Copy link

vercel bot commented Aug 10, 2024

@Byron is attempting to deploy a commit to the GitButler Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions bot added the rust Pull requests that update Rust code label Aug 10, 2024
Copy link
Collaborator Author

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krlvi Maybe you could help answer a couple of probably minor questions - they will probably resolves themselves as well once tests are added.

@Byron Byron force-pushed the git2-to-gix branch 4 times, most recently from 735e76d to e70d7bb Compare August 11, 2024 15:21
@Byron Byron force-pushed the git2-to-gix branch 3 times, most recently from 1a49eda to 728960e Compare August 12, 2024 10:06
@Byron Byron force-pushed the git2-to-gix branch 8 times, most recently from b19bf43 to 9520bdb Compare August 14, 2024 09:32
Copy link

vercel bot commented Aug 14, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
gitbutler-web ✅ Ready (Inspect) Visit Preview 💬 Add feedback Aug 14, 2024 0:05am

@krlvi
Copy link
Member

krlvi commented Aug 15, 2024

the tree to tree diffing being 11 times faster is really impressive! looking forward to using that in other hot paths of the app :)

@Byron
Copy link
Collaborator Author

Byron commented Aug 16, 2024

the tree to tree diffing being 11 times faster is really impressive! looking forward to using that in other hot paths of the app :)

Unfortunately it's really only like that in the given microbenchmark, which just diffs a single-line changes of the same two blobs, 10k times. As gitoxide has a cache for this, it doesn't do much work after extracting both objects, whereas git2 doesn't seem to have this optimization (an optimization really coming from implementing rename-tracking which is disabled here).

In practice, on a real GitLab repo it still clocks in at 1.6x faster, which is something I can feel when using the UI.

This lists all branches that the user would see (and get details for).
@Byron Byron marked this pull request as ready for review August 21, 2024 18:33
@Byron Byron enabled auto-merge August 21, 2024 18:34
@Byron Byron changed the title branch details with gix branch details with gix: 2.6x faster! Aug 21, 2024
Copy link
Member

@krlvi krlvi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! cant wait to try it

@Byron Byron merged commit be4ddb1 into gitbutlerapp:master Aug 21, 2024
16 of 17 checks passed
@Byron Byron deleted the git2-to-gix branch August 21, 2024 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rust Pull requests that update Rust code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants