Skip to content

fix(next_by_topology): Correct the commit iteration order #853

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

kalkin
Copy link
Contributor

@kalkin kalkin commented May 14, 2023

Iterating over gix::revision::walk::Platform::all() returns commits in
the wrong order.

The expected commit output order is the same as git log --graph

*   0a534f214a
|\
| * ebcd820b65
| * b56e33210a
| * 46ff9bbc93
| * 6db404e52d
|/
*   f79134e683
|\
| * 29daaad4ad
|/
* dbeb4b38c9
|\
| * 2398fb5d9f
| * c550167bec
| * 8d6c064c26

Similar iteration using gix::traverse::Platform without this patch

* 0a534f214a
* f79134e683
| * ebcd820b65
* dbeb4b38c9
| * 29daaad4ad
| * b56e33210a
* 163f943697
| * 2398fb5d9f
| * 46ff9bbc93
* 00f8b9701e
* cca06c8bd7
| * c550167bec
| * 6db404e52d
| * 8d6c064c26

Iterating over `gix::revision::walk::Platform::all()` returns commits in
the wrong order.

The expected commit output order is the same as `git log --graph`

```
*   0a534f214a
|\
| * ebcd820b65
| * b56e33210a
| * 46ff9bbc93
| * 6db404e52d
|/
*   f79134e683
|\
| * 29daaad4ad
|/
* dbeb4b38c9
|\
| * 2398fb5d9f
| * c550167bec
| * 8d6c064c26
```

Similar iteration using `gix::traverse::Platform` without this patch

```
*   0a534f214a
*   f79134e683
| * ebcd820b65
* dbeb4b38c9
| * 29daaad4ad
| * b56e33210a
* 163f943697
| * 2398fb5d9f
| * 46ff9bbc93
* 00f8b9701e
* cca06c8bd7
| * c550167bec
| * 6db404e52d
| * 8d6c064c26
```
@kalkin
Copy link
Contributor Author

kalkin commented May 14, 2023

Hmm, i run locally only the tests in gix-traverse directory. It seems like you have tests which assure the iteration order. This confuses me. Did i misunderstood how Platform should iterate over commits, or are the tests wrong?

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your help.

There are tests to validate iteration order and these are usually compared to git at least by hand. If something isn't consistent, I believe you could validate the existing iteration order by comparing what's expected against what git produces, and make adjustments if there is a mismatch.

If you are encountering a case where the order doesn't match, you could try to reproduce the issue in a test and then fix it.

@Byron
Copy link
Member

Byron commented Jun 2, 2023

If there is no time to tackle this, I'd prefer if you could close it and reopen once that changes. Thanks for your consideration.

@kalkin
Copy link
Contributor Author

kalkin commented Jun 2, 2023

Sorry, I pushed this to the end of the workqueue. I will try to find time this weekend, if you don't see any changes here on Monday, feel free to close it.

@Byron
Copy link
Member

Byron commented Jun 5, 2023

Closing as per our agreement, thanks for having let me know.

@Byron Byron closed this Jun 5, 2023
@kalkin
Copy link
Contributor Author

kalkin commented Jun 5, 2023

I looked yesterday into it, and I'm not sure if the order of ids in your tests are wrong or i have some error in my thinking.

I looked into the test graph_sorted_commits. I checked make_traversal_repo_for_commits_with_dates.tar.xz with git log and tig, both show a different order then the one you hardcoded in the tests.

I could just fix all the tests, but I want to be sure I don't have an error in my thinking.

@Byron
Copy link
Member

Byron commented Jun 5, 2023

Despite the name graph_sorted_commits which might imply some ordering, the test validates the natural ordering of commits which is by topology. tig and git log sort commits by date, which would explain the different order.

Maybe a PR would be to rename the function to something that indicates which ordering is used?

@kalkin
Copy link
Contributor Author

kalkin commented Jun 5, 2023

tig and git log sort commits by date, which would explain the different order.

The dates of the commits are all the same. (Btw the next tests e.g. committer_date_sorted_commits look strange to me, because the fixture creation shell script does GIT_COMMITTER_DATE="2000-01-02 00:00:00 +0000" on every commit)

Screenshot of three invocations of tig:

  1. just tig
  2. tig --topo-order
  3. tig --date-order

Screenshot_2023-06-05_11-11-49

All three outputs are identical, but also differ from hardcoded test values. I'm confused.

EDIT: It makes sense that they are all identical, because the commit dates are all the same...

@Byron
Copy link
Member

Byron commented Jun 6, 2023

Even though I think that the implementation is correct, there definitely seems to be some digging to be done, probably along with improving the tests to provide slightly different times for each commit (there is a couple of tests doing this).

After having worked on gix-revision it also feels that what's really missing here is gix_revision::Walk, with similar features to what's already in gix-traverse, but with support for the commit-graph acceleration data-structure. Maybe something like Walk could even supersede what's currently in gix-traverse and while at it, use better tests (or at least revise the existing ones).

For now I don't have a strong incentive to work on it, but next time a feature relies on revision walking, I will definitely take a look as well.

@kalkin
Copy link
Contributor Author

kalkin commented Jun 6, 2023

Even though I think that the implementation is correct, there definitely seems to be some digging to be done, ...

For my current use case I will be sorting by date to achieve the results I expect, so I can abandon my patched fork. Thank you for your help.

@Byron
Copy link
Member

Byron commented Jun 8, 2023

I am taking a closer look at this now and upon inspection I think there is a lot off about the expectations, especially because they don't have a baseline at all.

When done I'd expect it to be aligned with git log.

@Byron Byron mentioned this pull request Jun 8, 2023
5 tasks
@Byron
Copy link
Member

Byron commented Jun 8, 2023

I have cleaned up the tests and added new ones and definitely confirm that topo-order is very different from what git does. The main difference seems to be that git somehow keeps commits together, whereas gix-traverse just provides them breadth-first, from first to last parent, which would look ludicrous when shown as graph view.

For now I didn't get to investigate how exactly git does its ordering, but I will eventually to see if it can be implemented without also querying the commit time.

The idea is the gixprovides the fastest possible traversal which means it decodes as little as possible of each commit and does the least amount of work, and that is topo-order as it is implemented now, so it's more likely that another sort-mode is added to be the same as git-topo-order, rather than changing the existing one

Let's see.

@Byron
Copy link
Member

Byron commented Jun 8, 2023

I solved the problem by improving the documentation and renaming Sorting::Topological to Sorting::BreadthFirst, differentiating the two.

It does turn out that what git does is way more advanced than our simple and fast breadth-first order, and for now this isn't implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants