[commitgraph] implement basic, low-level read API #21

Merged
merged 25 commits into from
Oct 1, 2020
Commits
d1f0e9c
[commitgraph] implement basic, low-level read API
avoidscorn Sep 15, 2020
36953e0
[commitgraph] Take `info` dir as arg, not `objects` dir.
avoidscorn Sep 17, 2020
3c92761
[commitgraph] Remove `Kind` enum.
avoidscorn Sep 17, 2020
724f391
[commitgraph] Include in `make check` target.
avoidscorn Sep 17, 2020
1ce8468
[commitgraph] Ditch pre-generated test repos.
avoidscorn Sep 17, 2020
b59bd5e
Merge from main.
avoidscorn Sep 17, 2020
7c405ab
[commitgraph] Don't re-export graph_file symbols at crate level.
avoidscorn Sep 18, 2020
d8c2007
[commitgraph] Rename CommitData -> Commit.
avoidscorn Sep 26, 2020
f451822
[commitgraph] Rename GraphFile -> File.
avoidscorn Sep 26, 2020
66588f2
[commitgraph] Remove unused error variant.
avoidscorn Sep 26, 2020
6cf5cd8
[commitgraph] Add some doc comments.
avoidscorn Sep 26, 2020
000748c
[commitgraph] Include Conor in crate manifest.
avoidscorn Sep 26, 2020
be0e845
[commitgraph] Don't export Commit symbol at crate level.
avoidscorn Sep 26, 2020
185d14b
[commitgraph] Rearrange some `use` statements.
avoidscorn Sep 26, 2020
21e4527
[commitgraph] Use crate::graph::Graph instead of crate::Graph.
avoidscorn Sep 26, 2020
5e78213
[commitgraph] Attempt to fix bash script execution on Windows.
avoidscorn Sep 26, 2020
ca5b801
Merge branch 'main' into commit-graph
Byron Sep 28, 2020
9ae1f4b
[commitgraph] Assure git doesn't try to sign commits when fixtures ar…
Byron Oct 1, 2020
7026961
[commitgraph] refactor
Byron Oct 1, 2020
2ed0037
[commitgraph] refactor
Byron Oct 1, 2020
3c8640e
[commitgraph] refactor Graph, Position, and access module
Byron Oct 1, 2020
d2eec1d
[commitgraph] refactor graph::init module
Byron Oct 1, 2020
6f90bee
[commitgraph] Rename LexPosition to 'file::Position'
Byron Oct 1, 2020
c4b14c1
[commitgraph] refactor
Byron Oct 1, 2020
8b003a0
[commitgraph] refactor file::init
Byron Oct 1, 2020
8 changes: 8 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions Makefile
@@ -94,6 +94,8 @@ check: ## Build all code in suitable configurations
 	    && cargo check --features fast-sha1 \
 	    && cargo check --features interrupt-handler \
 	    && cargo check --features disable-interrupts
+	cd git-commitgraph && cargo check --all-features \
+	    && cargo check
 
 unit-tests: ## run all unit tests
 	cargo test --all --no-fail-fast
4 changes: 3 additions & 1 deletion README.md
@@ -197,7 +197,9 @@ become available.
 * [ ] API documentation with examples
 
 ### git-commitgraph
-* Access to all capabilities provided by the file format, as well as their maintenance
+* [x] read-only access
+* [x] Graph lookup of commit information to obtain timestamps, generation and parents
+* [ ] create and update graphs and graph files
 * [ ] API documentation with examples
 
 ### git-config
15 changes: 12 additions & 3 deletions git-commitgraph/Cargo.toml
@@ -1,11 +1,11 @@
 [package]
 name = "git-commitgraph"
 version = "0.0.0"
-repository = "https://github.com/Byron/git-oxide"
+repository = "https://github.com/Byron/gitxoxide"
 documentation = "https://git-scm.com/docs/commit-graph#:~:text=The%20commit-graph%20file%20is%20a%20supplemental%20data%20structure,or%20in%20the%20info%20directory%20of%20an%20alternate."
 license = "MIT/Apache-2.0"
-description = "A WIP crate of the gitoxide project dedicated implementing the git commitgraph file format and its maintenance"
-authors = ["Sebastian Thiel <[email protected]>"]
+description = "A crate of the gitoxide project dedicated implementing the git commitgraph file format and its maintenance"
+authors = ["Conor Davis <[email protected]>", "Sebastian Thiel <[email protected]>"]
 edition = "2018"
 
 [lib]
@@ -14,3 +14,12 @@ doctest = false
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
 
 [dependencies]
+git-object = { version = "^0.4.0", path = "../git-object" }
+
+bstr = { version = "0.2.13", default-features = false, features = ["std"] }
+byteorder = "1.2.3"
+filebuffer = "0.4.0"
+quick-error = "2.0.0"
+
+[dev-dependencies]
+tempfile = "3.1.0"
126 changes: 126 additions & 0 deletions git-commitgraph/src/file/access.rs
@@ -0,0 +1,126 @@
+use crate::file::{self, commit::Commit, File, COMMIT_DATA_ENTRY_SIZE};
+use git_object::{borrowed, HashKind, SHA1_SIZE};
+use std::{
+    convert::{TryFrom, TryInto},
+    fmt::{Debug, Formatter},
+    path::Path,
+};
+
+/// Access
+impl File {
+    /// Returns the commit data for the commit located at the given lexicographical position.
+    ///
+    /// `pos` must be less than `self.num_commits()`.
+    ///
+    /// # Panics
+    ///
+    /// Panics if `pos` is out of bounds.
+    pub fn commit_at(&self, pos: file::Position) -> Commit<'_> {
+        Commit::new(self, pos)
+    }
+
+    pub fn hash_kind(&self) -> HashKind {
+        HashKind::Sha1
+    }
+
+    /// Returns the 20-byte SHA-1 at the given index in our list of (sorted) SHA-1 hashes.
+    /// The position must be less than `self.num_commits()`.
+    // copied from git-odb/src/pack/index/access.rs
+    pub fn id_at(&self, pos: file::Position) -> borrowed::Id<'_> {
+        assert!(
+            pos.0 < self.num_commits(),
+            "expected lexicographical position less than {}, got {}",
+            self.num_commits(),
+            pos.0
+        );
+        let pos: usize = pos
+            .0
+            .try_into()
+            .expect("an architecture able to hold 32 bits of integer");
+        let start = self.oid_lookup_offset + (pos * SHA1_SIZE);
+        borrowed::Id::try_from(&self.data[start..start + SHA1_SIZE]).expect("20 bytes SHA1 to be alright")
+    }
+
+    pub fn iter_base_graph_ids(&self) -> impl Iterator<Item = borrowed::Id<'_>> {
+        let base_graphs_list = match self.base_graphs_list_offset {
+            Some(v) => &self.data[v..v + (SHA1_SIZE * self.base_graph_count as usize)],
+            None => &[],
+        };
+        base_graphs_list
+            .chunks_exact(SHA1_SIZE)
+            .map(|bytes| borrowed::Id::try_from(bytes).expect("20 bytes SHA1 to be alright"))
+    }
+
+    pub fn iter_commits(&self) -> impl Iterator<Item = Commit<'_>> {
+        (0..self.num_commits()).map(move |i| self.commit_at(file::Position(i)))
+    }
+
+    pub fn iter_ids(&self) -> impl Iterator<Item = borrowed::Id<'_>> {
+        (0..self.num_commits()).map(move |i| self.id_at(file::Position(i)))
+    }
+
+    // copied from git-odb/src/pack/index/access.rs
+    pub fn lookup(&self, id: borrowed::Id<'_>) -> Option<file::Position> {
+        let first_byte = id.first_byte() as usize;
+        let mut upper_bound = self.fan[first_byte];
+        let mut lower_bound = if first_byte != 0 { self.fan[first_byte - 1] } else { 0 };
+
+        // Bisect using indices
+        // TODO: Performance of V2 could possibly be better if we would be able to do a binary search
+        // on 20 byte chunks directly, but doing so requires transmuting and that is unsafe, even though
+        // it should not be if the bytes match up and the type has no destructor.
+        while lower_bound < upper_bound {
+            let mid = (lower_bound + upper_bound) / 2;
+            let mid_sha = self.id_at(file::Position(mid));
+
+            use std::cmp::Ordering::*;
+            match id.cmp(&mid_sha) {
+                Less => upper_bound = mid,
+                Equal => return Some(file::Position(mid)),
+                Greater => lower_bound = mid + 1,
+            }
+        }
+        None
+    }
+
+    /// Returns the number of commits in this graph file.
+    ///
+    /// The maximum valid `file::Position` that can be used with this file is one less than
+    /// `num_commits()`.
+    pub fn num_commits(&self) -> u32 {
+        self.fan[255]
+    }
+
+    pub fn path(&self) -> &Path {
+        &self.path
+    }
+}
+
+impl File {
+    /// Returns the byte slice for the given commit in this file's Commit Data (CDAT) chunk.
+    pub(crate) fn commit_data_bytes(&self, pos: file::Position) -> &[u8] {
+        assert!(
+            pos.0 < self.num_commits(),
+            "expected lexicographical position less than {}, got {}",
+            self.num_commits(),
+            pos.0
+        );
+        let pos: usize = pos
+            .0
+            .try_into()
+            .expect("an architecture able to hold 32 bits of integer");
+        let start = self.commit_data_offset + (pos * COMMIT_DATA_ENTRY_SIZE);
+        &self.data[start..start + COMMIT_DATA_ENTRY_SIZE]
+    }
+
+    /// Returns the byte slice for this file's entire Extra Edge List (EDGE) chunk.
+    pub(crate) fn extra_edges_data(&self) -> Option<&[u8]> {
+        Some(&self.data[self.extra_edges_list_range.clone()?])
+    }
+}
+
+impl Debug for File {
+    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
+        write!(f, r#"File("{}")"#, self.path.display())
+    }
+}
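
The `lookup` method in the diff above uses the 256-entry fan-out table to narrow the search range by the first byte of the id, then bisects the sorted ids within that range. The sketch below illustrates that technique on a small in-memory table. It is an illustration under stated assumptions, not the crate's implementation: `IdTable`, `Id`, `num_ids`, and the `id` helper are hypothetical names, and the layout (ids stored back to back starting at offset 0) is a simplification of the on-disk format.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a 20-byte SHA-1 object id.
type Id = [u8; 20];

/// Hypothetical in-memory table; this is NOT the crate's `File` type.
struct IdTable {
    /// `fan[b]` holds the number of ids whose first byte is `<= b`.
    fan: [u32; 256],
    /// Sorted ids stored back to back, 20 bytes each.
    data: Vec<u8>,
}

impl IdTable {
    /// Builds the table from any list of ids; they are sorted and de-duplicated here.
    fn new(mut ids: Vec<Id>) -> Self {
        ids.sort();
        ids.dedup();
        let mut fan = [0u32; 256];
        for id in &ids {
            fan[id[0] as usize] += 1;
        }
        // Turn per-byte counts into cumulative counts.
        for i in 1..256 {
            fan[i] += fan[i - 1];
        }
        let data = ids.iter().flat_map(|id| id.iter().copied()).collect();
        IdTable { fan, data }
    }

    /// Total number of ids, read from the last fan-out entry.
    fn num_ids(&self) -> u32 {
        self.fan[255]
    }

    /// Returns the 20 bytes at `pos`; panics if `pos >= num_ids()`.
    fn id_at(&self, pos: u32) -> &[u8] {
        assert!(pos < self.num_ids(), "position {} out of bounds", pos);
        let start = pos as usize * 20;
        &self.data[start..start + 20]
    }

    /// Fan-out lookup followed by a binary search within the narrowed range.
    fn lookup(&self, id: &Id) -> Option<u32> {
        let first = id[0] as usize;
        let mut upper = self.fan[first];
        let mut lower = if first != 0 { self.fan[first - 1] } else { 0 };
        while lower < upper {
            let mid = lower + (upper - lower) / 2;
            match id[..].cmp(self.id_at(mid)) {
                Ordering::Less => upper = mid,
                Ordering::Equal => return Some(mid),
                Ordering::Greater => lower = mid + 1,
            }
        }
        None
    }
}

/// Hypothetical helper: an id with the given first and last byte, zeros in between.
fn id(first: u8, last: u8) -> Id {
    let mut out = [0u8; 20];
    out[0] = first;
    out[19] = last;
    out
}
```

With ids whose first bytes are `0x00`, `0xab`, `0xab`, and `0xff`, a lookup for an id starting with `0x10` finds `fan[0x10] == fan[0x0f]` and returns `None` without comparing any ids. Computing the midpoint as `lower + (upper - lower) / 2` is a choice made in this sketch to avoid overflow in the addition; the diff above uses `(lower_bound + upper_bound) / 2`.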