We want shallow clones and this issue tracks what needs to be done to get there.
Prerequisite tasks for bare clones
- Parse refspecs #474
- Write V2 Index Files (with TREE and EOIE extensions) #473
- Remote abstraction #475
- don't forget about `url.<base>.insteadOf` and `….pushInsteadOf` (see the config example below this list)
- match refspecs for fetching
- fetch pack for update (`gix fetch` with fast-forward support #548)
- bare `clone` for `git-repository` #551
- http transport configuration (i.e. proxy settings, timeouts, user agent) (configure http transport #586)
- get progress message by stable id
- unbuffered progress messages - lines are buffered line by line, but that's it. Hence we receive everything in real-time already.
- Advanced HTTP transport configuration #627
- support for classifying spurious errors in error return types
- auto-tags
- ditch `naive` negotiation in favor of the proper `consecutive` one (or else clones from some servers may fail) via integrate `gix-negotiate` #861
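
As a reminder of what the `Remote` abstraction has to honour, here is the plain-git equivalent of the `insteadOf` rewrites mentioned above; the URLs and the `gh:` prefix are just examples:

```sh
# Rewrite the 'gh:' prefix to HTTPS for fetching (and pushing, unless overridden below)…
git config --global url."https://github.com/".insteadOf gh:
# …but rewrite it to SSH for pushes only.
git config --global url."git@github.com:".pushInsteadOf gh:

# 'gh:byron/gitoxide' now fetches via HTTPS and pushes via SSH.
git clone gh:byron/gitoxide
```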
Follow-ups of ditching the naive implementation
Most of these are optional, but represent opportunities to make `gix` better, so they shouldn't be lost.
- nicer API for gix-config overrides #883
- 64-bit dates #892
- traverse-with-commitgraph #887
  - see if `commit_graph()` can return our own type connected to `Repo`, or if the graph can be made more convenient to use with `gix::Id` - not really, but getting traversal with commitgraph support would be great. Probably it can simply be retro-fitted to the existing traversal. But then again, it would speed up generating ids, but most people using that kind of traversal would just want to access commits plainly, which forces loading them anyway. So it's probably OK to keep it as is.
  - retro-fitted commit-graph support, because it will be useful to some
- visualize commit-graphs as SVG #893
- `gix corpus` MVP #897 (initial version with tracing)
- `gix corpus` with a little more to do
Additional tasks
These are for correctness, but don't block `cargo` integration as no `cargo` tests depend on them.
- allow downgrading connections like git does; should be no problem. Maybe find a way to let the user enforce protocol versions - let's see how git does it (see the sketch after this list).
- make it possible to not send streaming posts - that is only needed for posting packs, and some git servers can't handle the 'chunked' encoding that results from it. Lastly, `git` itself uses `content-length` as the buffer is pre-created in memory.
- additional HTTP configuration as per cargo configuration
- correctly re-obtain credential helper configuration for each URL (but don't rewrite, it's Remote's only)
- make pack tempfiles appear like they do in git to help with cleanup in case of SIGKILL.
- ability to turn off the 'is currently checked out' sanity check to emulate `git fetch --update-head-ok`. Cargo passes it to the CLI and maybe it's something we will need too, just to make its updates work.
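
For comparison, a sketch of how git exposes the protocol-version and HEAD-update behaviour mentioned above; the remote name is an assumption:

```sh
# git honours `protocol.version` (0, 1 or 2), which can be used to enforce
# or effectively downgrade the wire protocol for a single invocation…
git -c protocol.version=1 fetch origin

# …and `--update-head-ok` disables the 'is currently checked out' sanity check,
# which is what cargo passes on the command line today.
git fetch --update-head-ok origin
```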
Tasks for proper transport configuration
- try to implement complex `http.<url>.*` based option overrides (example below)
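
For illustration, this is the kind of layered configuration git resolves by picking the most specific `http.<url>.*` match; the host and values are made up:

```sh
# Defaults that apply to every HTTP(S) remote…
git config --global http.userAgent "our-agent/1.0"
git config --global http.lowSpeedLimit 1000   # abort if slower than 1000 bytes/s…
git config --global http.lowSpeedTime 30      # …for 30 consecutive seconds

# …and overrides that only apply to URLs matching this prefix.
git config --global http."https://example.com/".proxy "http://proxy.example.com:3128"
git config --global http."https://example.com/".sslVerify false
```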
Tasks for shallow cloning
Research needed, but the libgit2 issue might be helpful for more hints.
Research
- a nice overview document
- packs are forced non-thin when `.git/shallow` is present (it contains the commits that form the shallow boundary: present, but without parents)
- shallow repositories can be cloned from, and remotes send that information along, making the clone shallow, too (see the example below)
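
A quick way to observe both of the points above locally; the repository URL is a placeholder:

```sh
# A depth-1 clone records its shallow boundary in .git/shallow…
git clone --depth 1 https://example.com/repo.git
cat repo/.git/shallow        # one commit hash per line: present, but parentless locally

# …and since the shallow information is advertised, cloning from a shallow
# repository yields another shallow repository.
git clone file://$PWD/repo repo-copy
cat repo-copy/.git/shallow
```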
Watch out
- Much of this work is happening in `git-repository`, which is tracked in `gix` towards 1.0 #470.
- subsequent fetches must not accidentally change the depth of the repository, but only fetch what changed in between (see the command sketch at the end of this section). See point 2 in this comment. Note that I believe the pathological CPU usage of shallow clones on the server side has been fixed by now.
- Ed Page states that, according to GitHub employees, shallow clones are only expensive if `depth > 1` or when converting them back to having full history.
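
For reference, these are the fetch behaviours to match: a plain fetch must keep the current boundary, and only an explicit request may change it (the URL is a placeholder):

```sh
git clone --depth 1 https://example.com/repo.git && cd repo

git fetch               # keeps the existing depth, only fetching what changed in between
git fetch --deepen=10   # explicitly moves the shallow boundary 10 commits further back
git fetch --unshallow   # converts back to full history - the expensive case mentioned above
```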