Skip to content

fix!: consider non-files as pruned officially. #1730

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 15 additions & 3 deletions gitoxide-core/src/repository/clean.rs
Original file line number Diff line number Diff line change
Expand Up @@ -161,11 +161,16 @@ pub(crate) mod function {
if entry.disk_kind.is_none() {
entry.disk_kind = workdir
.join(gix::path::from_bstr(entry.rela_path.as_bstr()))
.metadata()
.symlink_metadata()
.ok()
.and_then(|e| gix::dir::entry::Kind::try_from_file_type(e.file_type()));
.map(|e| e.file_type().into());
}
let mut disk_kind = entry.disk_kind.expect("present if not pruned");
let Some(mut disk_kind) = entry.disk_kind else {
if debug {
writeln!(err, "DBG: ignoring unreadable entry at '{}' ", entry.rela_path).ok();
}
continue;
};
if !keep {
if debug {
writeln!(err, "DBG: prune '{}' as -x or -p is missing", entry.rela_path).ok();
Expand All @@ -183,6 +188,12 @@ pub(crate) mod function {
}

match disk_kind {
Kind::NonFile => {
if debug {
writeln!(err, "DBG: skipped non-file at '{}'", entry.rela_path).ok();
}
continue;
}
Kind::File | Kind::Symlink => {}
Kind::Directory => {
if !directories {
Expand Down Expand Up @@ -254,6 +265,7 @@ pub(crate) mod function {
"WOULD remove"
},
suffix = match disk_kind {
Kind::NonFile => unreachable!("always skipped earlier"),
Kind::Directory if entry.property == Some(gix::dir::entry::Property::EmptyDirectory) => {
" empty"
}
Expand Down
22 changes: 12 additions & 10 deletions gix-dir/src/entry.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
use crate::walk::ForDeletionMode;
use crate::{Entry, EntryRef};
use std::borrow::Cow;
use std::fs::FileType;

/// A way of attaching additional information to an [Entry] .
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Ord, PartialOrd)]
Expand All @@ -25,6 +26,10 @@ pub enum Property {
/// The kind of the entry, seated in their kinds available on disk.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Ord, PartialOrd)]
pub enum Kind {
/// Something that is not a file, like a named pipe or character device.
///
/// These can only exist in the filesystem.
NonFile,
Comment on lines +29 to +32
Copy link
Member

@EliahKagan EliahKagan Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the best thing is to call filesystem entries of kinds that are not representable in Git repositories. In my experience, usually "file" is either construed broadly to include such entries well as directories and symbolic links, or construed narrowly to mean "regular file" and thus exclude directories and symbolic links.

I think it is unintuitive for regular files, directories, and symbolic links all not to be NonFiles, at the same time as FIFOs, sockets, character and block devices, and other more exotic kinds of filesystem entries are NonFiles (though at least one such kind of entry, a whiteout, is very intuitively a non-file).

I don't know if it's possible to name NonFile better--which is why I'm saying this instead of opening a PR to rename it. The documentation comment can probably be expanded to clarify fully what NonFile is everything other than, as well as to list the other common examples of NonFiles, and I do plan to open a PR for that. It may be that documentation in other places can be usefully expanded too; I'll look into that.

This would not be the first technical usage of "non-" with a non-literal meaning. For example, in git commit -am message, only commit is said to be a "non-option argument," even though the operand message is unambiguously an argument that is not an option argument. Sometimes this is hard to avoid, or maybe not worth avoiding.

However, this does seem to be an area where confusion is possible, or at least possible for me. For example, when I read the description and commit message in #1629, which quoted from a comment in an early version of Git a few weeks before Git began to track symbolic links, I took "non-regular files" to encompass symbolic links and was worried that the code there might wrongly remove support for them.

In fact, the code there deliberately preserved support for them (as did #1727, which also used that phrase). But the same version of the comment that Git developers used early on to describe a situation that encompassed the absence of symlink support was taken to encompass the presence of symlink support when cited.

Another possible source of confusion is that some kinds of NonFiles are distinguished from related entities by being files. For example, a FIFO (named pipe) differs from an anonymous pipe by being a file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to make this point of ambiguity so clear.

I'd definitely welcome a PR which changes the variant name and/or improves the documentation - right now it's definitely very brief.

Other also comes to mind as catch-all for everything that isn't the other variants. Technically that's a good fit as the logic literally does it like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #1734 for this. For now, I have not attempted to rename NonFile.

/// The entry is a blob, executable or not.
File,
/// The entry is a symlink.
Expand Down Expand Up @@ -147,20 +152,17 @@ impl Entry {
}
}

impl Kind {
/// Try to represent the file type `t` as `Entry`, or return `None` if it cannot be represented.
///
/// The latter can happen if it's a `pipe` for instance.
pub fn try_from_file_type(t: std::fs::FileType) -> Option<Self> {
Some(if t.is_dir() {
impl From<std::fs::FileType> for Kind {
fn from(value: FileType) -> Self {
if value.is_dir() {
Kind::Directory
} else if t.is_symlink() {
} else if value.is_symlink() {
Kind::Symlink
} else if t.is_file() {
} else if value.is_file() {
Kind::File
} else {
return None;
})
Kind::NonFile
}
}
}

Expand Down
10 changes: 2 additions & 8 deletions gix-dir/src/walk/classify.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,10 +20,7 @@ pub fn root(
let mut last_length = None;
let mut path_buf = worktree_root.to_owned();
// These initial values kick in if worktree_relative_root.is_empty();
let file_kind = path_buf
.symlink_metadata()
.ok()
.and_then(|m| entry::Kind::try_from_file_type(m.file_type()));
let file_kind = path_buf.symlink_metadata().ok().map(|m| m.file_type().into());
let mut out = path(&mut path_buf, buf, 0, file_kind, || None, options, ctx)?;
let worktree_root_is_repository = out
.disk_kind
Expand All @@ -35,10 +32,7 @@ pub fn root(
}
path_buf.push(component);
buf.extend_from_slice(gix_path::os_str_into_bstr(component.as_os_str()).expect("no illformed UTF8"));
let file_kind = path_buf
.symlink_metadata()
.ok()
.and_then(|m| entry::Kind::try_from_file_type(m.file_type()));
let file_kind = path_buf.symlink_metadata().ok().map(|m| m.file_type().into());

out = path(
&mut path_buf,
Expand Down
2 changes: 1 addition & 1 deletion gix-dir/src/walk/readdir.rs
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ pub(super) fn recursive(
current_bstr,
if prev_len == 0 { 0 } else { prev_len + 1 },
None,
|| entry.file_type().ok().and_then(entry::Kind::try_from_file_type),
|| entry.file_type().ok().map(Into::into),
opts,
ctx,
)?;
Expand Down
11 changes: 8 additions & 3 deletions gix-dir/tests/dir/walk.rs
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,11 @@ fn one_top_level_fifo() {
}
);

assert_eq!(entries, &[], "Non-files are simply pruned by default");
assert_eq!(
entries,
&[entry("top", Untracked, NonFile),],
"Non-files are like normal files, but with a different state"
);
}

#[test]
Expand Down Expand Up @@ -99,10 +103,11 @@ fn fifo_in_traversal() {
&[
entry_nokind(".git", Pruned).with_property(DotGit).with_match(Always),
entry("dir-with-file/nested-file", Untracked, File),
entry("dir/nested", Untracked, NonFile),
entry("file", Untracked, File),
entry("top", Untracked, NonFile),
],
"Non-files are not even pruned, they are ignored entirely.\
If one day this isn't what we want, we can create an own filetype for them"
"Non-files only differ by their disk-kind"
);
}

Expand Down
7 changes: 6 additions & 1 deletion gix-status/src/index_as_worktree/types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,12 @@ impl Outcome {
pub enum Change<T = (), U = ()> {
/// This corresponding file does not exist in the worktree anymore.
Removed,
/// The type of file changed compared to the worktree, i.e. a symlink s now a file.
/// The type of file changed compared to the worktree, i.e. a symlink is now a file, or a file was replaced with a named pipe.
///
/// ### Deviation
///
/// A change to a non-file is marked as `modification` in Git, but that's related to the content which we can't evaluate.
/// Hence, a type-change is considered more appropriate.
Type,
/// This worktree file was modified in some form, like a permission change or content change or both,
/// as compared to this entry.
Expand Down
23 changes: 17 additions & 6 deletions gix-status/src/index_as_worktree_with_renames/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,7 @@ pub(super) mod function {

let tracker = options
.rewrites
.map(gix_diff::rewrites::Tracker::<rewrite::ModificationOrDirwalkEntry<'index, T, U>>::new)
.map(gix_diff::rewrites::Tracker::<ModificationOrDirwalkEntry<'index, T, U>>::new)
.zip(filter);
let rewrite_outcome = match tracker {
Some((mut tracker, (mut filter, mut attrs))) => {
Expand All @@ -168,12 +168,12 @@ pub(super) mod function {
let (change, location) = match event {
Event::IndexEntry(record) => {
let location = Cow::Borrowed(record.relative_path);
(rewrite::ModificationOrDirwalkEntry::Modification(record), location)
(ModificationOrDirwalkEntry::Modification(record), location)
}
Event::DirEntry(entry, collapsed_directory_status) => {
let location = Cow::Owned(entry.rela_path.clone());
(
rewrite::ModificationOrDirwalkEntry::DirwalkEntry {
ModificationOrDirwalkEntry::DirwalkEntry {
id: rewrite::calculate_worktree_id(
options.object_hash,
worktree,
Expand Down Expand Up @@ -222,7 +222,7 @@ pub(super) mod function {
}
}
Some(src) => {
let rewrite::ModificationOrDirwalkEntry::DirwalkEntry {
let ModificationOrDirwalkEntry::DirwalkEntry {
id,
entry,
collapsed_directory_status,
Expand Down Expand Up @@ -387,8 +387,11 @@ pub(super) mod function {

impl<T, U> gix_dir::walk::Delegate for Delegate<'_, '_, T, U> {
fn emit(&mut self, entry: EntryRef<'_>, collapsed_directory_status: Option<Status>) -> Action {
let entry = entry.to_owned();
self.tx.send(Event::DirEntry(entry, collapsed_directory_status)).ok();
// Status never shows untracked non-files
if entry.disk_kind != Some(gix_dir::entry::Kind::NonFile) {
let entry = entry.to_owned();
self.tx.send(Event::DirEntry(entry, collapsed_directory_status)).ok();
}

if self.should_interrupt.load(Ordering::Relaxed) {
Action::Cancel
Expand Down Expand Up @@ -466,6 +469,10 @@ pub(super) mod function {
ModificationOrDirwalkEntry::Modification(c) => c.entry.mode.to_tree_entry_mode(),
ModificationOrDirwalkEntry::DirwalkEntry { entry, .. } => entry.disk_kind.map(|kind| {
match kind {
Kind::NonFile => {
// Trees are never tracked for rewrites, so we 'pretend'.
gix_object::tree::EntryKind::Tree
}
Kind::File => gix_object::tree::EntryKind::Blob,
Kind::Symlink => gix_object::tree::EntryKind::Link,
Kind::Repository | Kind::Directory => gix_object::tree::EntryKind::Tree,
Expand Down Expand Up @@ -500,6 +507,10 @@ pub(super) mod function {
};

Ok(match kind {
Kind::NonFile => {
// Go along with unreadable files, they are passed along without rename tracking.
return Ok(object_hash.null());
}
Kind::File => {
let platform = attrs
.at_entry(rela_path, None, objects)
Expand Down
4 changes: 2 additions & 2 deletions gix-status/tests/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@ publish = false
rust-version = "1.65"

[[test]]
name = "worktree"
path = "worktree.rs"
name = "status"
path = "status/mod.rs"

[features]
gix-features-parallel = ["gix-features/parallel"]
Expand Down
1 change: 1 addition & 0 deletions gix-status/tests/fixtures/generated-archives/.gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
status_unchanged.tar
status_changed.tar
symlink_stack.tar
status_nonfile.tar
Copy link
Member

@EliahKagan EliahKagan Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could generated archives be committed for this?

I think all tests using fixtures this covers are run only only on Unix-like systems (so the Windows differences discussed in #1727 (review) would not be a problem).

With the tar command, a FIFO can be included in a .tar archive and is recreated when that archive is extracted. I'm not sure about other approaches to taring and untaring, but gix-testtools uses the tar crate. The tar crate seems to support FIFOs, which are listed here, whereas kinds of filesystem entries that don't tar, like sockets, are omitted from that enum.

(A test of the tar crate with FIFOs is run only on Linux, but I understand the comment there to mean that this is solely due to a CI limitation and not to any decreased ability to archive and extract them on other Unix-like systems.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, I ran into this locally and on MacOS it definitely didn't properly retain the pipe status of the non-files, or it couldn't recreate it. They ended up as normal files after extracting them from the tar archive.

Since the script doesn't do much, I think it's also fine to not have it cached.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a commit was pushed, that might still be accessible, that has the .tar file, then I could look at it to see if the FIFO didn't end up in it, or if the issue was extraction. (But if not, or if it's not convenient to find, then I could still possibly reproduce this, next time I have access to a macOS system.)

19 changes: 19 additions & 0 deletions gix-status/tests/fixtures/status_nonfile.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
#!/usr/bin/env bash
set -eu -o pipefail

git init -q untracked
(cd untracked
touch file && git add file && git commit -m "just to get an index for the test-suite"

mkfifo pipe
git status
)

git init -q tracked-swapped
(cd tracked-swapped
touch file && git add file && git commit -m "it starts out as trackable file"

rm file && mkfifo file
git status
)

17 changes: 17 additions & 0 deletions gix-status/tests/status/index_as_worktree.rs
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,10 @@ fn fixture(name: &str, expected_status: &[Expectation<'_>]) -> Outcome {
fixture_filtered(name, &[], expected_status)
}

fn nonfile_fixture(name: &str, expected_status: &[Expectation<'_>]) -> Outcome {
fixture_filtered_detailed("status_nonfile", name, &[], expected_status, |_| {}, false)
}

fn fixture_with_index(
name: &str,
prepare_index: impl FnMut(&mut gix_index::State),
Expand Down Expand Up @@ -185,6 +189,19 @@ fn status_removed() -> EntryStatus {
Change::Removed.into()
}

#[test]
#[cfg(unix)]
fn nonfile_untracked_are_not_visible() {
// And generally, untracked aren't visible here.
nonfile_fixture("untracked", &[]);
}

#[test]
#[cfg(unix)]
fn tracked_changed_to_non_file() {
nonfile_fixture("tracked-swapped", &[(BStr::new(b"file"), 0, Change::Type.into())]);
}

#[test]
fn removed() {
let out = fixture(
Expand Down
Loading
Loading