Description
I came across an unexpected performance degradation on Windows when using File::read_to_end(&mut self, buf: &mut Vec<u8>): the larger the capacity of the Vec passed in as buf, the longer the call takes, regardless of how many bytes are actually read.
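use std::fs::File;
use std::io::Read;

// BUFFER_SIZE is the capacity under test; vary it to reproduce the slowdown.
let mut buffer = Vec::with_capacity(BUFFER_SIZE);
let mut file = File::open("some_1kb_file.txt").expect("opening file");
let metadata = file.metadata().expect("reading metadata");
let len = metadata.len();
assert_eq!(len, 1024);
// This is the call whose runtime grows with BUFFER_SIZE.
file.read_to_end(&mut buffer).expect("reading file");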
With the above code, increasing BUFFER_SIZE linearly increases the runtime of read_to_end, even though we always read 1024 bytes.
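For anyone who wants to reproduce this, here is a minimal timing sketch. The helper names `time_read_to_end` and `write_test_file` are made up for illustration and are not part of std; the sketch only measures the `read_to_end` call itself, and the timings will of course vary by machine.

```rust
use std::fs::File;
use std::io::{self, Read, Write};
use std::path::Path;
use std::time::{Duration, Instant};

/// Hypothetical helper: reads `path` to the end into a Vec pre-allocated
/// with `capacity` bytes. Returns the number of bytes read and the time
/// spent inside `read_to_end` only (open and allocation are excluded).
fn time_read_to_end(path: &Path, capacity: usize) -> io::Result<(usize, Duration)> {
    let mut buffer = Vec::with_capacity(capacity);
    let mut file = File::open(path)?;
    let start = Instant::now();
    let read = file.read_to_end(&mut buffer)?;
    Ok((read, start.elapsed()))
}

/// Hypothetical helper: creates a 1024-byte file to read back.
fn write_test_file(path: &Path) -> io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(&[0u8; 1024])
}
```

Calling `time_read_to_end` in a loop over capacities such as 1 KiB, 1 MiB and 64 MiB, and printing the returned `Duration` for each, should show whether the runtime scales with capacity on a given platform.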
This doesn't seem to happen on other OSes. At first I assumed I was doing something wrong, but with help from folks at StackOverflow we realized most of the time is spent in NtReadFile. One can get around this by using file.read_exact(&mut buffer[..len]) (with buffer already resized to len bytes and len converted to usize) or file.take(len).read_to_end(&mut buffer) instead.
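Spelled out, the two workarounds might look roughly like the sketch below. The function names `read_with_take` and `read_with_read_exact` are hypothetical, and both variants assume the file does not change size between the metadata query and the read.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper: limits the reader to the file's reported length
/// with `Read::take`, then reads into a Vec with the given capacity.
fn read_with_take(path: &Path, capacity: usize) -> io::Result<Vec<u8>> {
    let file = File::open(path)?;
    let len = file.metadata()?.len();
    let mut buffer = Vec::with_capacity(capacity);
    // `take` consumes the file and stops after `len` bytes.
    file.take(len).read_to_end(&mut buffer)?;
    Ok(buffer)
}

/// Hypothetical helper: sizes the buffer to the file's reported length
/// and fills it with `read_exact`. The Vec must have that many
/// initialized elements, since capacity alone does not make it sliceable.
fn read_with_read_exact(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len() as usize;
    let mut buffer = vec![0u8; len];
    file.read_exact(&mut buffer)?;
    Ok(buffer)
}
```

Note that `read_exact` returns an error if the file turns out to be shorter than the reported length, whereas the `take` variant simply returns fewer bytes.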
I don't know what the implications of querying the file size and using it in read_to_end would be, or whether it'd be possible to get around this another way, but at the very least we could add a warning to the documentation about this.
Meta
rustc 1.68.2 (9eb3afe9e 2023-03-27)
binary: rustc
commit-hash: 9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0
commit-date: 2023-03-27
host: x86_64-pc-windows-msvc
release: 1.68.2
LLVM version: 15.0.6