Proposal
Problem statement
std::io::copy provides a way to do a complete copy, but it blocks indefinitely until the copy has completed. There's no way to portably do partial copies within the standard library, and it's not extensible to other types.
Motivation, use-cases
std::io::copy is very useful for quick and dirty copying from a reader to a writer. However, it does have some serious limitations:
- You can intercept cases where writes would fail, but signal interrupts are ignored: std::io::copy retries on ErrorKind::Interrupted instead of returning it. This is particularly bad when polling file descriptors outside async runtimes, because it breaks signal handling.
- The current source code is hard-coded to optimize only for native file handles and (as of RFC 2930) BufRead. It's not extensible to other types and, more importantly, userland types can't optimize for each other either.
- As it loops indefinitely until completed, it cannot be used in async runtimes.
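The first limitation can be observed with a small experiment. The sketch below uses a hypothetical reader, InterruptOnce (not part of any library), that reports ErrorKind::Interrupted once before yielding its data. It relies only on the documented behavior that std::io::copy retries interrupted operations, so the caller never sees the interruption.

```rust
use std::io::{self, Read};

/// Hypothetical reader (for illustration only) that reports
/// `ErrorKind::Interrupted` once before handing out its data.
struct InterruptOnce<'a> {
    data: &'a [u8],
    interrupted: bool,
    interrupts_reported: usize,
}

impl Read for InterruptOnce<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.interrupted {
            self.interrupted = true;
            self.interrupts_reported += 1;
            return Err(io::Error::from(io::ErrorKind::Interrupted));
        }
        // `&[u8]` implements `Read` and advances itself as it is consumed.
        self.data.read(buf)
    }
}

/// Returns (bytes copied, interruptions the reader reported).
fn copy_through_interrupt() -> io::Result<(u64, usize)> {
    let mut reader = InterruptOnce {
        data: b"hello",
        interrupted: false,
        interrupts_reported: 0,
    };
    let mut sink: Vec<u8> = Vec::new();
    // The interruption is retried inside `io::copy`; it is never returned here.
    let copied = io::copy(&mut reader, &mut sink)?;
    Ok((copied, reader.interrupts_reported))
}
```

Calling copy_through_interrupt() succeeds with all five bytes copied even though the reader reported one interruption, which is exactly the signal a poll-based caller would have wanted to act on.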
There's also a number of outstanding performance gaps:
- Copying from a Vec<u8> or VecDeque<u8> to a file or socket currently involves an intermediate copy, one that's very easily avoidable. VecDeque<u8> itself only tries to copy one half at a time, when in many cases it could just use a single vectored write. This would make it a lot more viable as a general byte buffer when paired with a &[u8] reader or &mut [u8] writer.
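As a rough illustration of the "single vectored write" mentioned above, both halves of a VecDeque<u8> can be handed to the writer in one call using existing std APIs. The helper name write_deque_vectored is made up for this sketch; it is not part of the proposal or of std.

```rust
use std::collections::VecDeque;
use std::io::{self, IoSlice, Write};

/// Hypothetical helper: offers both halves of the deque to the writer in a
/// single `write_vectored` call, with no intermediate buffer.
/// Like any write, this may be partial; the return value is the byte count
/// the writer actually accepted.
fn write_deque_vectored<W: Write>(
    deque: &VecDeque<u8>,
    writer: &mut W,
) -> io::Result<usize> {
    // `as_slices` exposes the ring buffer's two contiguous halves in order.
    let (front, back) = deque.as_slices();
    writer.write_vectored(&[IoSlice::new(front), IoSlice::new(back)])
}
```

Writers that don't override write_vectored fall back to writing only the first non-empty slice, so a caller would still need to loop on the returned count.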
The RFC draft linked below goes into detail about potential use cases, including some sample code. It's admittedly a fairly complex proposal.
Solution sketches
Option 1: a Splicer trait. This is the primary option, due to concerns about whether adding the needed "type" field to std::fs::File (and possibly std::net::TcpStream) is possible, or whether too many packages would break because something in the dependency chain transmutes between those types and raw file descriptors.
// Exported from `std::io`
pub trait Splicer<'a, R, W> {
    fn new(reader: &'a mut R, writer: &'a mut W) -> io::Result<Self>;
    fn splice(&mut self) -> io::Result<usize>;
    fn reader(&mut self) -> &'a mut R;
    fn writer(&mut self) -> &'a mut W;
}
pub trait Read {
    // ...
    type Splicer<'a, W: io::Write>: Splicer<'a, Self, W> = DefaultSplicer<'a, Self, W>;
    fn splicer_to<'a, W: io::Write>(
        &'a mut self,
        dest: &'a mut W
    ) -> io::Result<Self::Splicer<'a, W>> {
        Self::Splicer::new(self, dest)
    }
    fn can_efficiently_splice_to(&self) -> bool;
}
pub trait Write {
    // ...
    type Splicer<'a, R: io::Read>: Splicer<'a, R, Self> = R::Splicer<'a, Self>;
    fn splicer_from<'a, R: io::Read>(
        &'a mut self,
        src: &'a mut R
    ) -> io::Result<Self::Splicer<'a, R>> {
        // Reader first, then writer, matching `Splicer::new`.
        Self::Splicer::new(src, self)
    }
    fn can_efficiently_splice_from(&self) -> bool;
}
pub struct DefaultSplicer<'a, R, W> { ... }
impl<'a, R: io::Read, W: io::Write> Splicer<'a, R, W> for DefaultSplicer<'a, R, W> { ... }
pub struct OptimalSplicer<'a, R, W> { ... }
impl<'a, R: io::Read, W: io::Write> Splicer<'a, R, W> for OptimalSplicer<'a, R, W> { ... }
pub fn splicer<'a, R: io::Read, W: io::Write>(reader: &'a mut R, writer: &'a mut W) -> OptimalSplicer<'a, R, W>;
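To make the intended call pattern more concrete, here is a minimal sketch of what a read → write fallback splicer might look like on stable Rust. It is an assumption-laden illustration, not the proposed implementation: the BufferedSplicer name, its with_capacity constructor, and the internal buffering strategy are all hypothetical, and it sidesteps the associated-type machinery above entirely.

```rust
use std::io::{self, Read, Write};

/// Hypothetical, simplified stand-in for a fallback splicer.
/// Bytes that were read but not yet written live in `buf[start..end]`.
pub struct BufferedSplicer<'a, R, W> {
    reader: &'a mut R,
    writer: &'a mut W,
    buf: Vec<u8>,
    start: usize,
    end: usize,
}

impl<'a, R: Read, W: Write> BufferedSplicer<'a, R, W> {
    pub fn with_capacity(capacity: usize, reader: &'a mut R, writer: &'a mut W) -> Self {
        // A zero-length buffer would make every read look like end of input.
        assert!(capacity > 0, "capacity must be non-zero");
        Self { reader, writer, buf: vec![0; capacity], start: 0, end: 0 }
    }

    /// Performs at most one read and one write, then returns control to the
    /// caller. Returns the number of bytes written; `Ok(0)` means the reader
    /// is exhausted and nothing is pending. Errors, including
    /// `ErrorKind::Interrupted`, are passed through rather than retried.
    pub fn splice(&mut self) -> io::Result<usize> {
        if self.start == self.end {
            let n = self.reader.read(&mut self.buf)?;
            if n == 0 {
                return Ok(0);
            }
            self.start = 0;
            self.end = n;
        }
        let written = self.writer.write(&self.buf[self.start..self.end])?;
        if written == 0 {
            return Err(io::ErrorKind::WriteZero.into());
        }
        // A short write leaves the remainder pending for the next call.
        self.start += written;
        Ok(written)
    }
}
```

Because each splice call does a bounded amount of work, a caller can interleave it with signal checks or file descriptor polling, which is the behavior std::io::copy cannot offer today.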
Option 2: splice_to/splice_from methods, preferred if adding those fields to std::fs::File (and possibly std::net::TcpStream) won't create compatibility concerns.
// Exported from `std::io`
trait Read {
    // ...
    fn splice_to<W: io::Write>(&mut self, dest: &mut W) -> io::Result<usize>;
    fn can_efficiently_splice_to(&self) -> bool;
}
trait Write {
    // ...
    fn splice_from<R: io::Read>(&mut self, src: &mut R) -> io::Result<usize> {
        src.splice_to(self)
    }
    fn can_efficiently_splice_from(&self) -> bool;
}
pub fn splice<R: io::Read, W: io::Write>(reader: &mut R, writer: &mut W) -> io::Result<usize>;
And in either case, std::io::copy would just splice in its loop instead of doing a complicated copy, with the read → write copy fallback moved to the default splicer.
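The Option 2 shape, and the simplified copy loop it would allow, can be approximated on stable Rust with an extension trait. Everything named here (SpliceTo, copy_via_splice, the 4096-byte fallback buffer) is hypothetical and chosen for the sketch; the real proposal puts the method on Read itself. The point is only to show a type overriding the fallback to skip the intermediate buffer.

```rust
use std::io::{self, Read, Write};

/// Hypothetical extension trait standing in for a `Read::splice_to` method.
pub trait SpliceTo: Read {
    /// Moves some bytes to `dest`; `Ok(0)` signals end of input.
    /// Fallback: read into a stack buffer, then write all of it out.
    fn splice_to<W: Write>(&mut self, dest: &mut W) -> io::Result<usize> {
        let mut buf = [0u8; 4096];
        let n = self.read(&mut buf)?;
        dest.write_all(&buf[..n])?;
        Ok(n)
    }
}

/// Uses the fallback unchanged.
impl SpliceTo for io::Cursor<Vec<u8>> {}

/// A byte slice is already in memory, so it can be written directly.
impl SpliceTo for &[u8] {
    fn splice_to<W: Write>(&mut self, dest: &mut W) -> io::Result<usize> {
        let data: &[u8] = *self;
        let n = dest.write(data)?;
        if n == 0 && !data.is_empty() {
            return Err(io::ErrorKind::WriteZero.into());
        }
        // Keep only the part the writer did not accept.
        *self = &data[n..];
        Ok(n)
    }
}

/// Roughly what a copy-to-completion loop reduces to under this sketch.
pub fn copy_via_splice<R: SpliceTo, W: Write>(
    reader: &mut R,
    writer: &mut W,
) -> io::Result<u64> {
    let mut total = 0u64;
    loop {
        match reader.splice_to(writer) {
            Ok(0) => return Ok(total),
            Ok(n) => total += n as u64,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

Callers that need partial copies would call splice_to directly and decide for themselves what to do on ErrorKind::Interrupted, while the loop above keeps today's retry-until-done behavior.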
Links and related work
Drafted an RFC here that goes into much gorier detail: https://github.com/dead-claudia/rust-rfcs/blob/io-splice/text/0000-io-splice.md
Decided to file it here first rather than go straight into the RFC process. It's complex enough it'll probably go through an RFC anyways just because of all the cross-cutting concerns.
What happens now?
This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.