Skip to content

Implement clone() for TcpStream, UnixStream, and UdpSocket #11894

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 5, 2014

Conversation

alexcrichton
Copy link
Member

This is part of the overall strategy I would like to take when approaching
issue #11165. The only two I/O objects that reasonably want to be "split" are
the network stream objects. Everything else can be "split" by just creating
another version.

The initial idea I had was the literally split the object into a reader and a
writer half, but that would just introduce lots of clutter with extra interfaces
that were a little unnnecssary, or it would return a ~Reader and a ~Writer which
means you couldn't access things like the remote peer name or local socket name.

The solution I found to be nicer was to just clone the stream itself. The clone
is just a clone of the handle, nothing fancy going on at the kernel level.
Conceptually I found this very easy to wrap my head around (everything else
supports clone()), and it solved the "split" problem at the same time.

The cloning support is pretty specific per platform/lib combination:

  • native/win32 - uses some specific WSA apis to clone the SOCKET handle
  • native/unix - uses dup() to get another file descriptor
  • green/all - This is where things get interesting. When we support full clones
    of a handle, this implies that we're allowing simultaneous writes
    and reads to happen. It turns out that libuv doesn't support two
    simultaneous reads or writes of the same object. It does support
    one read and one write at the same time, however. Some extra
    infrastructure was added to just block concurrent writers/readers
    until the previous read/write operation was completed.

I've added tests to the tcp/unix modules to make sure that this functionality is
supported everywhere.

@alexcrichton
Copy link
Member Author

Another possible name is dup as suggested by @kballard because this clone isn't the traditional clone in that this version can fail.

@lilyball
Copy link
Contributor

cc me

@bnoordhuis
Copy link
Contributor

native/unix - uses dup() to get another file descriptor

I don't know if it's a design consideration but dup() and dup2() are not fork-safe.

If another thread calls fork() between the calls to dup() and fcntl(FD_CLOEXEC) or ioctl(FIOCLEX), then the file descriptor leaks into the new process. You get strange bugs that way[1] and it's a security loophole.

dup3() on Linux and FreeBSD >= 10 help mitigate that. POSIX.1-2008 has fcntl(F_DUPFD_CLOEXEC).

The issue is not unique to dup(), of course. open(), accept(), revcmsg(SCM_RIGHTS), etc. all have the same flaw.

[1] For example, if the parent process adds the file descriptor to an epoll set and subsequently closes it but the file descriptor is still alive in the child process, then epoll_wait() will still report events for the closed file descriptor. Worse, you cannot remove the file descriptor because, hey, it's closed - epoll_ctl() returns EBADF.

@alexcrichton
Copy link
Member Author

Right now the runtime isn't fork-safe for a variety of reasons, but it it questionable whether we want to continue to add reasons why it's not fork-safe.

@bill-myers
Copy link
Contributor

At first glance, it seems that it would be faster and simpler to put the fd/SOCKET in an Arc and cloning that for cheap instead of making a system call and consuming one fd table slot (note that fds are generally limited, often as low as 1024 per process).

The overhead of accessing the Arc in the uncloned case can be avoided by using an enum where one variant is the plain fd and the other is an Arc.

Also, this allows to implement a clone() that never fails.

s1.write([2]);

p.recv();
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent tests.

@derekchiang
Copy link
Contributor

What's the status of this?

@alexcrichton
Copy link
Member Author

I'm going to take some time to implement @bill-myers's suggestion of an Arc, and then I'll merge.

This is part of the overall strategy I would like to take when approaching
issue rust-lang#11165. The only two I/O objects that reasonably want to be "split" are
the network stream objects. Everything else can be "split" by just creating
another version.

The initial idea I had was the literally split the object into a reader and a
writer half, but that would just introduce lots of clutter with extra interfaces
that were a little unnnecssary, or it would return a ~Reader and a ~Writer which
means you couldn't access things like the remote peer name or local socket name.

The solution I found to be nicer was to just clone the stream itself. The clone
is just a clone of the handle, nothing fancy going on at the kernel level.
Conceptually I found this very easy to wrap my head around (everything else
supports clone()), and it solved the "split" problem at the same time.

The cloning support is pretty specific per platform/lib combination:

* native/win32 - uses some specific WSA apis to clone the SOCKET handle
* native/unix - uses dup() to get another file descriptor
* green/all - This is where things get interesting. When we support full clones
              of a handle, this implies that we're allowing simultaneous writes
              and reads to happen. It turns out that libuv doesn't support two
              simultaneous reads or writes of the same object. It does support
              *one* read and *one* write at the same time, however. Some extra
              infrastructure was added to just block concurrent writers/readers
              until the previous read/write operation was completed.

I've added tests to the tcp/unix modules to make sure that this functionality is
supported everywhere.
bors added a commit that referenced this pull request Feb 5, 2014
This is part of the overall strategy I would like to take when approaching
issue #11165. The only two I/O objects that reasonably want to be "split" are
the network stream objects. Everything else can be "split" by just creating
another version.

The initial idea I had was the literally split the object into a reader and a
writer half, but that would just introduce lots of clutter with extra interfaces
that were a little unnnecssary, or it would return a ~Reader and a ~Writer which
means you couldn't access things like the remote peer name or local socket name.

The solution I found to be nicer was to just clone the stream itself. The clone
is just a clone of the handle, nothing fancy going on at the kernel level.
Conceptually I found this very easy to wrap my head around (everything else
supports clone()), and it solved the "split" problem at the same time.

The cloning support is pretty specific per platform/lib combination:

* native/win32 - uses some specific WSA apis to clone the SOCKET handle
* native/unix - uses dup() to get another file descriptor
* green/all - This is where things get interesting. When we support full clones
              of a handle, this implies that we're allowing simultaneous writes
              and reads to happen. It turns out that libuv doesn't support two
              simultaneous reads or writes of the same object. It does support
              *one* read and *one* write at the same time, however. Some extra
              infrastructure was added to just block concurrent writers/readers
              until the previous read/write operation was completed.

I've added tests to the tcp/unix modules to make sure that this functionality is
supported everywhere.
@bors bors closed this Feb 5, 2014
@bors bors merged commit 56080c4 into rust-lang:master Feb 5, 2014
@alexcrichton alexcrichton deleted the io-clone branch February 5, 2014 22:06
@dbrodie
Copy link

dbrodie commented Mar 30, 2014

Just out of curiosity, is there a reason why this isn't implemented with the shutdown system call to implement half-open connection?

@alexcrichton
Copy link
Member Author

See #12855 for some more details, but in short this is understood as having some more drawbacks. We're considering adding helper methods for these related functions, but we want to make sure that it's an interface that we're willing to support in the future as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants