
Connection stuck when "keep alive" is used #1439

Closed
@idubrov


I have a simple client that sends requests in chunks of 5, and after a while the client gets stuck.

The client looks like this:

  let mut core = tokio_core::reactor::Core::new().unwrap();
  let handle = core.handle();
  let items = ... // long list of Hyper requests
  let client = Client::new(&handle);
  let requests = stream::iter_result(items)
    .chunks(concurrency)
    .for_each(|reqs| {
      info!("Running {} tasks ...", concurrency);
      let vec = reqs
        .into_iter()
        .map(|(path, request)| process_request(&client, &path, request))
        .collect::<Vec<_>>();
      future::join_all(vec).map(|_results| ())
    });

  core.run(requests).unwrap();

The debugger shows that it is waiting for a mio event (blocked in mio::sys::unix::kqueue::Selector::select):

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x0000000106b7bd96 libsystem_kernel.dylib`kevent + 10
    frame #1: 0x0000000104f94c03 piston`mio::sys::unix::kqueue::Selector::select::h5908d164b552abc8(self=0x0000000106e33020, evts=0x00007fff5bca9f48, awakener=(__0 = 18446744073709551615), timeout=<unavailable>) at kqueue.rs:84
    frame #2: 0x0000000104f8e2fa piston`mio::poll::Poll::poll2::h47cfb2893a485481(self=0x0000000106e33020, events=0x00007fff5bca9f48, timeout=Option<std::time::duration::Duration> @ 0x00007fff5bca8ce8) at poll.rs:1161
    frame #3: 0x0000000104f8e026 piston`mio::poll::Poll::poll::h1c5a4889decc5479(self=0x0000000106e33020, events=0x00007fff5bca9f48, timeout=Option<std::time::duration::Duration> @ 0x00007fff5bca91a0) at poll.rs:1125
    frame #4: 0x0000000104f6a90b piston`tokio_core::reactor::Core::poll::h355d296d2c61bccc(self=0x00007fff5bca9f48, max_wait=Option<std::time::duration::Duration> @ 0x00007fff5bca9e18) at mod.rs:276
    frame #5: 0x0000000104354499 piston`tokio_core::reactor::Core::run::h3e432d0bd23cd991(self=0x00007fff5bca9f48, f=<unavailable>) at mod.rs:241
    frame #6: 0x000000010481faad piston`piston::submit::execute::h35469783a8e181fc(matches=0x0000000106e7c018) at submit.rs:231

Analyzing logs shows that the following events seem to be causing the issue:

// First, the token of interest receives a Readable | Hup event (the actual token number is 0, since the reactor token is x*2+2):
TRACE tokio_core::reactor               > event Readable | Hup Token(2)
// I added some debugging to tokio-core dispatch_io to print status of io.reader/io.writer, here you can see that reader task is not none:
DEBUG tokio_core::reactor > TKIO: is readable writer.is_none() == true, reader.is_none() == false
// Client writes some data to the connection (token 0 is the same as Token(2) above, as (2-2)/2 = 0)
DEBUG tokio_core::net::tcp > DBG: WRITE BUFS: (token 0) Ok(63817) <-- this one I added to AsyncWrite.write_buf
DEBUG hyper::proto::h1::io > flushed 63817 bytes
// Finally, this is where error is returned from mio (error code is 54, ECONNRESET)
// This is normal token, it works just fine (has more logging output after that line)
TRACE tokio_core::reactor > event Writable Token(6)
// reader task is present for this token
DEBUG tokio_core::reactor > TKIO: is writable writer.is_none() == true, reader.is_none() == false
// However, this token receives Error event and it is the last line in logs related to it.
TRACE tokio_core::reactor > event Readable | Writable | Error | Hup Token(2)
// No reader task to notify? -- reader.is_none() is true.
DEBUG tokio_core::reactor > TKIO: is writable writer.is_none() == true, reader.is_none() == true
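
The token arithmetic used in the log comments above can be sketched as follows. This is only an illustration of the "x*2+2" mapping as described in this issue; the function names are hypothetical and are not part of tokio-core's API.

```rust
/// Hypothetical helper: reactor token for the I/O object at `io_index`,
/// following the "x*2+2" relationship described in the log comments.
fn reactor_token_for(io_index: usize) -> usize {
    io_index * 2 + 2
}

/// Hypothetical helper: inverse of `reactor_token_for`.
/// Returns `None` for tokens that do not fit the "x*2+2" pattern
/// (assumption: such tokens are not I/O tokens).
fn io_index_for(reactor_token: usize) -> Option<usize> {
    if reactor_token >= 2 && reactor_token % 2 == 0 {
        Some((reactor_token - 2) / 2)
    } else {
        None
    }
}
```

Under this mapping, `Token(2)` in the reactor trace corresponds to "token 0" in the `WRITE BUFS` line, and `Token(6)` corresponds to I/O index 2.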

I am still researching it, but it does look like it is related to the "keep alive" feature.

It seems that Hyper does not handle the case where the remote side closes the socket around the time the connection is taken from the "idle" pool and reused for the next request.
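
To illustrate the suspected condition outside of Hyper, here is a minimal sketch using only the standard library. It checks whether the peer has already closed an idle connection before it is reused. The function name `idle_connection_closed` is hypothetical, and this is not how Hyper's connection pool is implemented; it only shows the kind of state the socket can be in at reuse time.

```rust
use std::io;
use std::net::TcpStream;

/// Hypothetical helper: returns Ok(true) if the peer has closed or reset
/// an idle connection, Ok(false) if it still appears to be open.
fn idle_connection_closed(stream: &TcpStream) -> io::Result<bool> {
    // Switch to non-blocking mode so the peek returns immediately.
    stream.set_nonblocking(true)?;
    let mut buf = [0u8; 1];
    let outcome = match stream.peek(&mut buf) {
        // A zero-length read means the peer sent FIN: the connection is closed.
        Ok(0) => Ok(true),
        // Data is pending; the connection is still open.
        Ok(_) => Ok(false),
        // Nothing to read yet; the connection is idle but open.
        Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => Ok(false),
        // The peer reset the connection (ECONNRESET).
        Err(ref e) if e.kind() == io::ErrorKind::ConnectionReset => Ok(true),
        Err(e) => Err(e),
    };
    // Restore blocking mode before handing the stream back.
    stream.set_nonblocking(false)?;
    outcome
}
```

Such a check is inherently racy: the remote side can still close the socket between the check and the next write, so a client would likely also need to handle the error on the reused connection (for example by retrying on a fresh one).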
