Skip to content

Only one HTTP2 Ping is sent with keep alive enabled #2310

Closed
differs/Legends
#5
@pdcalado

Description

@pdcalado

I believe there is a bug in the HTTP2 keep alive implementation, more concretely in src/proto/h2/ping.rs.

This issue was originally detected in hyperium/tonic#474, credits to @alce.

It seems that when enabling HTTP2 server keep alive, only the first Ping is sent after the configured interval.
Afterwards, no more Pings seem to be sent.

How to reproduce

I'm applying the following changes to the echo example:

--- a/examples/echo.rs
+++ b/examples/echo.rs
@@ -14,7 +14,10 @@ async fn echo(req: Request<Body>) -> Result<Response<Body>, hyper::Er>
         ))),
 
         // Simply echo the body back to the client.
-        (&Method::POST, "/echo") => Ok(Response::new(req.into_body())),
+        (&Method::POST, "/echo") => {
+           tokio::time::delay_for(std::time::Duration::from_secs(60)).await;
+           Ok(Response::new(req.into_body()))
+       },
 
         // Convert to uppercase before sending back to client using a stream.
         (&Method::POST, "/echo/uppercase") => {
@@ -55,7 +58,10 @@ async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>
 
     let service = make_service_fn(|_| async { Ok::<_, hyper::Error>(service_fn(echo)) }>
 
-    let server = Server::bind(&addr).serve(service);
+    let server = Server::bind(&addr)
+       .http2_keep_alive_interval(std::time::Duration::from_secs(10))
+       .http2_keep_alive_timeout(std::time::Duration::from_secs(5))
+       .serve(service);
 
     println!("Listening on http://{}", addr);

Running the example with cargo run --example echo, sniffing with tcpdump -i lo port 3000, and making a request with curl -v --http2-prior-knowledge http://localhost:3000/echo --data "foo", I get the following:

20:03:21.621105 IP localhost.hbci > localhost.46506: Flags [P.], seq 22:44, ack 160, win 512, options [nop,nop,TS val 2305866698 ecr 2305866696], length 22
20:03:21.621120 IP localhost.46506 > localhost.hbci: Flags [.], ack 44, win 512, options [nop,nop,TS val 2305866698 ecr 2305866698], length 0

20:03:31.623825 IP localhost.hbci > localhost.46506: Flags [P.], seq 44:61, ack 160, win 512, options [nop,nop,TS val 2305876700 ecr 2305866698], length 17
20:03:31.623854 IP localhost.46506 > localhost.hbci: Flags [.], ack 61, win 512, options [nop,nop,TS val 2305876700 ecr 2305876700], length 0
20:03:31.623926 IP localhost.46506 > localhost.hbci: Flags [P.], seq 160:177, ack 61, win 512, options [nop,nop,TS val 2305876701 ecr 2305876700], length 17
20:03:31.623948 IP localhost.hbci > localhost.46506: Flags [.], ack 177, win 512, options [nop,nop,TS val 2305876701 ecr 2305876701], length 0

20:04:21.624549 IP localhost.hbci > localhost.46506: Flags [P.], seq 61:111, ack 177, win 512, options [nop,nop,TS val 2305926701 ecr 2305876701], length 50
20:04:21.624593 IP localhost.46506 > localhost.hbci: Flags [.], ack 111, win 512, options [nop,nop,TS val 2305926701 ecr 2305926701], length 0
20:04:21.625177 IP localhost.46506 > localhost.hbci: Flags [F.], seq 177, ack 111, win 512, options [nop,nop,TS val 2305926702 ecr 2305926701], length 0
20:04:21.625825 IP localhost.hbci > localhost.46506: Flags [F.], seq 111, ack 178, win 512, options [nop,nop,TS val 2305926702 ecr 2305926702], length 0
20:04:21.625858 IP localhost.46506 > localhost.hbci: Flags [.], ack 112, win 512, options [nop,nop,TS val 2305926702 ecr 2305926702], length 0

The first batch of packets belongs to the start of the request, the second batch is the first Ping+Pong, the third batch is the end of the request. As you can see the keep alive interval of 10seconds separates the first two batches. However no more pings occur between the second and last batch, as I would expect.

After taking a look at src/proto/h2/ping.rs and debugging a bit, it seems that once KeepAlive enters the state KeepAliveState::PingSent it does not go back to KeepAliveState::Scheduled, even after a Pong is received.

A possible fix would be to check if ping_sent_at has been cleared when the KeepAlive::scheduled() method is called:

--- a/src/proto/h2/ping.rs
+++ b/src/proto/h2/ping.rs
@@ -442,7 +442,16 @@ impl KeepAlive {
                 let interval = shared.last_read_at() + self.interval;
                 self.timer.reset(interval);
             }
-            KeepAliveState::Scheduled | KeepAliveState::PingSent => (),
+            KeepAliveState::PingSent => {
+                if shared.is_ping_sent() {
+                    return;
+                }
+
+                self.state = KeepAliveState::Scheduled;
+                let interval = shared.last_read_at() + self.interval;
+                self.timer.reset(interval);
+            }
+            KeepAliveState::Scheduled => (),
         }
     }

I'm not sure if this was the intended behavior of the original implementation.
I'll be happy to submit a PR if agreed.

Thanks in advance and keep up the good work!

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-http2Area: HTTP/2 specific.C-bugCategory: bug. Something is wrong. This is bad!

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions