Only one HTTP2 Ping is sent with keep alive enabled

I believe there is a bug in the HTTP2 keep alive implementation, more concretely in `src/proto/h2/ping.rs`.

This issue was originally detected in https://github.com/hyperium/tonic/issues/474, credits to @alce.

It seems that when enabling HTTP2 server keep alive, only the first Ping is sent after the configured interval.
Afterwards, no more Pings seem to be sent.

## How to reproduce

I'm applying the following changes to the `echo` example:

```diff
--- a/examples/echo.rs
+++ b/examples/echo.rs
@@ -14,7 +14,10 @@ async fn echo(req: Request<Body>) -> Result<Response<Body>, hyper::Er>
         ))),
 
         // Simply echo the body back to the client.
-        (&Method::POST, "/echo") => Ok(Response::new(req.into_body())),
+        (&Method::POST, "/echo") => {
+           tokio::time::delay_for(std::time::Duration::from_secs(60)).await;
+           Ok(Response::new(req.into_body()))
+       },
 
         // Convert to uppercase before sending back to client using a stream.
         (&Method::POST, "/echo/uppercase") => {
@@ -55,7 +58,10 @@ async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>
 
     let service = make_service_fn(|_| async { Ok::<_, hyper::Error>(service_fn(echo)) }>
 
-    let server = Server::bind(&addr).serve(service);
+    let server = Server::bind(&addr)
+       .http2_keep_alive_interval(std::time::Duration::from_secs(10))
+       .http2_keep_alive_timeout(std::time::Duration::from_secs(5))
+       .serve(service);
 
     println!("Listening on http://{}", addr);
```

Running the example with `cargo run --example echo`, sniffing with `tcpdump -i lo port 3000`, and making a request with `curl -v --http2-prior-knowledge http://localhost:3000/echo --data "foo"`, I get the following:

```tcpdump
20:03:21.621105 IP localhost.hbci > localhost.46506: Flags [P.], seq 22:44, ack 160, win 512, options [nop,nop,TS val 2305866698 ecr 2305866696], length 22
20:03:21.621120 IP localhost.46506 > localhost.hbci: Flags [.], ack 44, win 512, options [nop,nop,TS val 2305866698 ecr 2305866698], length 0

20:03:31.623825 IP localhost.hbci > localhost.46506: Flags [P.], seq 44:61, ack 160, win 512, options [nop,nop,TS val 2305876700 ecr 2305866698], length 17
20:03:31.623854 IP localhost.46506 > localhost.hbci: Flags [.], ack 61, win 512, options [nop,nop,TS val 2305876700 ecr 2305876700], length 0
20:03:31.623926 IP localhost.46506 > localhost.hbci: Flags [P.], seq 160:177, ack 61, win 512, options [nop,nop,TS val 2305876701 ecr 2305876700], length 17
20:03:31.623948 IP localhost.hbci > localhost.46506: Flags [.], ack 177, win 512, options [nop,nop,TS val 2305876701 ecr 2305876701], length 0

20:04:21.624549 IP localhost.hbci > localhost.46506: Flags [P.], seq 61:111, ack 177, win 512, options [nop,nop,TS val 2305926701 ecr 2305876701], length 50
20:04:21.624593 IP localhost.46506 > localhost.hbci: Flags [.], ack 111, win 512, options [nop,nop,TS val 2305926701 ecr 2305926701], length 0
20:04:21.625177 IP localhost.46506 > localhost.hbci: Flags [F.], seq 177, ack 111, win 512, options [nop,nop,TS val 2305926702 ecr 2305926701], length 0
20:04:21.625825 IP localhost.hbci > localhost.46506: Flags [F.], seq 111, ack 178, win 512, options [nop,nop,TS val 2305926702 ecr 2305926702], length 0
20:04:21.625858 IP localhost.46506 > localhost.hbci: Flags [.], ack 112, win 512, options [nop,nop,TS val 2305926702 ecr 2305926702], length 0
```

The **first batch** of packets belongs to the start of the request, the **second batch** is the first Ping+Pong, the **third batch** is the end of the request. As you can see the keep alive interval of 10seconds separates the first two batches. However no more pings occur between the second and last batch, as I would expect.

After taking a look at `src/proto/h2/ping.rs` and debugging a bit, it seems that once `KeepAlive` enters the state `KeepAliveState::PingSent` it does not go back to `KeepAliveState::Scheduled`, even after a Pong is received.

A possible fix would be to check if `ping_sent_at` has been cleared when the `KeepAlive::scheduled()` method is called:

```diff
--- a/src/proto/h2/ping.rs
+++ b/src/proto/h2/ping.rs
@@ -442,7 +442,16 @@ impl KeepAlive {
                 let interval = shared.last_read_at() + self.interval;
                 self.timer.reset(interval);
             }
-            KeepAliveState::Scheduled | KeepAliveState::PingSent => (),
+            KeepAliveState::PingSent => {
+                if shared.is_ping_sent() {
+                    return;
+                }
+
+                self.state = KeepAliveState::Scheduled;
+                let interval = shared.last_read_at() + self.interval;
+                self.timer.reset(interval);
+            }
+            KeepAliveState::Scheduled => (),
         }
     }
```

I'm not sure if this was the intended behavior of the original implementation.
I'll be happy to submit a PR if agreed.

Thanks in advance and keep up the good work!


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Only one HTTP2 Ping is sent with keep alive enabled #2310

How to reproduce

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Only one HTTP2 Ping is sent with keep alive enabled #2310

Description

How to reproduce

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions