# GEP-3139: GRPCRoute Timeouts

* Issue: [#3139](https://github.com/kubernetes-sigs/gateway-api/issues/3139)
* Status: Implementable

(See status definitions [here](/geps/overview/#gep-states).)
## TLDR

Similar to HTTPRoute Timeouts (GEP-1742), the goal of this GEP is to create a design for implementing GRPCRoute timeouts.
## Goals

@arkodg (the original requester of this experimental feature) listed the following in the discussion, which is a good starting point for the GRPCRoute timeouts API:

- The ability to set a request timeout for unary RPCs
- The ability to disable timeouts (set to `0s`) for streaming RPCs
## Non-Goals

Creating a design for bidirectional streaming. Although this would be very useful, I propose that we leave it for a further iteration and use this GEP only to lay the groundwork for that discussion. Furthermore, we should look into streaming for HTTP and update GEP-1742 as well.
## Introduction

This GEP intends to find common timeouts that we can build into the Gateway API for GRPCRoute.

gRPC has the following four cases:
- Unary (single request, single response)
- Client streaming (the client sends a stream of messages, the server replies with a single response)
- Server streaming (the client sends a single request, the server replies with a stream)
- Bidirectional streaming

For this initial design, however, we'll focus on unary connections, and provide room for discussion on defining streaming semantics for HTTP, gRPC, etc. in a future iteration.
Most implementations have a proxy for gRPC, as listed in the table here. From the table, implementations rely on Envoy, Nginx, F5 BIG-IP, Pipy, HAProxy, LiteSpeed, or Traefik as the proxy in their dataplane.
For the sake of brevity, the flow of timeouts is shown in a generic flow diagram (the same diagram as in [GEP-1742](https://gateway-api.sigs.k8s.io/geps/gep-1742/#flow-diagrams-with-available-timeouts)):
```mermaid
sequenceDiagram
    participant C as Client
    participant P as Proxy
    participant U as Upstream
    C->>P: Connection Started
    C->>P: Starts sending Request
    C->>P: Finishes Headers
    C->>P: Finishes request
    P->>U: Connection Started
    P->>U: Starts sending Request
    P->>U: Finishes Headers
    P->>U: Finishes request
    U->>P: Starts Response
    U->>P: Finishes Headers
    U->>P: Finishes Response
    P->>C: Starts Response
    P->>C: Finishes Headers
    P->>C: Finishes Response
    Note right of P: Repeat if connection sharing
    U->>C: Connection ended
```

Some differences from HTTPRoute timeouts:

As noted by [@gnossen](https://github.com/kubernetes-sigs/gateway-api/discussions/3103#discussioncomment-9732739), a naive request-timeout implementation never fires for a bidirectional stream: the timer only starts once the request stream is finished (half-closed), and a bidirectional stream never reaches that state, so the timer is never started. Envoy uses the `grpc_timeout_header_max` config in order to start the timer when the first request message is initiated.
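As an illustration of that Envoy knob, a minimal route sketch might look like the following (the route match and the cluster name `grpc-backend` are placeholders, not part of this GEP; see Envoy's `RouteAction` documentation for the authoritative shape):

```yaml
routes:
- match:
    prefix: "/"
  route:
    cluster: grpc-backend   # placeholder cluster name
    max_stream_duration:
      # Caps the timeout taken from the client's grpc-timeout header and
      # times the whole stream rather than starting at request half-close.
      grpc_timeout_header_max: 10s
```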

Nginx uses `grpc_<>_timeout` directives to configure gRPC timeouts, which apply between the proxy and the upstream (`grpc_connect_timeout`, `grpc_send_timeout`, `grpc_read_timeout`).
## API

The majority of the proxy implementations used in dataplanes have some way to configure gRPC timeouts.
### Timeout Values

To remain consistent with HTTPRoute's timeouts, there will be the same `timeouts.request` and `timeouts.backendRequest` fields that can be configured. There is also a `timeouts.streamingRequest` field to capture the ability to disable timeouts for streaming RPCs.

Unary RPC
Remaining consistent with HTTPRoute's timeout values:
- `timeouts.request`
  The timeout for the Gateway API implementation to send a response to a client's gRPC request. The timer should start when the connection is started, since this will ideally align with the streaming option. This field is optional, with Extended support.
- `timeouts.backendRequest`
  The timeout for a single request from the gateway to the upstream. This field is optional, with Extended support.

Disabling timeouts for streaming RPC:
- `timeouts.streamingRequest`
  The timeout value for streaming. Currently, only the value `0s` will be allowed, but the field is left as a string to allow for future work around bidirectional streaming timers. This field is optional, with Extended support.

GO
```go
type GRPCRouteRule struct {
	// Timeouts defines the timeouts that can be configured for a GRPC request.
	//
	// Support: Extended
	//
	// +optional
	// <gateway:experimental>
	Timeouts *GRPCRouteTimeouts `json:"timeouts,omitempty"`

	// ...
}

// GRPCRouteTimeouts defines timeouts that can be configured for a GRPCRoute.
// Timeout values are represented with Gateway API Duration formatting.
// Specifying a zero value such as "0s" is interpreted as no timeout.
//
// +kubebuilder:validation:XValidation:message="backendRequest timeout cannot be longer than request timeout",rule="!(has(self.request) && has(self.backendRequest) && duration(self.request) != duration('0s') && duration(self.backendRequest) > duration(self.request))"
type GRPCRouteTimeouts struct {
	// Request specifies the maximum duration for a gateway to respond to a GRPC request.
	//
	// If the gateway has not been able to respond before this deadline is met, the gateway
	// MUST return a timeout error.
	//
	// For example, setting the `rules.timeouts.request` field to the value `10s` in a
	// `GRPCRoute` will cause a timeout if a client request is taking longer than 10 seconds
	// to complete.
	//
	// This timeout is intended to cover as close to the whole request-response transaction
	// as possible, although an implementation MAY choose to start the timeout after the entire
	// request stream has been received instead of immediately after the transaction is
	// initiated by the client.
	//
	// When this field is unspecified, request timeout behavior is implementation-specific.
	//
	// Support: Extended
	//
	// +optional
	Request *Duration `json:"request,omitempty"`

	// BackendRequest specifies a timeout for an individual request from the gateway
	// to a backend. This covers the time from when the request first starts being
	// sent from the gateway to when the full response has been received from the backend.
	//
	// An entire client GRPC transaction with a gateway, covered by the Request timeout,
	// may result in more than one call from the gateway to the destination backend,
	// for example, if automatic retries are supported.
	//
	// Because the Request timeout encompasses the BackendRequest timeout, the value of
	// BackendRequest must be <= the value of the Request timeout.
	//
	// Support: Extended
	//
	// +optional
	BackendRequest *Duration `json:"backendRequest,omitempty"`

	// StreamingRequest specifies the ability to disable timeouts for bidirectional
	// streaming. The only supported setting is `0s`, so users can disable timeouts
	// for streaming.
	//
	// Support: Extended
	//
	// +optional
	StreamingRequest *Duration `json:"streamingRequest,omitempty"`
}

// Duration is a string value representing a duration in time. The format is as specified
// in GEP-2257, a strict subset of the syntax parsed by Golang time.ParseDuration.
//
// +kubebuilder:validation:Pattern=`^([0-9]{1,5}(h|m|s|ms)){1,4}$`
type Duration string
```
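As a quick illustration of the `+kubebuilder:validation:Pattern` above, the GEP-2257 duration format can be checked with the same regular expression in plain Go (the sample values are ours):

```go
package main

import (
	"fmt"
	"regexp"
)

// gep2257 is the same pattern used in the kubebuilder validation above:
// one to four groups, each of one to five digits followed by h, m, s, or ms.
var gep2257 = regexp.MustCompile(`^([0-9]{1,5}(h|m|s|ms)){1,4}$`)

func main() {
	for _, v := range []string{"0s", "10s", "1h30m", "2s500ms", "10", "-5s"} {
		fmt.Printf("%q valid=%v\n", v, gep2257.MatchString(v))
	}
}
```

Note that `"0s"` is valid by the pattern; its "no timeout" meaning is a semantic rule layered on top, not part of the format itself.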
YAML
```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: GRPCRoute
metadata:
  name: timeout-example
spec:
  ...
  rules:
  - backendRefs:
    - name: some-service
      port: 8080
    timeouts:
      request: 10s
      backendRequest: 2s
      streamingRequest: 0s
```
## Conformance Details

The feature name for this feature is `GRPCRouteTimeout`, and its support is Extended.
Gateway implementations can indicate support for this feature using the following:
- `GRPCRouteRequestTimeout`
- `GRPCRouteRequestBackendTimeout`
- `GRPCRouteStreamingRequestTimeout`

## Alternatives

## References
apiVersion: internal.gateway.networking.k8s.io/v1alpha1
kind: GEPDetails
number: 3139
name: GRPCRoute Timeouts
status: Implementable
authors:
  - xtine
relationships:
  extendedBy:
    - number: 2257
      name: Gateway API Duration Format
      description: Adds a duration format for use in timeouts.