-
Notifications
You must be signed in to change notification settings - Fork 1.3k
feat(frontend): propagate upstream and CG headers to clients #63461
base: main
Are you sure you want to change the base?
Conversation
|
||
// ResponseMetadataCapture holds metadata from an HTTP response, including headers, | ||
// status code, and a function to send completion events. | ||
type ResponseMetadataCapture struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'm not a native English speaker, but it feels like Capture is not the right word to use here - maybe consider using Captor? Or, Forwarder (since what it does is forward response events and metadata)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I like Forwarder
. Will rename it if we decide to keep this struct around.
} | ||
|
||
if rc.statusCode != nil { | ||
w.WriteHeader(*rc.statusCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to rewrite upstream codes 1:1? Is this something we did before, or is this a new change introduced in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this looks a little suspect. At the very least, we should map all 5xx
errors we receive from the upstream LLM API provider to a 502
error. So it is clear that the server-side problem didn't originate from the Sourcegraph instance. But passing through 401 and 403 errors would cause similar confusion as well.
So if we do need to send the upstream status code to the client, I would urge us to consider something more robust. Like maybe even adding it as part of a header in the HTTP response or JSON response payload. Just so that it is clear what the "Sourcegraph instance" status code was, as distinct from the "upstream LLM provider" status code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is new and optional for the primary purpose of this PR. My rationale for adding this was:
- We already handle the upstream inference provider status code on the Cody Gateway side here.
- We already use the Cody Gateway API with these status codes directly for fast path.
- If the above is true, then forwarding the status code from CG to frontend to clients would mean a more consistent API contract (exact status code handling for fast path and nonfast path requests) with no new issues (because we already expose these status codes for fast path requests).
Am I missing something here? Based on your suggestions, I'm happy to remove or adjust it, but I want to calibrate my understanding first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rafal's comment applies here too. Will update ✅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR: Sounds like we are all on the same page. This is good logic to add, we just need to be a little careful is all.
To reiterate some of the things mentioned in those other threads and this one:
- We cannot copy HTTP headers indiscriminately, because it might (1) leak information we shouldn't, e.g.
CORS
, or (2) communicate unintended behavior, e.g.Connection: Close
- We simply cannot copy HTTP status codes indiscriminately for similar reasons. e.g. a
401 Unauthorized
response could either mean the Client->Sourcegraph credentials were missing or malformed, or the Sourcegraph->CodyGateway credentials were missing or malformed.
This doesn't mean that we cannot or shouldn't do this "forwarding" of headers, just that we need to be clear about what we forward and why.
So IMO, it might make sense to have an allow list of HTTP headers that we forward on in the response. (Rather than a deny list to pick out the ones we know we shouldn't, since that could send data we don't know ahead of time is problematic.)
Also, having a set of HTTP status codes we remap. e.g. how Cody Gateway maps 429->503. In the Sourcegraph backend, mapping 500-502 might make sense as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm totally onboard with us fixing this problem. And I think this is a good approach. But I don't think this is quite right in terms of how we solve things.
The ResponseMetadataCapture
design here doesn't feel quite right. The HTTP status code and headers are sent available initially, and then as new SSE events come in those are written out. So it seems awkward to have the ResponseMetadataCapture
carry those around for the "lifetime of the streaming operation".
Did you consider updating the interface for types.CompletionClient
directly? e.g. something like:
type CompletionsClient interface {
// Stream executions a completions request, streaming results to the callback.
// Callers should check for ErrStatusNotOK and handle the error appropriately.
+ //
+ // The returned response object is the _LIVE_ HTTP response from the upstream
+ // provider. So do not read from or close the response body, or else you're
+ // going to have a bad time.
- Stream(context.Context, log.Logger, CompletionRequest, SendCompletionEvent) error
+ Stream(context.Context, log.Logger, CompletionRequest, SendCompletionEvent) (*http.Response, error)
// Complete executions a completions request until done. Callers should check
// for ErrStatusNotOK and handle the error appropriately.
Complete(context.Context, log.Logger, CompletionRequest) (*CompletionResponse, error)
}
The thing I like about that approach -- or something like the function just returning (*types.StatusAndHeaders, error)
-- is that the status and header codes are only available when the call to Stream(...)
is first made. And so the data is available only when the caller needs to start writing its own http.Response
(and forwarding the upstream headers). But not unnecessarily added as part of the ongoing sendEventFn
.
What do you think?
func (rc *ResponseMetadataCapture) ApplyCapturedMetadata(w http.ResponseWriter) { | ||
for key, values := range rc.headers { | ||
for _, value := range values { | ||
w.Header().Add(key, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit scared that the upstream might return CORS headers or other security-related headers. Can we allow-list certain headers that are interesting to us? Also, we need control over certain headers like content-type to trigger SSE, that might be different on some upstream providers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same reasoning applies here. Because we go through Cody Gateway, these filters are already applied. I verified this locally by adding Fireworks-Server-Time-To-First-Token
to hopHeaders
. Then I observed that it disappeared from the headers forwarded by frontend to the VS Code client.
I consider Cody Gateway to be the owner of filtering the inference provider headers. That's why it should be safe to forward all of them here. If there's an issue with that, should we handle it on the CG side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that hopHeaders can be added by any layer between CG and SG (Enterprise networking proxies, public internet, Cloudflare, GCP LB, Cloud Run routing (however it works)), and forwarding ex. Connection: Close
will kill all requests sharing a Cody Client -> SG connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, makes sense. And the same applies to status codes. Will add filtering on the SG level.
Stream(context.Context, log.Logger, CompletionRequest, SendCompletionEvent) error | ||
Stream(context.Context, log.Logger, CompletionRequest, *ResponseMetadataCapture) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, @chrsmith!
the function just returning
(*types.StatusAndHeaders, error)
I like this one! Will update 👍
Stream(context.Context, log.Logger, CompletionRequest, SendCompletionEvent) (*http.Response, error)
I didn't want to return (*http.Response, error)
because it could allow manipulation of the underlying response struct. I assumed that we intend to have the consumer receive data exclusively through the SendCompletionEvent
callback, and it's nice to force it through the function interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the function just returning
(*types.StatusAndHeaders, error)
I started implementing it locally and found another constraint that forced me to do that by providing an object to capture response metadata. We need to receive *types.StatusAndHeaders
before or with the first completion event to inform the client about the response status and headers. With the current implementation, we return after the stream is completed or when we encounter an error.
To achieve the same behavior, we need to reimplement the control flow of the Stream
function so that every completion provider can return early. Or I can use my initial approach, which is more compatible with the current implementation where we use a sendEvent
callback to yield to the consumer as soon as data is available.
That's my understanding as a non-frequent Go user 🙂. Let me know if you see a better way!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this @valerybugakov . Makes sense.
I didn't want to return (*http.Response, error) because it could allow manipulation of the underlying response struct.
Yup, that's a good reason not to do that.
We need to receive *types.StatusAndHeaders before or with the first completion event to inform the client about the response status and headers.
Ah yeah, I hadn't considered that case. So yes, we should have the headers and status code go into whatever that "new data type/parameter" is.
But the design still seems a little clunky IMHO, because it's trying to do two different things.
- Cache theupstream HTTP response information.
- Wrap the
SendEvent
callback so the caller will get events as they are sent from upstream.
However, thinking about some alternatives, nothing stands out as a good enough improvement. And what you have here is perfectly reasonable. So 🚀 . I'm happy to sign off on this once make a decision regarding the "which headers to forward" and/or exactly how we map status codes.
Test plan
Changelog