[1.17.0] Context deadline exceeded (timeout) cloning large repo #20680

Closed
@parnic

Description

Starting with v1.17.0, clones of one of our large repos (70+ GB, using LFS) fail on some connections.

When the clone fails, the client reports some variation of:

error: 24576 bytes of body are still expected0 MiB | 1.83 MiB/s
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output

and the server logs show, e.g.:

2022/08/05 03:27:25 ...ers/web/repo/http.go:484:serviceRPC() [E] [62ec8cb5-3] Fail to serve RPC(upload-pack) in /mnt/gitea/data/repositories/org/repo.git: context deadline exceeded -

I tracked this down to commit 35fdefc (for #18363): serviceRPC() is no longer allowed to run indefinitely and now carries a context with a fixed, non-configurable default timeout of 360 seconds. Once I found this, I confirmed that our clones were dying around the 6-minute mark.

edit: collapsed this workaround because a PR is up with a different approach

We are running a custom local build with this quick patch, which lets us set an extremely large timeout to work around the issue:

diff --git a/modules/setting/setting.go b/modules/setting/setting.go
index f5e624946..585c4589c 100644
--- a/modules/setting/setting.go
+++ b/modules/setting/setting.go
@@ -125,6 +125,7 @@ var (
        StartupTimeout       time.Duration
        PerWriteTimeout      = 30 * time.Second
        PerWritePerKbTimeout = 10 * time.Second
+       ServiceRpcTimeout    time.Duration
        StaticURLPrefix      string
        AbsoluteAssetURL     string

@@ -723,6 +724,7 @@ func loadFromConf(allowEmpty bool, extraConfig string) {
        StartupTimeout = sec.Key("STARTUP_TIMEOUT").MustDuration(0 * time.Second)
        PerWriteTimeout = sec.Key("PER_WRITE_TIMEOUT").MustDuration(PerWriteTimeout)
        PerWritePerKbTimeout = sec.Key("PER_WRITE_PER_KB_TIMEOUT").MustDuration(PerWritePerKbTimeout)
+       ServiceRpcTimeout = sec.Key("SERVICE_RPC_TIMEOUT").MustDuration(ServiceRpcTimeout)

        defaultAppURL := string(Protocol) + "://" + Domain
        if (Protocol == HTTP && HTTPPort != "80") || (Protocol == HTTPS && HTTPPort != "443") {
diff --git a/routers/web/repo/http.go b/routers/web/repo/http.go
index 6a85bca16..82835cb62 100644
--- a/routers/web/repo/http.go
+++ b/routers/web/repo/http.go
@@ -474,11 +474,12 @@ func serviceRPC(ctx gocontext.Context, h serviceHandler, service string) {
        cmd := git.NewCommand(h.r.Context(), service, "--stateless-rpc", h.dir)
        cmd.SetDescription(fmt.Sprintf("%s %s %s [repo_path: %s]", git.GitExecutable, service, "--stateless-rpc", h.dir))
        if err := cmd.Run(&git.RunOpts{
-               Dir:    h.dir,
-               Env:    append(os.Environ(), h.environ...),
-               Stdout: h.w,
-               Stdin:  reqBody,
-               Stderr: &stderr,
+               Dir:     h.dir,
+               Env:     append(os.Environ(), h.environ...),
+               Stdout:  h.w,
+               Stdin:   reqBody,
+               Stderr:  &stderr,
+               Timeout: setting.ServiceRpcTimeout,
        }); err != nil {
                if err.Error() != "signal: killed" {
                        log.Error("Fail to serve RPC(%s) in %s: %v - %s", service, h.dir, err, stderr.String())

I can make a PR out of this if desired, but I would rather this go back to the old behavior than add a new 6-minute timeout footgun. Even if targeted logging were added to point people at the cause of the failure, admins would still need to find this config variable and change it for any decently large repo. 1.16.x's behavior was much preferred.

Gitea Version

1.17.0

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

No response

Screenshots

No response

Git Version

2.37.1

Operating System

Ubuntu 20.04 aarch64

How are you running Gitea?

Linux arm64 release on an AWS instance

Database

PostgreSQL
