Skip to content

Subsequent call to PingContext after context timeout errors with bad connection. #858

Closed
@arvenil

Description

@arvenil

Issue description

After first call to PingContext quits with context deadline exceeded the subsequent call will return bad connection.

With plenty of debug lines I've tracked down the issue a bit. Firstly, timing is really important here and e.g. with time.Millisecond I hit the error but with time.Nanosecond each call correctly returns context deadline exceeded. So please be aware that you may need to adjust those values to be able to reproduce the issue.

When I used time.Nanosecond the context deadline exceeded is returned by sql package before connection is taken from the pool https://github.com/golang/go/blob/ce58a39fca067a19c505220c0c907ccf32793427/src/database/sql/sql.go#L1086-L1091
However with time.Millisecond, it has enough time to pass those checks and timeout is reached in this library

mysql/connection.go

Lines 603 to 606 in 99ff426

select {
default:
case <-ctx.Done():
return ctx.Err()

As of now, the context deadline exceeded error is returned and connection is returned to the pool.
Second call to PingContext now picks up the previously used connection and tries to send data but fails and returns
return errBadConnNoWrite

I've tracked it a bit, but I'm still not sure about the reason. I was able to "fix" the issue by adding mc.cleanup() before returning context deadline exceeded.

// Ping implements driver.Pinger interface
func (mc *mysqlConn) Ping(ctx context.Context) (err error) {
	if mc.closed.IsSet() {
		errLog.Print(ErrInvalidConn)
		return driver.ErrBadConn
	}

	if err = mc.watchCancel(ctx); err != nil {
		mc.cleanup() // This line fixes the issue but it leaves bad connection in the pool
		return
	}
	defer mc.finish()

	if err = mc.writeCommandPacket(comPing); err != nil {
		return
	}

	return mc.readResultOK()
}

However this leads to two calls to DB.conn() instead of one as this bad connection was left in the pool https://github.com/golang/go/blob/ce58a39fca067a19c505220c0c907ccf32793427/src/database/sql/sql.go#L730-L731

Another hack I've tried was to return driver.ErrBadConn instead context deadline exceeded. This takes out the connection out of the pool (ctx timeout doesn't remove connection from the pool) but we loose information about the real error.

And of course this raises a question - if context timeouts should we assume we can't re-use that connection anymore?

Example code

package main

import (
	"context"
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "root@/")
	if err != nil {
		panic(err)
	}
	db.SetMaxOpenConns(1)
	db.SetMaxIdleConns(1)

	for {
		ctx := context.Background()
		ctx, cancel := context.WithTimeout(ctx, time.Millisecond)
		err := db.PingContext(ctx)
		if err != nil {
			log.Println(err)
		}
		cancel()
		time.Sleep(time.Second)
	}
}

Error log

# go run ./main.go
2018/09/21 16:08:15 context deadline exceeded
2018/09/21 16:08:16 bad connection
2018/09/21 16:08:17 context deadline exceeded
2018/09/21 16:08:18 bad connection
2018/09/21 16:08:19 context deadline exceeded
2018/09/21 16:08:20 bad connection

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions