Description
Issue description
This is similar to #1392 but much harder to reproduce and causes real failures.
#1392 describes the situation where the "new" ER_CLIENT_INTERACTION_TIMEOUT
error packet is received on an idle connection. However, if you're really unlucky then it's also possible for this packet to be received just as a query is being sent. This results in an ErrPktSync
("commands out of sync. You can't run this command now") error, as the sequence ID of the error packet doesn't correspond to the packet that was simultaneously written. We have seen this in at least two very busy services, and only when there are many connections hitting the wait_timeout
.
What's more, this doesn't mark the connection as bad and leaves it in a "busy buffer" state, which likely causes at least one further query to fail.
I don't have any great suggestions for fixing this. Perhaps the driver could recognise that it's an error packet with a zero sequence ID, and return ErrBadConn
in that case. At the very least, it should be able to expose the error rather than giving this "commands out of sync" error.
For any readers, we fixed this by just decreasing the application level connection lifetime to be much less than wait_timeout
.
Example code
Use https://github.com/lukaszlach/docker-tc to apply a network delay to the MySQL container:
docker run -d \
--name docker-tc \
--network host \
--cap-add NET_ADMIN \
--restart always \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /var/docker-tc:/var/docker-tc \
lukaszlach/docker-tc
Start the MySQL container in a network, with a 1 second network delay applied:
docker network create test-net
docker run \
-e MYSQL_ROOT_PASSWORD=password \
--label "com.docker-tc.enabled=1" \
--label "com.docker-tc.delay=1000ms" \
-p 3306:3306 \
--net test-net \
mysql:8.0.24
Then this will reproduce it fairly consistently:
package main
import (
"database/sql"
"fmt"
"time"
"github.com/go-sql-driver/mysql"
)
func main() {
mysql.SetLogger(logger{})
cfg := mysql.NewConfig()
cfg.User = "root"
cfg.Passwd = "password"
cfg.Net = "tcp"
cfg.Addr = "127.0.0.1"
cfg.DBName = "mysql"
db, err := sql.Open("mysql", cfg.FormatDSN())
if err != nil {
panic(err)
}
defer db.Close()
db.SetMaxOpenConns(1)
if _, err := db.Exec("SET GLOBAL wait_timeout = 2"); err != nil {
panic(err)
}
time.Sleep(1500 * time.Millisecond)
rows, err := db.Query("SHOW TABLES")
if err != nil {
fmt.Println("query failed:", err)
} else {
rows.Close()
}
}
type logger struct{}
func (logger) Print(v ...interface{}) {
fmt.Println(append([]interface{}{"mysql: "}, v...)...)
}
Error log
mysql: busy buffer
panic: commands out of sync. You can't run this command now
Configuration
Driver version (or git SHA): 1.7.0
Go version: 1.19.5
Server version: MySQL 8.0.24
Server OS: E.g. Debian 8.1 (Jessie), Windows 10