Description
We use Redis in cluster mode for sharded pub/sub (3 masters and 3 replicas, running in a Kubernetes cluster).
We have the following args for the clients:
const clusterArgs = {
  rootNodes: [
    {
      url: `redis://${REDIS_CLUSTER_PUBSUB_HOST}:${redisPort}`,
    },
  ],
  defaults: {
    username: REDIS_CLUSTER_PUBSUB_NAME,
    password: REDIS_CLUSTER_PUBSUB_PASS,
    socket: {
      reconnectStrategy(retries: number) {
        if (retries >= 10) {
          console.error(
            `lost connection to redis cluster-pubsub cluster: tried ${retries} times`
          );
        } else {
          console.warn(
            `retrying redis cluster-pubsub cluster connection: tried ${retries} times`
          );
        }
        // reconnect after
        return Math.min(retries * 200, 2000);
      },
      connectTimeout: 10000,
      keepAlive: 60000,
    },
  },
};
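To make the intended retry schedule concrete, here is a small sketch that pulls the delay formula out of the reconnectStrategy above into a pure function and lists the delays it yields. The name reconnectDelayMs is hypothetical, and the starting value of retries (0 here) is an assumption about how the client counts attempts.

```typescript
// Hypothetical helper: the same formula as the reconnectStrategy above,
// extracted so the schedule can be inspected without a Redis connection.
function reconnectDelayMs(retries: number): number {
  // Linear backoff of 200 ms per attempt, capped at 2000 ms.
  return Math.min(retries * 200, 2000);
}

// Delays for retries = 0..11 (assumes counting starts at 0).
const delays = Array.from({ length: 12 }, (_, retries) => reconnectDelayMs(retries));
console.log(delays.join(", "));
// prints "0, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000, 2000"
```

Because the function always returns a number and never an Error, the strategy as written asks for retries indefinitely; the "lost connection" branch only changes the log level.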
We then create the client(s) like this:
const client = createCluster(clusterArgs);
await client.connect();
client.on('error', (err) => {
  console.error(`[PUB-SUB ERROR]: ${err}`);
});
Sometimes our Redis pub/sub cluster goes down (e.g. for maintenance or a version upgrade, since we run it in Kubernetes), and we receive the following error:
Error: Socket closed unexpectedly
We correctly log the error by catching it in the error handler, but the client never seems to retry or reconnect. The only way I can get a reconnect to happen is to keep restarting the process until the connection succeeds.
Also, if the process tries to issue a command, we sometimes get an internal error that kills the process with an uncaught Node.js exception, even though I've added the client.on('error') handler above.
I followed the findings from #2120 and #2302, but those don't really seem to solve our problems.
What I'd like is to be able to specify a reconnect strategy so that the client keeps retrying (according to the reconnectStrategy) if we lose our TLS connection or fail to talk to a node in the cluster. I'd also like messages to be queued while we're offline, instead of an error being thrown that takes down the process.
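To illustrate the queueing behaviour requested above, here is a minimal application-level sketch. BufferedPublisher, PublishFn, and the maxPending limit are all hypothetical names, not part of node-redis; the publish function is injected so the sketch makes no assumption about which client method or events would be wired in.

```typescript
// Hypothetical helper, not a node-redis API: buffers messages while offline.
type PublishFn = (channel: string, message: string) => Promise<unknown>;

interface PendingMessage {
  channel: string;
  message: string;
}

class BufferedPublisher {
  private readonly publish: PublishFn;
  private readonly maxPending: number;
  private readonly pending: PendingMessage[] = [];
  private online = false;

  constructor(publish: PublishFn, maxPending = 1000) {
    this.publish = publish;
    this.maxPending = maxPending;
  }

  /** Number of messages currently waiting to be sent. */
  get queued(): number {
    return this.pending.length;
  }

  /** Mark the connection as down; later sends are buffered. */
  setOffline(): void {
    this.online = false;
  }

  /** Mark the connection as up and try to drain the buffer in order. */
  async setOnline(): Promise<void> {
    this.online = true;
    while (this.online && this.pending.length > 0) {
      const next = this.pending.shift()!;
      try {
        await this.publish(next.channel, next.message);
      } catch {
        // Still failing: put the message back and wait for the next setOnline().
        this.pending.unshift(next);
        this.online = false;
      }
    }
  }

  /** Publish now if online; otherwise (or on failure) buffer instead of throwing. */
  async send(channel: string, message: string): Promise<void> {
    if (this.online) {
      try {
        await this.publish(channel, message);
        return;
      } catch {
        this.online = false;
      }
    }
    if (this.pending.length >= this.maxPending) {
      this.pending.shift(); // bounded buffer: drop the oldest message
    }
    this.pending.push({ channel, message });
  }
}
```

In this sketch the caller decides when to call setOffline() and setOnline(), for example from the error handler and after a successful reconnect. The buffer is bounded and drops the oldest message when full, which is a design choice for illustration; an unbounded queue or rejecting new messages would be equally valid depending on delivery requirements.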
Node.js Version
20.11.1
Redis Server Version
7.0.10
Node Redis Version
4.6.13
Platform
linux
Logs
No response