Shutting down non-leader pod starts leader jobs #1738

Closed
@pleshakov

Description

Describe the bug
When a non-leader pod is shut down, leader-only jobs such as telemetry reporting and status updating are, for some reason, started during the shutdown.

To Reproduce

  • Deploy NGF with multiple replicas
  • Watch the logs of a non-leader pod (kubectl logs -f).
  • Shut down the pod with kubectl delete pod.
  • Look for log entries like the ones below:
{"level":"info","ts":"2024-03-20T19:28:58Z","msg":"Stopping and waiting for non leader election runnables"}
{"level":"info","ts":"2024-03-20T19:28:58Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"secret","controllerGroup":"","controllerKind":"Secret"}
. . .
{"level":"info","ts":"2024-03-20T19:28:58Z","msg":"Stopping and waiting for leader election runnables"}

(the two lines below appear to come from the status updater, which should not have been started)

{"level":"info","ts":"2024-03-20T19:28:58Z","logger":"statusUpdater","msg":"Writing last statuses"}
{"level":"info","ts":"2024-03-20T19:28:58Z","logger":"statusUpdater","msg":"Updating Gateway API statuses"}
. . .

(the two lines below correspond to telemetry reporting, which should not have been started)

{"level":"info","ts":"2024-03-20T19:28:58Z","logger":"telemetryJob","msg":"Starting cronjob"}
{"level":"error","ts":"2024-03-20T19:28:58Z","logger":"telemetryJob","msg":"Failed to collect telemetry data"," ...
. . .
{"level":"info","ts":"2024-03-20T19:28:58Z","logger":"telemetryJob","msg":"Stopping cronjob"}
. . .
{"level":"info","ts":"2024-03-20T19:28:58Z","msg":"Wait completed, proceeding to shutdown the manager"}
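To check a captured log for this symptom without reading it by eye, something like the following Go sketch could help. It only assumes what the excerpt above shows: that the logs are JSON lines and that the leader-only jobs log under the "statusUpdater" and "telemetryJob" logger names. The function name leaderJobMessages is made up for illustration and is not part of NGF.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds the two JSON fields this check cares about.
type logEntry struct {
	Logger string `json:"logger"`
	Msg    string `json:"msg"`
}

// leaderLoggers lists the logger names seen for leader-only jobs in the
// excerpt above. Other leader-only loggers may exist; this is an assumption.
var leaderLoggers = map[string]bool{"statusUpdater": true, "telemetryJob": true}

// leaderJobMessages (hypothetical helper) returns "logger: msg" for every
// line logged by a leader-only logger. Lines that are not valid JSON, such
// as ". . ." elisions or blank lines, are skipped.
func leaderJobMessages(logs string) []string {
	var found []string
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		var e logEntry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			continue
		}
		if leaderLoggers[e.Logger] {
			found = append(found, e.Logger+": "+e.Msg)
		}
	}
	return found
}

func main() {
	logs := `{"level":"info","ts":"2024-03-20T19:28:58Z","msg":"Stopping and waiting for leader election runnables"}
. . .
{"level":"info","ts":"2024-03-20T19:28:58Z","logger":"statusUpdater","msg":"Writing last statuses"}
{"level":"info","ts":"2024-03-20T19:28:58Z","logger":"telemetryJob","msg":"Starting cronjob"}`
	for _, m := range leaderJobMessages(logs) {
		fmt.Println(m)
	}
}
```

Any output for a pod that never held the lease indicates the behavior described in this issue.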

Expected behavior

Leader jobs should not start during the shutdown of a non-leader pod.

Your environment
NGF - edge, 5b13734

Metadata

Labels

bug (Something isn't working), tracking (To track external issues or changes that will affect NKG)
