NGF Pod cannot recover if NGINX master process fails without cleaning up #1108

Closed
@bjee19

Describe the bug
When the NGINX master process is killed without a chance to clean up (for example, with kill -9 <nginx-master-pid>), the NGF Pod cannot recover because the new NGINX container cannot start.

To Reproduce
Steps to reproduce the behavior:

  1. Change runAsNonRoot from true to false in deploy/manifests/nginx-gateway.yaml
  2. Deploy and expose NGF
  3. Insert an ephemeral container into the NGF Pod using this command: kubectl debug -it -n nginx-gateway <NGF_POD> --image=busybox:1.28 --target=nginx-gateway
  4. Run kill -9 <nginx-master-pid> in the ephemeral container
  5. Check the logs of the nginx container in a different terminal by running: kubectl logs -f -n nginx-gateway <NGF_POD> -c nginx
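
Step 4 needs the PID of the NGINX master process. Below is a minimal sketch for finding it from inside the ephemeral container. It assumes the nginx processes are visible there (as the steps above imply), that `ps` prints the PID in the first column, and that the master keeps NGINX's default process title (`nginx: master process ...`). The function name `nginx_master_pid` is made up for illustration.

```shell
# Print the PID of the NGINX master process, reading `ps` output on stdin.
# Assumptions: the PID is the first column, and the master's command line
# contains "nginx: master process" (NGINX's default process title).
# The [m] in the pattern keeps awk from matching its own command line
# if it shows up in the ps listing.
nginx_master_pid() {
  awk '/nginx: [m]aster process/ { print $1; exit }'
}

# Usage inside the ephemeral container (not run here):
#   ps | nginx_master_pid
#   kill -9 "$(ps | nginx_master_pid)"
```

If the function prints nothing, the master process is not visible from the ephemeral container, and the kill in step 4 has no target.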

Expected behavior
The NGINX container should restart and the NGF Pod should recover.

Your environment

  • Version of the NGINX Gateway Fabric - "version":"edge","commit":"72b6c6ef8915c697626eeab88fdb6a3ce15b8da0"
  • Version of Kubernetes - 1.27
  • Kubernetes platform (e.g. Minikube or GCP) - GKE
  • Details on how you expose the NGINX Gateway Fabric Pod - LoadBalancer

Additional context
Log file of nginx container showing error:
[image: screenshot of the nginx container log]

Labels

bug (Something isn't working)
