Closed
Description
Describe the bug
When the NGINX master process fails without cleaning up (for example, after kill -9 <nginx-master-pid>), the NGF Pod cannot recover because the new NGINX container cannot start.
To Reproduce
Steps to reproduce the behavior:
- Change runAsNonRoot from true to false in deploy/manifests/nginx-gateway.yaml
- Deploy and expose NGF
- Insert an ephemeral container into the NGF Pod using this command:
kubectl debug -it -n nginx-gateway <NGF_POD> --image=busybox:1.28 --target=nginx-gateway
- Run kill -9 <nginx-master-PID> in the ephemeral container
- Check the logs of the nginx container in a different terminal by running:
kubectl logs -f -n nginx-gateway <NGF_POD> -c nginx
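For anyone reproducing this, here is a minimal sketch of one way to find <nginx-master-PID> inside the ephemeral container. It assumes the NGINX processes are visible from the ephemeral container and that the master process shows up in ps with the usual "nginx: master process" title. The helper name nginx_master_pid is hypothetical and not part of NGF.

```shell
# Hypothetical helper: read `ps` output on stdin and print the PID of the
# first process whose command line contains "nginx: master process".
# The [n] bracket keeps this awk command itself from matching if it
# happens to appear in the ps listing.
nginx_master_pid() {
  awk '/[n]ginx: master process/ { print $1; exit }'
}

# Usage inside the ephemeral container (assumes PID is the first ps column,
# as in busybox ps):
#   pid=$(ps | nginx_master_pid)
#   kill -9 "$pid"
```

If the helper prints nothing, the master process is not visible from the ephemeral container, and the kill step cannot be run from there.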
Expected behavior
The NGINX container should restart and the NGF Pod should recover.
Your environment
- Version of the NGINX Gateway Fabric - "version":"edge","commit":"72b6c6ef8915c697626eeab88fdb6a3ce15b8da0"
- Version of Kubernetes - 1.27
- Kubernetes platform (e.g. Mini-kube or GCP) - GKE
- Details on how you expose the NGINX Gateway Fabric Pod - LoadBalancer
Additional context
Log file of nginx container showing error: