Commit a871c8f

fix structure of the document
1 parent eb16999 commit a871c8f

1 file changed

+81
-85
lines changed

site/content/how-to/monitoring/troubleshooting.md

+81-85
@@ -10,17 +10,6 @@ docs: "DOCS-1419"
1010
This topic describes possible issues users might encounter when using NGINX Gateway Fabric. When possible, suggested workarounds are provided.
1111

1212

13-
### Common Issues
14-
15-
{{< bootstrap-table "table table-striped table-bordered" >}}
16-
| Problem Area | Symptom | Troubleshooting Method | Common Cause |
17-
|------------------------------|----------------------------------------|------------------------------------------------------ |--------------------------------------------------------|
18-
| Startup | NGINX Gateway Fabric fails to start. | Check logs for _nginx_ and _nginx-gateway_ container. | Missing default server TLS secret. |
19-
| Resources not configured | Status missing on resources. | Check referenced resources. | Referenced resources do not belong to NGINX Gateway Fabric. |
20-
| NGINX errors | Reload failures on NGINX | Fix permissions for control plane | Security context not configured. |
21-
| Usage reporting | Error logs related to usage reporting | Enable usage reporting | Usage reporting disabled. |
22-
{{< /bootstrap-table >}}
23-
2413
### General troubleshooting
2514

2615
When investigating a problem or requesting help, there are important data points that can be collected to help understand what issues may exist.
@@ -91,23 +80,25 @@ kubectl exec -it -n nginx-gateway <ngf-pod-name> -c nginx /bin/sh
9180

9281
Logs from the NGINX Gateway Fabric control plane and data plane can contain information that isn't available to status or events. These can include errors in processing or passing traffic.
9382

94-
1. To see logs for the:
83+
##### Container Logs
9584

96-
- Control plane container
85+
To see logs for the control plane container:
9786

9887
```shell
9988
kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx-gateway
10089
```
10190

102-
- Data plane container
91+
To see logs for the data plane container:
10392

10493
```shell
10594
kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx
10695
```
10796

108-
1. To only see error logs for control plane and data plane containers:
97+
##### Error Logs
10998

110-
For _nginx-gateway_ container, you can `grep` for the word `error` or change the log level to `error` by following steps in [Modify logging levels](#modify-logging-levels). Once you have modified log levels, you can `grep` for the word `debug` to check debug logs for further investigation.
99+
To see error logs for the control plane and data plane containers:
100+
101+
For the _nginx-gateway_ container, you can `grep` for the word `error` or change the log level to `error` by following the steps in [Modify log levels](#modify-log-levels). After changing the log level to `debug`, you can `grep` for the word `debug` to check debug logs for further investigation.
111102

112103
```shell
113104
kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx-gateway | grep error
@@ -134,79 +125,16 @@ kubectl logs -n nginx-gateway ngf-nginx-gateway-fabric-bb8598998-jwk2m -c nginx
134125
2024/06/13 20:04:17 [emerg] 27#27: too long parameter, probably missing terminating """ character in /etc/nginx/conf.d/http.conf:78
135126
```
136127

137-
1. NGINX access logs are files that record all requests processed by the NGINX server. These logs provide detailed information about each request, which can be useful for troubleshooting, and analyzing web traffic.
138-
To view the access logs, get shell access to your NGINX container using the [steps](#get-shell-access-to-nginx-container). The access logs are located in the file `/var/log/nginx/access.log` in the NGINX container.
139-
Another method to check access logs is by reviewing the container logs for _nginx_ using:
128+
##### Access Logs
140129

141-
```shell
142-
kubectl logs -n nginx-gateway <ngf-pod-name> -c nginx
143-
```
130+
NGINX access logs are files that record all requests processed by the NGINX server. These logs provide detailed information about each request, which can be useful for troubleshooting and analyzing web traffic.
131+
To view the access logs, get shell access to your NGINX container using these [steps](#get-shell-access-to-nginx-container). The access logs are located in the file `/var/log/nginx/access.log` in the NGINX container, and they also appear in the logs for the _nginx_ container.
144132

145133
You can see logs for a crashed or killed container by adding the `-p` flag to the above commands.
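
As a minimal sketch of the `-p` usage described above, the hypothetical helper below wraps the log command from this guide and appends `-p` (short for `--previous`). The function name `ngf_previous_logs` and the `KUBECTL` override variable are assumptions introduced for illustration; they are not part of NGINX Gateway Fabric.

```shell
# Hypothetical helper: print logs from the previous instance of a container.
# Arguments: $1 = NGINX Gateway Fabric Pod name, $2 = container name (nginx or nginx-gateway).
# KUBECTL is an assumed override for the kubectl binary; it defaults to kubectl.
ngf_previous_logs() {
  "${KUBECTL:-kubectl}" -n nginx-gateway logs "$1" -c "$2" -p
}
```

For example, `ngf_previous_logs <ngf-pod-name> nginx` would show data plane logs from before the last restart of the _nginx_ container.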
146134

147-
##### NGINX Gateway Fabric Pod is not running or ready
148-
149-
To understand why the NGINX Gateway Fabric Pod has not started running or is not ready, the first step is to check the state of the Pod to get detailed information about the current status and events happening in the Pod. To do this, use `kubectl describe`:
150-
151-
```shell
152-
kubectl describe pod <ngf-pod-name> -n nginx-gateway
153-
```
154-
155-
The Pod description includes details about the image name, tags, current status, and environment variables. Verify that these details match your setup and cross-check with the events to ensure everything is functioning as expected. For example, the Pod below has two containers that are running and the events reflect the same.
156-
157-
```text
158-
Containers:
159-
nginx-gateway:
160-
Container ID: containerd://06c97a9de938b35049b7c63e251418395aef65dd1ff996119362212708b79cab
161-
Image: nginx-gateway-fabric
162-
Image ID: docker.io/library/import-2024-06-13@sha256:1460d63bd8a352a6e455884d7ebf51ce9c92c512cb43b13e44a1c3e3e6a08918
163-
Ports: 9113/TCP, 8081/TCP
164-
Host Ports: 0/TCP, 0/TCP
165-
State: Running
166-
Started: Thu, 13 Jun 2024 11:47:46 -0600
167-
Ready: True
168-
Restart Count: 0
169-
Readiness: http-get http://:health/readyz delay=3s timeout=1s period=1s #success=1 #failure=3
170-
Environment:
171-
POD_IP: (v1:status.podIP)
172-
POD_NAMESPACE: nginx-gateway (v1:metadata.namespace)
173-
POD_NAME: ngf-nginx-gateway-fabric-66dd665756-zh7d7 (v1:metadata.name)
174-
nginx:
175-
Container ID: containerd://c2f3684fd8922e4fac7d5707ab4eb5f49b1f76a48893852c9a812cd6dbaa2f55
176-
Image: nginx-gateway-fabric/nginx
177-
Image ID: docker.io/library/import-2024-06-13@sha256:c9a02cb5665c6218373f8f65fc2c730f018d0ca652ae827cc913a7c6e9db6f45
178-
Ports: 80/TCP, 443/TCP
179-
Host Ports: 0/TCP, 0/TCP
180-
State: Running
181-
Started: Thu, 13 Jun 2024 11:47:46 -0600
182-
Ready: True
183-
Restart Count: 0
184-
Environment: <none>
185-
Events:
186-
Type Reason Age From Message
187-
---- ------ ---- ---- -------
188-
Normal Scheduled 40s default-scheduler Successfully assigned nginx-gateway/ngf-nginx-gateway-fabric-66dd665756-zh7d7 to kind-control-plane
189-
Normal Pulled 40s kubelet Container image "nginx-gateway-fabric" already present on machine
190-
Normal Created 40s kubelet Created container nginx-gateway
191-
Normal Started 39s kubelet Started container nginx-gateway
192-
Normal Pulled 39s kubelet Container image "nginx-gateway-fabric/nginx" already present on machine
193-
Normal Created 39s kubelet Created container nginx
194-
Normal Started 39s kubelet Started container nginx
195-
```
196-
197-
198-
### Modify logging levels
199-
200-
To debug NGINX Gateway Fabric, enable verbose logging by editing the `NginxGateway` configuration. This can be done either before or after deploying NGINX Gateway Fabric. Refer to this [guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/configuration/control-plane-configuration) to do so.
201-
202-
### NGINX fails to reload
203-
204-
#### Description
135+
##### Modify Log Levels
205136

206-
NGINX reload errors can occur for various reasons, including syntax errors in configuration files, permission issues, and more. To determine if NGINX has failed to reload, check logs for your _nginx-gateway_ and _nginx_ containers.
207-
You will see the following error in the _nginx-gateway_ logs `failed to reload NGINX:` followed by the reason for the failure. Similarly, error logs in _nginx_ container start with `emerg`. For example, `2024/06/12 14:25:11 [emerg] 12345#0: open() "/var/run/nginx.pid" failed (13: Permission denied)` shows a critical error, such as a permission problem preventing NGINX from accessing necessary files.
208-
209-
To debug why your reload has failed, start with verifying the syntax of your configuration files by opening a shell in the NGINX container following these [steps](#get-shell-access-to-nginx-container) and running `nginx -T`. If there are errors in your configuration file, the reload will fail and specify why it has failed.
137+
To see debug logs for the control plane in NGINX Gateway Fabric, enable verbose logging by editing the `NginxGateway` configuration. You can do this either before or after deploying NGINX Gateway Fabric. Refer to this [guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/configuration/control-plane-configuration) for instructions.
210138

211139
### Understanding the generated NGINX config
212140

@@ -348,7 +276,9 @@ Handling connection for 8080
348276
</body>
349277
```
350278

351-
**Warning** The configuration may change in future releases.
279+
{{< caution >}}
280+
The configuration may change in future releases. This configuration is valid for version 1.3.
281+
{{< /caution >}}
352282

353283
#### Metrics for Troubleshooting
354284

@@ -365,6 +295,72 @@ Note that, the port number i.e `8080` matches the port number you have port-forw
365295

366296
### Common Errors
367297

298+
{{< bootstrap-table "table table-striped table-bordered" >}}
299+
| Problem Area | Symptom | Troubleshooting Method | Common Cause |
300+
|------------------------------|----------------------------------------|---------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|
301+
| Startup | NGINX Gateway Fabric fails to start. | Check logs for _nginx_ and _nginx-gateway_ container. | Missing TLS key and cert for SSL servers. |
302+
| Resources not configured | Status missing on resources. | Check referenced resources. | Referenced resources do not belong to NGINX Gateway Fabric. |
303+
| NGINX errors | Reload failures on NGINX | Fix permissions for control plane. | Security context not configured. |
304+
| Usage reporting | Error logs related to usage reporting | Enable usage reporting. Refer to [Usage Reporting]({{< relref "installation/usage-reporting.md" >}}). | Usage reporting disabled. |
305+
{{< /bootstrap-table >}}
306+
307+
##### NGINX fails to reload
308+
309+
NGINX reload errors can occur for various reasons, including syntax errors in configuration files, permission issues, and more. To determine if NGINX has failed to reload, check logs for your _nginx-gateway_ and _nginx_ containers.
310+
You will see the following error in the _nginx-gateway_ logs: `failed to reload NGINX:` followed by the reason for the failure. Similarly, reload errors in the _nginx_ container logs are marked with the `[emerg]` level. For example, `2024/06/12 14:25:11 [emerg] 12345#0: open() "/var/run/nginx.pid" failed (13: Permission denied)` shows a critical error, in this case a permission problem preventing NGINX from accessing a file it needs.
311+
312+
To debug why your reload has failed, start with verifying the syntax of your configuration files by opening a shell in the NGINX container following these [steps](#get-shell-access-to-nginx-container) and running `nginx -T`. If there are errors in your configuration file, the reload will fail and specify the reason for it.
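
The two log markers mentioned above can be isolated with fixed-string `grep` filters. The sketch below is illustrative only; the function names `filter_reload_failures` and `filter_emerg` are hypothetical, and both simply read log text from standard input.

```shell
# Keep only control plane lines that report a failed NGINX reload.
filter_reload_failures() {
  grep -F 'failed to reload NGINX:'
}

# Keep only data plane lines logged at the emerg level.
# -F treats the pattern as a literal string, so the brackets are not a character class.
filter_emerg() {
  grep -F '[emerg]'
}
```

You could pipe container logs into either filter, for example `kubectl -n nginx-gateway logs <ngf-pod-name> -c nginx | filter_emerg`.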
313+
314+
##### NGINX Gateway Fabric Pod is not running or ready
315+
316+
To understand why the NGINX Gateway Fabric Pod has not started running or is not ready, the first step is to check the state of the Pod to get detailed information about the current status and events happening in the Pod. To do this, use `kubectl describe`:
317+
318+
```shell
319+
kubectl describe pod <ngf-pod-name> -n nginx-gateway
320+
```
321+
322+
The Pod description includes details about the image name, tags, current status, and environment variables. Verify that these details match your setup and cross-check with the events to ensure everything is functioning as expected. For example, the Pod below has two containers that are running and the events reflect the same.
323+
324+
```text
325+
Containers:
326+
nginx-gateway:
327+
Container ID: containerd://06c97a9de938b35049b7c63e251418395aef65dd1ff996119362212708b79cab
328+
Image: nginx-gateway-fabric
329+
Image ID: docker.io/library/import-2024-06-13@sha256:1460d63bd8a352a6e455884d7ebf51ce9c92c512cb43b13e44a1c3e3e6a08918
330+
Ports: 9113/TCP, 8081/TCP
331+
Host Ports: 0/TCP, 0/TCP
332+
State: Running
333+
Started: Thu, 13 Jun 2024 11:47:46 -0600
334+
Ready: True
335+
Restart Count: 0
336+
Readiness: http-get http://:health/readyz delay=3s timeout=1s period=1s #success=1 #failure=3
337+
Environment:
338+
POD_IP: (v1:status.podIP)
339+
POD_NAMESPACE: nginx-gateway (v1:metadata.namespace)
340+
POD_NAME: ngf-nginx-gateway-fabric-66dd665756-zh7d7 (v1:metadata.name)
341+
nginx:
342+
Container ID: containerd://c2f3684fd8922e4fac7d5707ab4eb5f49b1f76a48893852c9a812cd6dbaa2f55
343+
Image: nginx-gateway-fabric/nginx
344+
Image ID: docker.io/library/import-2024-06-13@sha256:c9a02cb5665c6218373f8f65fc2c730f018d0ca652ae827cc913a7c6e9db6f45
345+
Ports: 80/TCP, 443/TCP
346+
Host Ports: 0/TCP, 0/TCP
347+
State: Running
348+
Started: Thu, 13 Jun 2024 11:47:46 -0600
349+
Ready: True
350+
Restart Count: 0
351+
Environment: <none>
352+
Events:
353+
Type Reason Age From Message
354+
---- ------ ---- ---- -------
355+
Normal Scheduled 40s default-scheduler Successfully assigned nginx-gateway/ngf-nginx-gateway-fabric-66dd665756-zh7d7 to kind-control-plane
356+
Normal Pulled 40s kubelet Container image "nginx-gateway-fabric" already present on machine
357+
Normal Created 40s kubelet Created container nginx-gateway
358+
Normal Started 39s kubelet Started container nginx-gateway
359+
Normal Pulled 39s kubelet Container image "nginx-gateway-fabric/nginx" already present on machine
360+
Normal Created 39s kubelet Created container nginx
361+
Normal Started 39s kubelet Started container nginx
362+
```
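
To spot containers that are not ready without reading the whole description, the output can be filtered for its `Ready:` fields. This is a rough sketch that assumes the `kubectl describe` layout shown in the example above; the function name `count_not_ready` is hypothetical.

```shell
# Count container entries whose "Ready:" field is not "True".
# Reads `kubectl describe pod` output from standard input.
count_not_ready() {
  awk '$1 == "Ready:" && $2 != "True" { n++ } END { print n + 0 }'
}
```

For example, `kubectl describe pod <ngf-pod-name> -n nginx-gateway | count_not_ready` would print `0` for the Pod shown above, because both containers report `Ready: True`.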
363+
368364
##### Insufficient Privileges errors
369365

370366
Depending on your environment's configuration, the control plane may not have the proper permissions to reload NGINX. The NGINX configuration will not be applied and you will see the following error in the _nginx-gateway_ logs:
