Description
As a user of NGF
I want the data and control planes separated
So that a security vulnerability or attack that compromises the control plane or the data plane does not affect the other plane,
And so that I can easily scale the number of data planes independently of the control plane in NGF in the future.
Background
To reach a working implementation quickly and to build an understanding of the Gateway API and its alignment with NGINX as a data plane, we decided on a simplified, but rigid, deployment pattern. To improve our security posture and installation flexibility, the control and data planes should be separated into semi-autonomous, distributed components.
Problems
- Control plane and data plane containers run in a single Kubernetes Pod (see the sketch after this list).
- Control plane and data plane containers exchange data through OS signals and a shared file system.
- Control plane and data plane are governed by the same RBAC policies because they share a Pod and ServiceAccount.
- Control plane and data plane must scale together and cannot scale on independent axes.
- Compromise of the control plane may impact customer traffic in the data plane.
- Compromise of the data plane may expose the Kubernetes API server, impact the cluster, and allow lateral movement within the network.
- Kubernetes Secrets and other sensitive data are shared across containers unnecessarily.
- Violation of a basic zero-trust tenet: "The data plane and control plane are logically separated." (NIST SP 800-207)
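For illustration, a minimal sketch of the current co-located layout described above; container names, images, and volume names are illustrative and not the exact manifests rendered by the Helm chart:

```yaml
# Illustrative only: both planes in one Pod, one ServiceAccount,
# config exchanged over a shared emptyDir volume.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-gateway
spec:
  replicas: 1            # scaling this scales both planes together
  selector:
    matchLabels:
      app: nginx-gateway
  template:
    metadata:
      labels:
        app: nginx-gateway
    spec:
      serviceAccountName: nginx-gateway   # shared RBAC identity for both containers
      shareProcessNamespace: true         # lets the control plane signal nginx to reload
      containers:
        - name: nginx-gateway             # control plane: watches the Kubernetes API
          image: nginx-gateway-fabric:edge
          volumeMounts:
            - name: nginx-conf
              mountPath: /etc/nginx/conf.d
        - name: nginx                     # data plane: serves customer traffic
          image: nginx:stable
          volumeMounts:
            - name: nginx-conf
              mountPath: /etc/nginx/conf.d
      volumes:
        - name: nginx-conf
          emptyDir: {}                    # file-system sharing between the planes
```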
Acceptance Criteria
- NGINX Gateway Fabric's data and control planes exist independently in separate Pods within a Kubernetes deployment.
- Data plane scales automatically with each additional Gateway object present in the cluster (see the sketch after this list).
- NGINX Agent configuration is exposed through Gateway API extensions.
  - Does not need to be the same extension.
- User is able to configure their environment to send OpenTelemetry (OTel) metrics to NGINX One.
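A rough sketch of the target relationship, assuming the control plane provisions one nginx Deployment per Gateway; the resource names, labels, image tag, and provisioning mechanism are assumptions for illustration, not the final design:

```yaml
# Illustrative only: a user-created Gateway ...
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: store-gateway
  namespace: shop
spec:
  gatewayClassName: nginx
  listeners:
    - name: http
      port: 80
      protocol: HTTP
---
# ... and the nginx data-plane Deployment the control plane would
# provision for it in the same namespace (hypothetical naming).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: store-gateway-nginx
  namespace: shop
spec:
  replicas: 1                              # data plane scales independently of the control plane
  selector:
    matchLabels:
      gateway: store-gateway
  template:
    metadata:
      labels:
        gateway: store-gateway
    spec:
      serviceAccountName: store-gateway-nginx   # minimal, data-plane-only RBAC
      containers:
        - name: nginx
          image: nginx-with-agent:edge           # NGINX plus Agent container (illustrative tag)
          ports:
            - containerPort: 80
```

With this shape, scaling customer traffic means scaling or adding nginx Deployments, while the control plane Deployment is untouched.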
Tasks
- Prototype for NGINX Agent Connection and Registration #1679
- Design for Data and Control Plane Separation #2655
- Update Upstreams with State Files for Dynamic Upstream Configuration #2841
- Remove NGINX container and runtime manager code from NGF deployment #2838
- Create NGINX Agent Docker Container #2839
- Temporarily Update Helm Chart to Deploy the Data Plane #2840
- PoC: NGINX Plus support with agent #2233
- Connect Agent to Control Plane #2851
- Update Data Plane Config via Agent #2842
- Control Leader Election with Agent #2850
- Manual Scale Test for Data and Control Plane Separation #3011
- Secure connection between agent and control plane #2843
- Add per-Gateway support to NginxProxy #2990
- Deploy the Data Plane with Control Plane #2844
- Duplicate NGINX Plus and docker registry secrets into namespaces where nginx is deployed #2845
- Support for Multiple NGF Gateways #1443
- Multiple Gateways - optimize reloading #3272
- Ensure NGF and nginx can deploy in OpenShift #3064
- Support NGINX Debug Mode when Provisioning Data Plane #3115
- Update Prometheus docs and Grafana dashboard #3065
- Update Non-Functional and Functional Tests for New Architecture #3010
- Functional Tests for Split Data and Control Plane Architecture #3116
- Update Conformance Tests with Data and Control Plane Split #2704
- Update existing docs to reflect new architecture #2819
- Product Telemetry for Number of Data Plane Pods #3118
- Support provisioning nginx as DaemonSet #3120
- Support FileStream to NGINX agent #3315
Questions for Discussion
- What needs to be included in the design? Does one already exist?
- Deployment architecture
- Communication channels / protocols
- Authentication and authorization
- Data plane registration and scaling
- Should the data plane scale horizontally for a single Gateway? How can this be controlled by the user? (One possible shape is sketched after this list.)
- How can we divide the work to achieve the separation?
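One possible shape for the per-Gateway scaling question above, assuming a hypothetical replicas field on an NginxProxy-style resource attached via the Gateway's infrastructure.parametersRef; the field names and API version are assumptions for discussion, not an existing API:

```yaml
# Hypothetical: per-Gateway data plane sizing via an NginxProxy-style resource.
apiVersion: gateway.nginx.org/v1alpha2
kind: NginxProxy
metadata:
  name: store-gateway-proxy
  namespace: shop
spec:
  kubernetes:
    deployment:
      replicas: 3        # hypothetical knob: horizontal scale of one Gateway's nginx Pods
---
# The Gateway would reference it through infrastructure.parametersRef.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: store-gateway
  namespace: shop
spec:
  gatewayClassName: nginx
  infrastructure:
    parametersRef:
      group: gateway.nginx.org
      kind: NginxProxy
      name: store-gateway-proxy
  listeners:
    - name: http
      port: 80
      protocol: HTTP
```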