Add data plane provisioner to support conformance tests #634

Closed
@pleshakov

Description

Parent issue -- #305

Implementation PRs (the issue requires multiple PRs)

Overview

This issue introduces a new component -- the provisioner -- to unblock running the conformance tests (short term) and to take care of data plane management for our users (long term).

Background

The Gateway API prescribes that Gateway implementations support more than one Gateway resource: for every deployed Gateway resource, the implementation must process it and configure the data plane accordingly. The Gateway API conformance tests implicitly assume this.

Our implementation doesn't support multiple Gateways, which blocks us from running the conformance tests.

We have considered a few approaches to unblock the conformance tests:

  • Modify the conformance tests so that they don't require support for multiple Gateways. The discussion happened here and in the community meetings. Our proposals didn't gain enough support to change the tests.
  • Modify NKG so that it supports multiple Gateways:
    • Merging Gateways. With this approach, NKG merges the listeners from all Gateways. We don't pursue this approach because (1) merging is not well defined by the Gateway API spec yet and (2) the listeners in the conformance tests conflict (from the point of view of a single Gateway) -- listeners with the same port, protocol and hostname appear across multiple Gateways.
    • Merging Gateways in data plane configuration. With this approach, NKG processes Gateways independently but then generates a shared data plane configuration for all Gateways, which resolves any conflicts. This approach was suggested here. We don't pursue this approach because, while it can unblock the conformance tests, it allows conflicting routing rules and listeners to propagate to the data plane configuration.
    • Logical data planes in a single data plane configuration. With this approach (1) each Gateway gets dedicated ports in the data plane configuration that don't overlap with the ports of the other Gateways and (2) NKG provisions a dedicated Service for each Gateway so that clients can access the logical data plane via a dedicated IP. This approach was prototyped here. We don't pursue it because (a) it requires significant changes to how NKG generates NGINX configuration and (b) we're not confident we want to support multiple Gateways this way going forward for production use.
    • Data plane provisioner. With this approach, a separate component -- the provisioner -- provisions an NKG deployment for each Gateway. We choose this approach because (a) most of the other Gateway API implementations went this way, (b) we believe there will be demand for it in the future from our users and (c) at the same time, it can be implemented in isolation from the core NKG code, so we can easily get rid of it later if we change course. This approach was prototyped here.

High-Level Requirements

We introduce a new component -- the provisioner, which will take care of provisioning NKG deployments for Gateways of the configured GatewayClass. It will be a Kubernetes controller that watches for changes in Gateway resources that belong to its GatewayClass and ensures that for each Gateway, a corresponding NKG deployment is running in the cluster. The provisioner will own the GatewayClass resource -- it will report its status.

At the same time, NKG gets extended so that:

  • It supports processing a specific Gateway for which it was provisioned by the provisioner.
  • It supports disabling reporting the status of the GatewayClass resource.

Detailed Requirements

Important note -- the existing way of handling multiple Gateways is preserved and remains the recommended one. The provisioner is introduced only to unblock running the conformance tests and is not recommended for production use (for now).

We introduce the provisioner in the same binary as NKG, so that we don't need to build a separate container. As a result, the binary will have two modes (commands):

  • Provisioner.
  • NKG - what we have now with a few extensions.
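
A minimal sketch of how one binary could dispatch between the two modes, using only the Go standard library. The command names `provisioner` and `nkg` are placeholders for illustration; this issue does not fix the actual names or the CLI library.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// mode identifies which component the shared binary runs as.
type mode string

const (
	modeProvisioner mode = "provisioner" // hypothetical command name
	modeNKG         mode = "nkg"         // hypothetical command name
)

// parseMode maps the first CLI argument to a mode.
func parseMode(args []string) (mode, error) {
	if len(args) == 0 {
		return "", errors.New(`missing command: expected "provisioner" or "nkg"`)
	}
	switch m := mode(args[0]); m {
	case modeProvisioner, modeNKG:
		return m, nil
	default:
		return "", fmt.Errorf("unknown command %q", args[0])
	}
}

func main() {
	m, err := parseMode(os.Args[1:])
	if err != nil {
		// Print usage instead of failing so the sketch runs without arguments.
		fmt.Println("usage: <binary> provisioner|nkg [flags]:", err)
		return
	}
	// A real binary would start the selected component here.
	fmt.Println("running in mode:", m)
}
```

Keeping both modes behind one entry point means a single container image serves the provisioner and every NKG deployment it creates.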

Provisioner

  • Processes a specific GatewayClass resource (its name is configured via a CLI argument, along with the controller name, which must match the controllerName field of the GatewayClass resource) and reports its status.
  • Processes Gateways that belong to its GatewayClass. For each Gateway resource:
    • on Create: Create an NKG deployment with one replica in the namespace where the provisioner is running.
    • on Update: Do nothing.
    • on Delete: Delete the corresponding deployment.
  • Doesn't report the status of Gateway resources - the corresponding NKG will do that.
  • Includes a warning in its log that it is not recommended to be used in production.
  • Resides in the same binary as NKG so that we don't need to build a separate container.
  • All NKG deployments share the same SA, ConfigMaps and other installation resources, only the deployments are different. Creating those resources will be a prerequisite to use the provisioner.
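
The Create/Update/Delete handling above can be sketched with an in-memory map standing in for the Kubernetes API. All type names, function names and the deployment naming scheme below are assumptions made for illustration; a real implementation would use a Kubernetes client and controller machinery instead.

```go
package main

import (
	"fmt"
	"log"
)

type eventType int

const (
	eventCreate eventType = iota
	eventUpdate
	eventDelete
)

// gatewayKey identifies a Gateway resource.
type gatewayKey struct {
	Namespace, Name string
}

// deployment is a simplified stand-in for a Kubernetes Deployment.
type deployment struct {
	Name      string
	Namespace string
	Replicas  int
}

// provisioner tracks one NKG deployment per Gateway of its GatewayClass.
type provisioner struct {
	gatewayClass string // GatewayClass this provisioner owns
	namespace    string // namespace where the provisioner runs
	deployments  map[gatewayKey]deployment
}

func newProvisioner(gatewayClass, namespace string) *provisioner {
	return &provisioner{
		gatewayClass: gatewayClass,
		namespace:    namespace,
		deployments:  make(map[gatewayKey]deployment),
	}
}

// handle applies the rules from the list above to a single Gateway event.
func (p *provisioner) handle(ev eventType, gw gatewayKey, gwClass string) {
	if gwClass != p.gatewayClass {
		return // the Gateway belongs to another GatewayClass
	}
	switch ev {
	case eventCreate:
		if _, exists := p.deployments[gw]; exists {
			return
		}
		p.deployments[gw] = deployment{
			// Hypothetical naming scheme; not specified in this issue.
			Name:      fmt.Sprintf("nkg-%s-%s", gw.Namespace, gw.Name),
			Namespace: p.namespace,
			Replicas:  1,
		}
	case eventUpdate:
		// Do nothing: the provisioned NKG handles Gateway updates itself.
	case eventDelete:
		delete(p.deployments, gw)
	}
}

func main() {
	log.Println("warning: the provisioner is not recommended for production use")
	p := newProvisioner("nginx", "nginx-gateway")
	gw := gatewayKey{Namespace: "default", Name: "gateway-1"}
	p.handle(eventCreate, gw, "nginx")
	fmt.Println("deployments after create:", len(p.deployments))
	p.handle(eventDelete, gw, "nginx")
	fmt.Println("deployments after delete:", len(p.deployments))
}
```

The sketch keeps no state beyond the map, in line with the reduced scope: it does not recover from restarts or validate input.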

NKG

  • Supports configuring a single Gateway resource to watch (via a CLI argument).
  • Supports not reporting the GatewayClass status, so that it doesn't conflict with the provisioner.
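
A sketch of how the two NKG extensions could surface as command-line flags, using Go's standard `flag` package. The flag names `--gateway` and `--update-gatewayclass-status`, the `namespace/name` value format, and the `config` struct are assumptions for illustration only; the actual arguments are left to the implementation PRs.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// config holds the settings relevant to the two extensions above.
type config struct {
	GatewayNamespace         string // empty means no specific Gateway is configured
	GatewayName              string
	UpdateGatewayClassStatus bool
}

// parseFlags parses the hypothetical flags into a config.
func parseFlags(args []string) (config, error) {
	fs := flag.NewFlagSet("nkg", flag.ContinueOnError)
	gateway := fs.String("gateway", "",
		"Gateway to watch, in the form namespace/name (hypothetical flag)")
	updateGC := fs.Bool("update-gatewayclass-status", true,
		"report the GatewayClass status (hypothetical flag)")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}

	cfg := config{UpdateGatewayClassStatus: *updateGC}
	if *gateway != "" {
		ns, name, found := strings.Cut(*gateway, "/")
		if !found || ns == "" || name == "" || strings.Contains(name, "/") {
			return config{}, fmt.Errorf("invalid --gateway %q: expected namespace/name", *gateway)
		}
		cfg.GatewayNamespace, cfg.GatewayName = ns, name
	}
	return cfg, nil
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Under these assumptions, the provisioner would start each NKG deployment with `--gateway=<namespace>/<name>` and `--update-gatewayclass-status=false`, so that only the provisioner writes the GatewayClass status.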

Out of Scope

For now, we will not address many concerns that will be required for production use of the provisioner:

  • High availability.
  • Robustness. No validation of user input (Gateway API resources) unless it is needed for the conformance tests.
  • Performance.
  • Configurability. No need to support customization of the provisioned NKG deployments.
  • Reliability. Handling restarts and failures is not necessary. Terminating for error handling is acceptable.

Note: However, while the scope is reduced, we will not compromise on the software quality, including design, implementation and testing.

Note: We leave open the option to re-evaluate whether supporting multiple Gateways via the provisioner is the optimal approach, and we will remove the provisioner if we find a better one.

Acceptance Criteria

  • The provisioner is introduced and NKG is extended to support the requirements above.
  • The new command-line arguments are documented. However, the installation of the provisioner is not documented in the installation docs, to discourage usage outside of development.
  • A new folder is introduced in the repo for developer docs -- docs for the developers of the NKG project. Add the provisioner installation and running instructions there.

Metadata

Labels

  • area/control-plane: General control plane issues
  • conformance: Relates to passing Gateway API conformance tests
  • refined: Requirements are refined and the issue is ready to be implemented.