controller histogram buckets configuration #258

Closed
@terinjokes

Description

Metrics are created for the controller's workqueue, using Prometheus's default buckets. Unfortunately, the default buckets are poorly chosen for event processing.

The default buckets are tailored to broadly measure the response time (in seconds) of a network service.

This can easily result in metrics that are extremely coarse. In a controller I was working on today, every single reconcile was faster than the smallest bucket:

controller_runtime_reconcile_time_seconds_bucket{controller="application",le="0.005"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="0.01"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="0.025"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="0.05"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="0.1"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="0.25"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="0.5"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="1"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="2.5"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="5"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="10"} 31
controller_runtime_reconcile_time_seconds_bucket{controller="application",le="+Inf"} 31
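
To illustrate why this output carries no information, here is a minimal, stdlib-only sketch. It assumes made-up reconcile durations that are all under 5ms, copies the default upper bounds from the `le` labels above, and uses a hypothetical `exponentialBuckets` helper to build finer bounds. It is not the library's code, only a model of how cumulative bucket counts behave.

```go
package main

import "fmt"

// defaultBuckets copies the upper bounds (the le label values) from the output above.
var defaultBuckets = []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10}

// exponentialBuckets is a hypothetical helper: it returns count upper bounds,
// beginning at start and multiplying by factor at each step.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

// cumulativeCounts returns, for each upper bound, the number of observations
// less than or equal to it. Histogram buckets are cumulative in the same way.
func cumulativeCounts(bounds, observations []float64) []int {
	counts := make([]int, len(bounds))
	for _, o := range observations {
		for i, b := range bounds {
			if o <= b {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	// Made-up reconcile durations in seconds, all faster than 5ms.
	observations := []float64{0.0002, 0.0004, 0.0009, 0.0015, 0.003}

	// Every default bucket reports the same count, so the distribution is invisible.
	fmt.Println(cumulativeCounts(defaultBuckets, observations))

	// Finer bounds (0.25ms doubling up to 8ms) separate the observations.
	fine := exponentialBuckets(0.00025, 2, 6)
	fmt.Println(fine)
	fmt.Println(cumulativeCounts(fine, observations))
}
```

With the default bounds every bucket holds all five observations, while the finer bounds spread them across buckets. The specific start and factor are arbitrary choices for the example.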

It is possible to change the default buckets by modifying DefBuckets in an init function, since my init is called after the DefBuckets variable has been initialized but before the controllermetrics package's init. But this is a very heavy-handed approach, as it changes the defaults of all histograms.

I propose two paths forward:

  1. Allow the user to pass their own metrics to the controller, with the current collectors used if none are provided.
  2. Alternatively, move the current package outside of the internal package. This would allow users to call Unregister and then assign a replacement.

My preference would be the first option, as the change to the system would be more easily understood.
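
A rough sketch of what the first option could look like, using only the standard library. Everything here is hypothetical: the `Options` struct, its `ReconcileTime` field, the `recorder` stand-in, and `reconcileTimeFor` are names invented for illustration and are not part of the existing API. The `Observer` interface only mirrors the `Observe(float64)` method that a Prometheus histogram provides.

```go
package main

import "fmt"

// Observer is the single method this sketch needs from a collector.
// A Prometheus histogram offers a method of the same shape.
type Observer interface {
	Observe(float64)
}

// Options is a hypothetical controller options struct.
type Options struct {
	// ReconcileTime, if non-nil, is used instead of the built-in collector.
	ReconcileTime Observer
}

// recorder is a stand-in collector that remembers what it observed.
type recorder struct {
	seen []float64
}

func (r *recorder) Observe(v float64) { r.seen = append(r.seen, v) }

// defaultReconcileTime stands in for the collector that is created today.
var defaultReconcileTime = &recorder{}

// reconcileTimeFor returns the user's collector when one was provided and
// falls back to the current default otherwise.
func reconcileTimeFor(opts Options) Observer {
	if opts.ReconcileTime != nil {
		return opts.ReconcileTime
	}
	return defaultReconcileTime
}

func main() {
	// A user who cares about bucket layout passes their own collector.
	custom := &recorder{}
	reconcileTimeFor(Options{ReconcileTime: custom}).Observe(0.0004)

	// A user who passes nothing keeps today's behaviour.
	reconcileTimeFor(Options{}).Observe(0.0007)

	fmt.Println(len(custom.seen), len(defaultReconcileTime.seen))
}
```

The appeal of this shape is that existing users see no change, while anyone who needs different buckets can construct their own histogram and hand it in.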

Labels: good first issue, help wanted, lifecycle/stale, priority/awaiting-more-evidence
