WIP Expose ra counters #13895

Draft: wants to merge 14 commits into main (base branch)

@mkuratczyk mkuratczyk commented May 15, 2025

WORK IN PROGRESS

This PR uses the prometheus-support branch of seshat and an export-ra-counters branch of Ra to export Ra counters from RabbitMQ's Prometheus endpoint. For testing purposes, these metrics are available from /metrics/raft, so they can be compared with the old ra_metrics returned from /metrics/per-object.
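
For anyone who wants to do that comparison by hand, here is a minimal sketch (not part of this PR) that diffs the metric names found in two saved scrapes. It assumes both endpoints return the standard Prometheus text format and that the relevant metrics share the `rabbitmq_raft_` prefix seen in the example output further down; the function names `metric_names` and `compare` are made up for illustration.

```python
import re

# Metric name at the start of a Prometheus text-format sample line.
_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(exposition: str, prefix: str = "rabbitmq_raft_") -> set:
    """Collect distinct metric names with the given prefix (assumed prefix)."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _NAME.match(line)
        if match and match.group(1).startswith(prefix):
            names.add(match.group(1))
    return names


def compare(new_text: str, old_text: str) -> dict:
    """Report which metric names appear in only one scrape, or in both."""
    new, old = metric_names(new_text), metric_names(old_text)
    return {
        "only_new": sorted(new - old),
        "only_old": sorted(old - new),
        "both": sorted(new & old),
    }
```

The two inputs could be saved with `curl -s localhost:15692/metrics/raft` and `curl -s localhost:15692/metrics/per-object`, then read from files and passed to `compare`.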

TODO:

  • support for prometheus.return_per_object_metrics = true

@mergify mergify bot added the make label May 15, 2025
@mkuratczyk mkuratczyk force-pushed the expose-ra-counters branch 10 times, most recently from ee03510 to 90103c8 Compare May 22, 2025 09:24
@mkuratczyk mkuratczyk force-pushed the expose-ra-counters branch from 8d84b64 to 35babf6 Compare May 23, 2025 10:22
mkuratczyk added 14 commits May 26, 2025 12:08
We can re-work global counters later. For now, this should be enough for
them to keep working and passing tests.

The _total suffix never made sense...

To be investigated

This is really a bugfix, not specific to Ra counters.

I'd consider this a bugfix: until now, /metrics/per-object
returned more metrics than /metrics when `return_per_object_metrics`
was `true`. I'd expect exactly the same metrics in both cases.

For aggregated metrics, we pick specific metrics (currently
num_segments and commit_latency) and publish only the maximum value,
without labels (`max_` is added to the metric name). For example:

```
> curl -s localhost:15692/metrics/per-object | rg -e ^rabbitmq_raft_num_segments -e ^rabbitmq_raft_commit_latency
rabbitmq_raft_commit_latency_seconds{module="rabbit_khepri",ra_system="coordination"} 0.0
rabbitmq_raft_commit_latency_seconds{queue="qq2",vhost="/"} 0.02
rabbitmq_raft_commit_latency_seconds{queue="qqq-1",vhost="/"} 0.01
rabbitmq_raft_commit_latency_seconds{queue="qqq-2",vhost="/"} 0.0
rabbitmq_raft_num_segments{module="rabbit_khepri",ra_system="coordination"} 1.0
rabbitmq_raft_num_segments{queue="qq2",vhost="/"} 132.0
rabbitmq_raft_num_segments{queue="qqq-2",vhost="/"} 245.0

> curl -s localhost:15692/metrics/ | rg ^rabbitmq_raft_max
rabbitmq_raft_max_commit_latency_seconds 0.02
rabbitmq_raft_max_num_segments 245.0
```
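
To make the aggregation rule concrete, here is a small Python sketch that reproduces it from per-object output. It is an illustration only, not the Erlang code in this PR: the `AGGREGATED` tuple, the `aggregate_max` name, and the regex-based parsing are assumptions, and the regex is a simplification that would break on label values containing `}`.

```python
import re

# name, optional {labels}, then the sample value (simplified parser).
SAMPLE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{[^}]*\})?\s+(?P<value>\S+)"
)

PREFIX = "rabbitmq_raft_"
# Metrics selected for aggregation; names taken from the example output.
AGGREGATED = ("num_segments", "commit_latency_seconds")


def aggregate_max(per_object_text: str) -> dict:
    """Return {aggregated_name: max value}, dropping all labels."""
    maxima = {}
    for line in per_object_text.splitlines():
        match = SAMPLE.match(line.strip())
        if not match or not match["name"].startswith(PREFIX):
            continue  # comments, blank lines, unrelated metrics
        short = match["name"][len(PREFIX):]
        if short not in AGGREGATED:
            continue
        # `max_` goes right after the prefix, as in the output above.
        key = f"{PREFIX}max_{short}"
        value = float(match["value"])
        maxima[key] = max(maxima.get(key, value), value)
    return maxima
```

Feeding it the `/metrics/per-object` lines shown above should give the same two values as the `/metrics/` output: one unlabelled maximum per selected metric.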
@mkuratczyk mkuratczyk force-pushed the expose-ra-counters branch from 35babf6 to 2a6fe6d Compare May 26, 2025 10:08