set up a status page for registry #182

Closed
@matifali

Description

Problem Description

The registry’s status page at coder.instatus.com needs enhancements for SaaS-grade reliability and transparency. Current monitoring relies on a basic 404-checking script. With the registry now deployed on Google Cloud Run and Instatus supporting Prometheus and Google Cloud integrations, this issue focuses on modernizing monitoring, alerting, and user access.


Desired Solution

Integration

  • Add a prominent link to the status page in the registry footer.
  • Update status categories (e.g., uptime, latency) for better user insights. (Nice to have)

Monitoring

  • Use Google Cloud Run metrics to track container health, CPU and memory usage, and latency, and feed that data into Prometheus.
  • Integrate Prometheus with Instatus for real-time metrics (e.g., success rates, errors, latency). (Nice to have)
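
To illustrate the monitoring items above, here is a minimal sketch of how probe results could be rendered in the Prometheus text exposition format so that Prometheus can scrape them. The metric names (`registry_probe_success`, `registry_probe_latency_seconds`), the `ProbeResult` type, and the example URLs are assumptions made for illustration; they are not part of the registry, Cloud Run, or Instatus.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    target: str             # URL that was probed (hypothetical)
    status_code: int        # HTTP status returned by the probe
    latency_seconds: float  # wall-clock time of the request


def render_metrics(results):
    """Render probe results in the Prometheus text exposition format."""
    lines = [
        "# HELP registry_probe_success 1 if the probe got a 2xx/3xx status, else 0.",
        "# TYPE registry_probe_success gauge",
    ]
    for r in results:
        ok = 1 if 200 <= r.status_code < 400 else 0
        lines.append(f'registry_probe_success{{target="{r.target}"}} {ok}')

    lines += [
        "# HELP registry_probe_latency_seconds Time taken by the probe request.",
        "# TYPE registry_probe_latency_seconds gauge",
    ]
    for r in results:
        lines.append(
            f'registry_probe_latency_seconds{{target="{r.target}"}} {r.latency_seconds}'
        )
    # The exposition format expects a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data; a real probe would fill these in from HTTP requests.
    sample = [
        ProbeResult("https://registry.example.com/", 200, 0.12),
        ProbeResult("https://registry.example.com/missing", 404, 0.05),
    ]
    print(render_metrics(sample), end="")
```

Unlike a plain 404 check, gauges like these would let success rate and latency be graphed and alerted on over time. How the output is served or forwarded to Instatus is left open here.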

Alerting

  • Configure multi-channel alerts (email, Slack, PagerDuty) with separate thresholds for high-priority issues (outages) and low-priority issues (e.g., elevated latency).
  • Ensure alerts are logged and visible on the status page for incident transparency.
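
The high/low priority split described above could be sketched roughly as follows. The threshold values, the choice of error rate and p95 latency as inputs, and the channel routing are all assumptions for illustration; the real values would need to be agreed on and configured in whichever alerting tool is adopted.

```python
# Hypothetical thresholds; tune against real traffic before relying on them.
ERROR_RATE_OUTAGE = 0.5      # fraction of failed requests treated as an outage
LATENCY_WARN_SECONDS = 1.0   # p95 latency treated as degraded performance

# Hypothetical routing: outages page on-call, degradations only notify.
CHANNELS = {
    "high": ["pagerduty", "slack", "email"],
    "low": ["slack", "email"],
}


def classify(error_rate, p95_latency_seconds):
    """Return "high", "low", or None (no alert) for the given measurements."""
    if error_rate >= ERROR_RATE_OUTAGE:
        return "high"
    if p95_latency_seconds >= LATENCY_WARN_SECONDS:
        return "low"
    return None


def route(error_rate, p95_latency_seconds):
    """Return the list of channels to notify; empty when nothing is wrong."""
    priority = classify(error_rate, p95_latency_seconds)
    return CHANNELS.get(priority, [])


if __name__ == "__main__":
    print(route(0.9, 0.2))   # outage-level error rate
    print(route(0.01, 2.5))  # healthy error rate, slow responses
    print(route(0.0, 0.1))   # healthy
```

Checking the outage condition first means a failing service is never downgraded to a latency warning.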

Documentation

  • Provide a playbook for handling alerts and troubleshooting incidents.
  • Document Prometheus, Cloud Run, and Instatus setup steps.

Definition of Done

  • Status page displays real-time, accurate health data and is linked in the registry UI.
  • Monitoring (Prometheus + Cloud Run) and alerting pipelines are fully tested.
  • All setup and maintenance documentation is complete and accessible.

Metadata

Labels

docs: Improvements or additions to documentation
