Sports Telemetry Observability Platform
Project : Observability & SRE
Role : Platform Engineer
Stack : Prometheus Operator, Grafana, Loki, Tempo
Designed an observability stack that ingests, stores, and visualises low-latency match data for real-time decision making. Unified metrics, logs, and traces empower engineers and analysts to diagnose incidents in minutes.
- Deployed Prometheus Operator with Helmfile to manage more than 150 scrape configs.
- Integrated Loki and Tempo for correlated log and trace analysis within Grafana.
- Crafted alerting pipelines with Alertmanager and PagerDuty to reduce noise.