The first fix lasted 90 seconds. We had corrected the Grafana datasource URL from prometheus:9999 back to prometheus:9090, watched the pod roll, refreshed the dashboard, and seen one panel come alive. By the time we opened a second tab, the ConfigMap was back to 9999. That was the real incident. The 'No Data' dashboards were a symptom of an observability stack that someone, or something, was actively re-corrupting from at least seven places we had not yet found.
Problem signals:
Grafana dashboards show 'No Data' on every panel after a cluster migration, and fixes applied with kubectl edit revert within 1-3 minutes
Prometheus targets page is empty or stuck on a namespace that no longer exists
ClusterRoleBindings you just recreated reference a ClusterRole name nobody on the team typed
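
When an edit keeps reverting, one quick way to see who is writing the object is to read its metadata.managedFields, which Kubernetes records for each field manager that has modified it. The following is a minimal sketch rather than the tooling used in this incident: the function name last_writers and the SAMPLE object (including the ConfigMap name, namespace, manager name "some-controller", and timestamps) are hypothetical and exist only for illustration.

```python
def last_writers(obj: dict) -> list[tuple[str, str, str]]:
    """Return (time, manager, operation) for each managedFields entry, newest first.

    Timestamps in managedFields are RFC 3339 strings in UTC, so a plain
    string sort orders them chronologically.
    """
    entries = obj.get("metadata", {}).get("managedFields", [])
    rows = [
        (e.get("time", ""), e.get("manager", "?"), e.get("operation", "?"))
        for e in entries
    ]
    return sorted(rows, reverse=True)


# Hypothetical object for illustration only. In practice this would be the
# parsed JSON of the ConfigMap that keeps reverting.
SAMPLE = {
    "kind": "ConfigMap",
    "metadata": {
        "name": "grafana-datasources",  # assumed name, not from the incident
        "namespace": "monitoring",      # assumed namespace
        "managedFields": [
            {
                "manager": "kubectl-edit",
                "operation": "Update",
                "time": "2024-01-01T10:00:00Z",
            },
            {
                "manager": "some-controller",  # hypothetical re-applier
                "operation": "Apply",
                "time": "2024-01-01T10:01:30Z",
            },
        ],
    },
}

for when, manager, operation in last_writers(SAMPLE):
    print(when, manager, operation)
```

To run it against a live object, the input can come from kubectl get configmap <name> -n <namespace> -o json --show-managed-fields, parsed with json.load. If the newest entry is a manager other than your own kubectl session, that manager is a reasonable first suspect for the revert, although the field only tells you which client wrote last, not why.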















