Monitor disk status #224
Comments
I've reconfigured the Grafana Helm chart to export mdadm metrics and push them to Prometheus. These metrics are now available in our Grafana Cloud instance. I've configured a minimal dashboard (based on an existing one tbh). I'm not sure it has everything or is the most convenient layout, but at least it shows the most important information: https://kiwixorg.grafana.net/d/edu6v6ekri77kd/mdadm
I've updated the weekly routine to check this dashboard.
Reopening because I've only done the RAID part; we still need to check SMART status (at least).
And we want to monitor/check non-nodes as well.
My proposal follows.
Every week:
Every month:
Every year (we can probably include it in the monthly routine and say "run it only once a year in January"):
Note: it's pretty easy to automate looping over disks with something like:
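As a rough illustration of what such a loop could look like, here is a minimal Python sketch. It assumes smartmontools is installed and that the script runs with enough privileges to query the disks. The function names (`parse_scan`, `parse_health`, `check_disks`) are hypothetical, and the health lines it looks for are the ones `smartctl -H` usually prints for ATA/NVMe and SCSI devices; other output formats would come back as "unknown" (`None`).

```python
import re
import subprocess

# Health summary lines usually printed by `smartctl -H`:
#   ATA/NVMe: "SMART overall-health self-assessment test result: PASSED"
#   SCSI/SAS: "SMART Health Status: OK"
HEALTH_RE = re.compile(
    r"^SMART (?:overall-health self-assessment test result|Health Status):\s*(\S+)",
    re.MULTILINE,
)


def parse_scan(output):
    """Return device paths from `smartctl --scan` output (first field of each line)."""
    devices = []
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/dev/"):
            devices.append(fields[0])
    return devices


def parse_health(output):
    """Return True if healthy, False if not, None if no health line was found."""
    match = HEALTH_RE.search(output)
    if match is None:
        return None
    return match.group(1) in ("PASSED", "OK")


def check_disks():
    """Run `smartctl -H` on every scanned device and map device -> health."""
    scan = subprocess.run(
        ["smartctl", "--scan"], capture_output=True, text=True, check=True
    )
    results = {}
    for device in parse_scan(scan.stdout):
        # smartctl's exit status is a bit mask, so a non-zero code is not
        # treated as a failure to run; only the printed health line is used.
        proc = subprocess.run(
            ["smartctl", "-H", device], capture_output=True, text=True
        )
        results[device] = parse_health(proc.stdout)
    return results
```

Calling `check_disks()` returns a dict such as device path to `True`/`False`/`None`, which is enough for a manual routine check. This sketch drops the `-d` device type that `smartctl --scan` reports, so disks behind a hardware RAID controller would likely need that option passed through.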
At the moment, we don't monitor machine disk status: RAID array status, SMART status, etc.
We don't want/need to integrate it into Grafana or even automatically upload information, but we can at least add some checks to the routines so we're not completely blind to such problems should they occur.
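For the RAID part, a routine check could be as simple as reading `/proc/mdstat` and flagging arrays that are inactive or missing members. The sketch below is only an illustration under that assumption: it targets Linux software RAID (md), the function names (`degraded_arrays`, `read_mdstat`) are hypothetical, and it relies on the usual `[total/working] [UU]` status layout of `/proc/mdstat`.

```python
import re

# Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0]"
ARRAY_RE = re.compile(r"^(md\S+)\s*:\s*(\S+)")
# Member status on the following line, e.g. "[2/2] [UU]" or "[2/1] [U_]"
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return names of md arrays that are inactive or have missing members."""
    problems = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            if header.group(2) != "active":
                problems.append(current)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            total, working = int(status.group(1)), int(status.group(2))
            # "_" marks a missing or failed member in the status map.
            if working < total or "_" in status.group(3):
                if current not in problems:
                    problems.append(current)
    return problems


def read_mdstat(path="/proc/mdstat"):
    """Read the kernel's md status file (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

On a machine with md arrays, `degraded_arrays(read_mdstat())` returning an empty list means no array looked inactive or degraded at the time of the check. Arrays without a member status map (for example RAID 0) are only checked for the `active` state.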