Add alert if an OpenSearch scrape fails #507

Merged: 2 commits, Nov 27, 2024

Commits on Nov 26, 2024

  1. Add alert if an OpenSearch scrape fails

    If a scrape fails, this might indicate that a unit is not in
    a healthy state.

    OpenSearch currently has no metric that reports a node as down.
    For example, if the systemd service is stopped on one node, the
    cluster (N nodes) drops the faulty node because of connectivity
    issues, and the metrics show that the cluster now has N-1 nodes
    without indicating that a node has failed.

    With this new alert, a notification will at least appear if a
    node stops being responsive.
    gabrielcocenza committed Nov 26, 2024
    454d363

Commits on Nov 27, 2024

  1. remove filter

    grafana-agent already injects the Juju topology, so it is not
    necessary to filter by job or application.
    gabrielcocenza committed Nov 27, 2024
    63c451a
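
For illustration only, a Prometheus alerting rule of the kind these two commits describe might look like the sketch below. It is not the rule from this PR: the group name, alert name, `for` duration, severity label, and annotation text are all assumptions. The only parts grounded in the commit messages are that the alert fires when a scrape fails, and that the expression carries no job or application matcher because grafana-agent is said to inject the Juju topology. The sketch relies on Prometheus's built-in `up` metric, which is `0` for a target whose last scrape failed.

```yaml
groups:
  - name: opensearch-scrape            # hypothetical group name
    rules:
      - alert: OpenSearchScrapeFailed  # hypothetical alert name
        # `up` is set by Prometheus per scrape target: 1 on success, 0 on failure.
        # No job/application matcher here, on the assumption that Juju topology
        # labels are injected by grafana-agent.
        expr: up == 0
        for: 5m                        # assumed grace period before firing
        labels:
          severity: critical           # assumed severity
        annotations:
          summary: "Scrape failed for {{ $labels.instance }}"
          description: >
            The metrics endpoint on {{ $labels.instance }} could not be scraped.
            The unit may be unhealthy or the OpenSearch service may be stopped.
```

Because the rule evaluates per scrape target, it would flag the specific unresponsive node, which the cluster-level node count (N-1) alone does not identify.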