Add a new metric named ntp_server_reachable
Transform the high drift from a constant to a parameter
Complete the README

add metric

Add metric

add metric to the collector.
Send it every time the server was reached successfully at least once

correct matric name

correct metric name

make the highDrift constant a parameter

add the possibility to configure the highDrift threshold in http mode
Add doc

delete log statement + correction after linting

Change from 1 to 0 for the metric value

After discussing with the SAPCC team, it seems better to have a 0 there. We should either make our NTP server more forgiving or reduce the measurement duration if the NTP server closes our connection.

Run go-makefile-maker

Renovate: Update github.com/sapcc/go-bits digest to f061229

Run go-makefile-maker

Renovate: Update github.com/sapcc/go-bits digest to 364c083
Rafouf69 committed Mar 13, 2024
1 parent 8f538da commit 61b329c
Showing 67 changed files with 1,097 additions and 284 deletions.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
FROM golang:1.22.0-alpine3.19 as builder
FROM golang:1.22.1-alpine3.19 as builder

RUN apk add --no-cache --no-progress ca-certificates gcc git make musl-dev

2 changes: 1 addition & 1 deletion Makefile
@@ -63,7 +63,7 @@ run-golangci-lint: FORCE prepare-static-check

build/cover.out: FORCE | build
@printf "\e[1;36m>> Running tests\e[0m\n"
@env $(GO_TESTENV) go test -shuffle=on $(GO_BUILDFLAGS) -ldflags '-s -w -X github.com/sapcc/go-api-declarations/bininfo.binName=ntp_exporter -X github.com/sapcc/go-api-declarations/bininfo.version=$(BININFO_VERSION) -X github.com/sapcc/go-api-declarations/bininfo.commit=$(BININFO_COMMIT_HASH) -X github.com/sapcc/go-api-declarations/bininfo.buildDate=$(BININFO_BUILD_DATE) $(GO_LDFLAGS)' -p 1 -coverprofile=$@ -covermode=count -coverpkg=$(subst $(space),$(comma),$(GO_COVERPKGS)) $(GO_TESTPKGS)
@env $(GO_TESTENV) go test -shuffle=on -p 1 -coverprofile=$@ $(GO_BUILDFLAGS) -ldflags '-s -w -X github.com/sapcc/go-api-declarations/bininfo.binName=ntp_exporter -X github.com/sapcc/go-api-declarations/bininfo.version=$(BININFO_VERSION) -X github.com/sapcc/go-api-declarations/bininfo.commit=$(BININFO_COMMIT_HASH) -X github.com/sapcc/go-api-declarations/bininfo.buildDate=$(BININFO_BUILD_DATE) $(GO_LDFLAGS)' -covermode=count -coverpkg=$(subst $(space),$(comma),$(GO_COVERPKGS)) $(GO_TESTPKGS)

build/cover.html: build/cover.out
@printf "\e[1;36m>> go tool cover > build/cover.html\e[0m\n"
12 changes: 11 additions & 1 deletion README.md
@@ -18,6 +18,7 @@ These are the metrics supported.
- `ntp_precision_seconds`
- `ntp_leap`
- `ntp_scrape_duration_seconds`
- `ntp_server_reachable`

As an alternative to [the node-exporter's `time` module](https://github.com/prometheus/node_exporter/blob/master/docs/TIME.md), this exporter does not require an NTP component on localhost that it can talk to. We only look at the system clock and talk to the configured NTP server(s).

@@ -60,12 +61,20 @@ and connection options is defined by command-line options:
```
-ntp.measurement-duration duration
Duration of measurements in case of high (>10ms) drift. (default 30s)
-ntp.high-drift duration
High drift threshold. (default 10ms)
-ntp.protocol-version int
NTP protocol version to use. (default 4)
-ntp.server string
NTP server to use (required).
```

Command-line usage example:

```sh
ntp_exporter -ntp.server ntp.example.com -web.telemetry-path "/probe" -ntp.measurement-duration "5s" -ntp.high-drift "50ms"
```

### Mode 2: Variable NTP server

When the option `-ntp.source http` is specified, the NTP server and connection
@@ -75,11 +84,12 @@ request:
- `target`: NTP server to use
- `protocol`: NTP protocol version (2, 3 or 4)
- `duration`: duration of measurements in case of high drift
- `high-drift`: High drift threshold to trigger multiple probing

For example:

```sh
$ curl 'http://localhost:9559/metrics?target=ntp.example.com&protocol=4&duration=10s'
$ curl 'http://localhost:9559/metrics?target=ntp.example.com&protocol=4&duration=10s&high-drift=100ms'
```
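
For callers that build the scrape URL programmatically, a minimal sketch using Go's `net/url` package might look like the following. `probeURL` is a hypothetical helper introduced only for this example; the host, port and parameter values are taken from the `curl` line above.

```go
package main

import (
	"fmt"
	"net/url"
)

// probeURL is a hypothetical helper that assembles a scrape URL for the
// exporter's HTTP mode from the four query parameters listed above.
func probeURL(base, target, protocol, duration, highDrift string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	q.Set("target", target)
	q.Set("protocol", protocol)
	q.Set("duration", duration)
	q.Set("high-drift", highDrift)
	u.RawQuery = q.Encode() // Encode sorts the parameters by key
	return u.String(), nil
}

func main() {
	s, err := probeURL("http://localhost:9559/metrics", "ntp.example.com", "4", "10s", "100ms")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

The parameter order in the resulting URL differs from the `curl` example because `Values.Encode` sorts by key; query parameter order should not matter to the exporter.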

## Frequently asked questions (FAQ)
23 changes: 18 additions & 5 deletions collector.go
@@ -30,11 +30,12 @@ import (
"github.com/prometheus/client_golang/prometheus"
)

func CollectorInitial(target string, protocol int, duration time.Duration) Collector {
func CollectorInitial(target string, protocol int, duration, highDrift time.Duration) Collector {
return Collector{
NtpServer: target,
NtpProtocolVersion: protocol,
NtpMeasurementDuration: duration,
NtpHighDrift: highDrift,
buildInfo: prometheus.NewGaugeFunc(prometheus.GaugeOpts{
Namespace: "ntp",
Name: "build_info",
@@ -91,6 +92,11 @@ func CollectorInitial(target string, protocol int, duration time.Duration) Colle
Name: "scrape_duration_seconds",
Help: "ntp_exporter: Duration of a scrape job.",
}),
serverReachable: prometheus.NewGaugeVec(prometheus.GaugeOpts{
Namespace: "ntp",
Name: "server_reachable",
Help: "True if the NTP server is reachable by the NTP exporter.",
}, []string{"server"}),
}
}

@@ -99,6 +105,7 @@ type Collector struct {
NtpServer string
NtpProtocolVersion int
NtpMeasurementDuration time.Duration
NtpHighDrift time.Duration
buildInfo prometheus.GaugeFunc
stratum *prometheus.GaugeVec
drift *prometheus.GaugeVec
@@ -110,6 +117,7 @@ type Collector struct {
precision *prometheus.GaugeVec
leap *prometheus.GaugeVec
scrapeDuration prometheus.Summary
serverReachable *prometheus.GaugeVec
}

// A single measurement returned by ntp server
@@ -138,11 +146,15 @@ func (c Collector) Describe(ch chan<- *prometheus.Desc) {
c.precision.Describe(ch)
c.leap.Describe(ch)
c.scrapeDuration.Describe(ch)
c.serverReachable.Describe(ch)
}

// Collect implements the prometheus.Collector interface.
func (c Collector) Collect(ch chan<- prometheus.Metric) {
err := c.measure()

c.serverReachable.Collect(ch)

//only report data when measurement was successful
if err == nil {
c.buildInfo.Collect(ch)
@@ -163,17 +175,16 @@ func (c Collector) Collect(ch chan<- prometheus.Metric) {
}

func (c Collector) measure() error {
const highDrift = 0.01

begin := time.Now()
measurement, err := c.getClockOffsetAndStratum()

if err != nil {
c.serverReachable.WithLabelValues(c.NtpServer).Set(0)
return fmt.Errorf("couldn't get NTP measurement: %w", err)
}

//if clock drift is unusually high (e.g. >10ms): repeat measurements for 30 seconds and submit median value
if measurement.clockOffset > highDrift {
if measurement.clockOffset > c.NtpHighDrift.Seconds() {
//arrays of measurements used to calculate median
var measurementsClockOffset []float64
var measurementsStratum []float64
@@ -185,10 +196,11 @@ func (c Collector) measure() error {
var measurementsPrecision []float64
var measurementsLeap []float64

log.Printf("WARN: clock drift is above %.2fs, taking multiple measurements for %.2f seconds", highDrift, c.NtpMeasurementDuration.Seconds())
log.Printf("WARN: clock drift is above %.3fs, taking multiple measurements for %.2f seconds", c.NtpHighDrift.Seconds(), c.NtpMeasurementDuration.Seconds())
for time.Since(begin) < c.NtpMeasurementDuration {
nextMeasurement, err := c.getClockOffsetAndStratum()
if err != nil {
c.serverReachable.WithLabelValues(c.NtpServer).Set(0)
return fmt.Errorf("couldn't get NTP measurement: %w", err)
}

@@ -223,6 +235,7 @@ func (c Collector) measure() error {
c.rootDistance.WithLabelValues(c.NtpServer).Set(measurement.rootDistance)
c.precision.WithLabelValues(c.NtpServer).Set(measurement.precision)
c.leap.WithLabelValues(c.NtpServer).Set(measurement.leap)
c.serverReachable.WithLabelValues(c.NtpServer).Set(1)

c.scrapeDuration.Observe(time.Since(begin).Seconds())
return nil
14 changes: 7 additions & 7 deletions go.mod
@@ -4,19 +4,19 @@ go 1.22

require (
github.com/beevik/ntp v1.3.1
github.com/prometheus/client_golang v1.18.0
github.com/prometheus/client_golang v1.19.0
github.com/sapcc/go-api-declarations v1.10.9
github.com/sapcc/go-bits v0.0.0-20240222221204-90b493ffdee9
github.com/sapcc/go-bits v0.0.0-20240307080654-364c083fcdf1
go.uber.org/automaxprocs v1.5.3
)

require (
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash/v2 v2.2.0 // indirect
github.com/prometheus/client_model v0.5.0 // indirect
github.com/prometheus/common v0.47.0 // indirect
github.com/prometheus/client_model v0.6.0 // indirect
github.com/prometheus/common v0.49.0 // indirect
github.com/prometheus/procfs v0.12.0 // indirect
golang.org/x/net v0.20.0 // indirect
golang.org/x/sys v0.16.0 // indirect
google.golang.org/protobuf v1.32.0 // indirect
golang.org/x/net v0.21.0 // indirect
golang.org/x/sys v0.17.0 // indirect
google.golang.org/protobuf v1.33.0 // indirect
)
28 changes: 14 additions & 14 deletions go.sum
@@ -12,29 +12,29 @@ github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZb
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/prashantv/gostub v1.1.0 h1:BTyx3RfQjRHnUWaGF9oQos79AlQ5k8WNktv7VGvVH4g=
github.com/prashantv/gostub v1.1.0/go.mod h1:A5zLQHz7ieHGG7is6LLXLz7I8+3LZzsrV0P1IAHhP5U=
github.com/prometheus/client_golang v1.18.0 h1:HzFfmkOzH5Q8L8G+kSJKUx5dtG87sewO+FoDDqP5Tbk=
github.com/prometheus/client_golang v1.18.0/go.mod h1:T+GXkCk5wSJyOqMIzVgvvjFDlkOQntgjkJWKrN5txjA=
github.com/prometheus/client_model v0.5.0 h1:VQw1hfvPvk3Uv6Qf29VrPF32JB6rtbgI6cYPYQjL0Qw=
github.com/prometheus/client_model v0.5.0/go.mod h1:dTiFglRmd66nLR9Pv9f0mZi7B7fk5Pm3gvsjB5tr+kI=
github.com/prometheus/common v0.47.0 h1:p5Cz0FNHo7SnWOmWmoRozVcjEp0bIVU8cV7OShpjL1k=
github.com/prometheus/common v0.47.0/go.mod h1:0/KsvlIEfPQCQ5I2iNSAWKPZziNCvRs5EC6ILDTlAPc=
github.com/prometheus/client_golang v1.19.0 h1:ygXvpU1AoN1MhdzckN+PyD9QJOSD4x7kmXYlnfbA6JU=
github.com/prometheus/client_golang v1.19.0/go.mod h1:ZRM9uEAypZakd+q/x7+gmsvXdURP+DABIEIjnmDdp+k=
github.com/prometheus/client_model v0.6.0 h1:k1v3CzpSRUTrKMppY35TLwPvxHqBu0bYgxZzqGIgaos=
github.com/prometheus/client_model v0.6.0/go.mod h1:NTQHnmxFpouOD0DpvP4XujX3CdOAGQPoaGhyTchlyt8=
github.com/prometheus/common v0.49.0 h1:ToNTdK4zSnPVJmh698mGFkDor9wBI/iGaJy5dbH1EgI=
github.com/prometheus/common v0.49.0/go.mod h1:Kxm+EULxRbUkjGU6WFsQqo3ORzB4tyKvlWFOE9mB2sE=
github.com/prometheus/procfs v0.12.0 h1:jluTpSng7V9hY0O2R9DzzJHYb2xULk9VTR1V1R/k6Bo=
github.com/prometheus/procfs v0.12.0/go.mod h1:pcuDEFsWDnvcgNzo4EEweacyhjeA9Zk3cnaOZAZEfOo=
github.com/sapcc/go-api-declarations v1.10.9 h1:k+F3W0FTyYLazII4hdFaWLJ7L+MQRcFBD9I9P/hUNBs=
github.com/sapcc/go-api-declarations v1.10.9/go.mod h1:83R3hTANhuRXt/pXDby37IJetw8l7DG41s33Tp9NXxI=
github.com/sapcc/go-bits v0.0.0-20240222221204-90b493ffdee9 h1:rbg802tNP3trhxlzQCAQHHnd1UAfjiSB9UR7JqdRxTU=
github.com/sapcc/go-bits v0.0.0-20240222221204-90b493ffdee9/go.mod h1:w+u3Y/Dt7/1F/Km1skkratCVrexj5o0zes1Ds7aJFhE=
github.com/sapcc/go-bits v0.0.0-20240307080654-364c083fcdf1 h1:GpA29LYGZTmX279lMWyZTZhGCNnY4PI/ssjFcElUgnw=
github.com/sapcc/go-bits v0.0.0-20240307080654-364c083fcdf1/go.mod h1:/EltzUoec+W/6oEtfsvbPDs5vqgUzdfLwUFSxeL1WHg=
github.com/sergi/go-diff v1.3.1 h1:xkr+Oxo4BOQKmkn/B9eMK0g5Kg/983T9DqqPHwYqD+8=
github.com/sergi/go-diff v1.3.1/go.mod h1:aMJSSKb2lpPvRNec0+w3fl7LP9IOFzdc9Pa4NFbPK1I=
github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk=
github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
go.uber.org/automaxprocs v1.5.3 h1:kWazyxZUrS3Gs4qUpbwo5kEIMGe/DAvi5Z4tl2NW4j8=
go.uber.org/automaxprocs v1.5.3/go.mod h1:eRbA25aqJrxAbsLO0xy5jVwPt7FQnRgjW+efnwa1WM0=
golang.org/x/net v0.20.0 h1:aCL9BSgETF1k+blQaYUBx9hJ9LOGP3gAVemcZlf1Kpo=
golang.org/x/net v0.20.0/go.mod h1:z8BVo6PvndSri0LbOE3hAn0apkU+1YvI6E70E9jsnvY=
golang.org/x/sys v0.16.0 h1:xWw16ngr6ZMtmxDyKyIgsE93KNKz5HKmMa3b8ALHidU=
golang.org/x/sys v0.16.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
google.golang.org/protobuf v1.32.0 h1:pPC6BG5ex8PDFnkbrGU3EixyhKcQ2aDuBS36lqK/C7I=
google.golang.org/protobuf v1.32.0/go.mod h1:c6P6GXX6sHbq/GpV6MGZEdwhWPcYBgnhAHhKbcUYpos=
golang.org/x/net v0.21.0 h1:AQyQV4dYCvJ7vGmJyKki9+PBdyvhkSd8EIx/qb0AYv4=
golang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44=
golang.org/x/sys v0.17.0 h1:25cE3gD+tdBA7lp7QfhuV+rJiE9YXTcS3VG1SqssI/Y=
golang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
google.golang.org/protobuf v1.33.0 h1:uNO2rsAINq/JlFpSdYEKIZ0uKD/R9cpdv0T+yoGwGmI=
google.golang.org/protobuf v1.33.0/go.mod h1:c6P6GXX6sHbq/GpV6MGZEdwhWPcYBgnhAHhKbcUYpos=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
12 changes: 11 additions & 1 deletion main.go
@@ -45,6 +45,7 @@ var (
ntpServer string
ntpProtocolVersion int
ntpMeasurementDuration time.Duration
ntpHighDrift time.Duration
ntpSource string
)

@@ -86,6 +87,7 @@ func init() {
flag.StringVar(&ntpServer, "ntp.server", "", "NTP server to use (required).")
flag.IntVar(&ntpProtocolVersion, "ntp.protocol-version", 4, "NTP protocol version to use.")
flag.DurationVar(&ntpMeasurementDuration, "ntp.measurement-duration", 30*time.Second, "Duration of measurements in case of high (>10ms) drift.")
flag.DurationVar(&ntpHighDrift, "ntp.high-drift", 10*time.Millisecond, "High drift threshold.")
flag.StringVar(&ntpSource, "ntp.source", "cli", "source of information about ntp server (cli / http).")
flag.Parse()
}
@@ -96,6 +98,7 @@ func handlerMetrics(w http.ResponseWriter, r *http.Request) {
s := ntpServer
p := ntpProtocolVersion
d := ntpMeasurementDuration
hd := ntpHighDrift

if ntpSource == "http" {
for _, i := range []string{"target", "protocol", "duration"} {
@@ -123,10 +126,17 @@ func handlerMetrics(w http.ResponseWriter, r *http.Request) {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}

if u, err := time.ParseDuration(r.URL.Query().Get("high-drift")); err == nil {
hd = u
} else {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
}

registry := prometheus.NewRegistry()
registry.MustRegister(CollectorInitial(s, p, d))
registry.MustRegister(CollectorInitial(s, p, d, hd))
h := promhttp.HandlerFor(registry, promhttp.HandlerOpts{ErrorLog: logger})
h.ServeHTTP(w, r)
}