chore(otel): Open Telemetry metrics fixed and provided with demo example (gnolang#3038)

* Redefine and clean up Open Telemetry metrics types and usage
* Overhauled demo example, adding a minimal sample of RPC + Validator Node
sw360cab authored Nov 21, 2024
1 parent dc65f91 commit 549da06
Showing 11 changed files with 262 additions and 281 deletions.
30 changes: 0 additions & 30 deletions gno.land/pkg/sdk/vm/handler.go
@@ -1,17 +1,12 @@
package vm

import (
"context"
"fmt"
"strings"

abci "github.com/gnolang/gno/tm2/pkg/bft/abci/types"
"github.com/gnolang/gno/tm2/pkg/sdk"
"github.com/gnolang/gno/tm2/pkg/std"
"github.com/gnolang/gno/tm2/pkg/telemetry"
"github.com/gnolang/gno/tm2/pkg/telemetry/metrics"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/metric"
)

type vmHandler struct {
@@ -107,34 +102,9 @@ func (vh vmHandler) Query(ctx sdk.Context, req abci.RequestQuery) abci.ResponseQ
secondPart(req.Path), req.Path)))
}

// Log the telemetry
logQueryTelemetry(path, res.IsErr())

return res
}

// logQueryTelemetry logs the relevant VM query telemetry
func logQueryTelemetry(path string, isErr bool) {
if !telemetry.MetricsEnabled() {
return
}

metrics.VMQueryCalls.Add(
context.Background(),
1,
metric.WithAttributes(
attribute.KeyValue{
Key: "path",
Value: attribute.StringValue(path),
},
),
)

if isErr {
metrics.VMQueryErrors.Add(context.Background(), 1)
}
}

// queryPackage fetch a package's files.
func (vh vmHandler) queryPackage(ctx sdk.Context, req abci.RequestQuery) (res abci.ResponseQuery) {
res.Data = []byte(fmt.Sprintf("TODO: parse parts get or make fileset..."))
35 changes: 20 additions & 15 deletions misc/telemetry/README.md
@@ -1,4 +1,4 @@
## Overview
# Open Telemetry overview

The purpose of this Telemetry documentation is to showcase the different node metrics exposed by the Gno node through
OpenTelemetry, without having to do extraneous setup.
@@ -8,39 +8,44 @@ The containerized setup is the following:
- Grafana dashboard
- Prometheus
- OpenTelemetry collector (separate service that needs to run)
- Single Gnoland node, with 1s block times and configured telemetry (enabled)
- 1 RPC Gnoland node, with 1s block times and configured telemetry (enabled)
- 1 Validator Gnoland node, with 1s block times and configured telemetry (enabled)
- Supernova process that simulates load periodically (generates network traffic)

## Metrics type

The collected metrics are defined in the codebase at `tm2/pkg/telemetry/metrics/metrics.go`.
They are gathered by the OTEL collector, which forwards them to Prometheus.

There are three metric types. In Grafana, each type is queried by appending a type-specific suffix to the metric name:

- Histogram ("_sum", "_count", "_bucket"): Collects the distribution of values over time
- Gauge: Measures a single value at the time it is read
- Counter ("_total"): A value that accumulates over time
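
As a rough illustration of the suffix convention listed above, the following Go sketch maps a metric name and type to the series names one would expect to query in Grafana. The `MetricKind` type, the `seriesNames` function, and the `example_metric` name are hypothetical and not part of the Gno codebase; the sketch also assumes that a gauge is exposed without any suffix, since none is listed for it.

```go
package main

import "fmt"

// MetricKind is a hypothetical enum for the three metric types
// described in the README; it does not exist in the Gno codebase.
type MetricKind int

const (
	Histogram MetricKind = iota
	Gauge
	Counter
)

// seriesNames returns the series names expected for a metric,
// following the suffix convention from the README.
func seriesNames(base string, kind MetricKind) []string {
	switch kind {
	case Histogram:
		return []string{base + "_sum", base + "_count", base + "_bucket"}
	case Counter:
		return []string{base + "_total"}
	default:
		// Gauge: assumed to be exposed under its plain name.
		return []string{base}
	}
}

func main() {
	// "example_metric" is a placeholder, not a real Gno metric name.
	for _, name := range seriesNames("example_metric", Histogram) {
		fmt.Println(name)
	}
}
```

When building a Grafana panel, this means a histogram is typically queried through one of its three suffixed series, while a counter is queried through its `_total` series.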

## Starting the containers

### Step 1: Spinning up Docker

Make sure you have Docker installed and running on your system. After that, within the `misc/telemetry` folder run the
following command:

```shell
```bash
make up
```

This will build the required Docker images for this simulation and start the services.

### Step 2: Open Grafana

When you've verified that the `telemetry` containers are up and running, head on over to http://localhost:3000 to open
When you've verified that the `telemetry` containers are up and running, head on over to <http://localhost:3000> to open
the Grafana dashboard.

Default login details:

```
username: admin
password: admin
```

After you've logged in (you can skip setting a new password), on the left hand side, click on
`Dashboards -> Gno -> Gno Node Metrics`:
After you've logged in, on the left hand side, click on
`Dashboards -> Gno -> Gno Open Telemetry Metrics`:
![Grafana](assets/grafana-1.jpeg)

This will open up the predefined Gno Metrics dashboards (added for ease of use) :
This will open up the predefined Gno Metrics dashboards (added for ease of use):
![Metrics Dashboard](assets/grafana-2.jpeg)

Periodically, these metrics will be updated as the `supernova` process is simulating network traffic.
@@ -53,4 +58,4 @@ To stop the cluster, you can run:
make down
```

which will stop the Docker containers. Additionally, you can delete the Docker volumes with `make clean`.
which will stop the Docker containers. Additionally, you can delete the Docker volumes with `make clean`.
91 changes: 74 additions & 17 deletions misc/telemetry/docker-compose.yml
@@ -9,6 +9,7 @@ services:
- ./collector/collector.yaml:/etc/otelcol-contrib/config.yaml
networks:
- gnoland-net

prometheus:
image: prom/prometheus:latest
command:
@@ -21,34 +22,90 @@ services:
- ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
networks:
- gnoland-net

grafana:
image: grafana/grafana-enterprise
image: grafana/grafana
environment:
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
volumes:
- grafana_data:/var/lib/grafana
- ./grafana/datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml
- ./grafana/dashboards.yaml:/etc/grafana/provisioning/dashboards/dashboards.yaml
- ./grafana/gno-dashboards.json:/var/lib/grafana/dashboards/gno-dashboards.json
- ./grafana/provisioning:/etc/grafana/provisioning
ports:
- "3000:3000"
networks:
- gnoland-net
gnoland:
build:
context: ./gnoland
dockerfile: Dockerfile
ports:
- "26657:26657"

gnoland-val:
image: ghcr.io/gnolang/gno/gnoland:master
networks:
- gnoland-net
volumes:
# Shared Volume
- gnoland-shared:/gnoroot/shared-data
entrypoint:
- sh
- -c
# Recreate gno genesis from git :(
- |
gnoland secrets init
rm -f /gnoroot/shared-data/node_p2p.id
apk add git make go linux-headers
git clone https://github.com/gnolang/gno.git --single-branch gnoland-src
GOPATH='/usr/' make -C gnoland-src/contribs/gnogenesis/
gnogenesis generate
gnogenesis validator add -name val000 -address $(gnoland secrets get validator_key.address -raw) -pub-key $(gnoland secrets get validator_key.pub_key -raw)
gnogenesis balances add -balance-sheet /gnoroot/gno.land/genesis/genesis_balances.txt
gnogenesis txs add packages /gnoroot/examples/gno.land
gnoland config init
gnoland config set consensus.timeout_commit 1s
gnoland config set moniker val000
gnoland config set telemetry.enabled true
gnoland config set telemetry.exporter_endpoint collector:4317
gnoland config set telemetry.service_instance_id val0
gnoland secrets get node_id.id -raw > /gnoroot/shared-data/node_p2p.id
cp /gnoroot/genesis.json /gnoroot/shared-data/genesis.json
gnoland start
healthcheck:
test: ["CMD-SHELL", "test -f /gnoroot/shared-data/node_p2p.id || exit 1"]
interval: 10s
timeout: 5s
retries: 5
start_period: 60s

gnoland-rpc:
image: ghcr.io/gnolang/gno/gnoland:master
networks:
- gnoland-net
volumes:
# Shared Volume
- gnoland-shared:/gnoroot/shared-data
entrypoint:
- sh
- -c
- |
gnoland secrets init
gnoland config init
gnoland config set consensus.timeout_commit 1s
gnoland config set moniker rpc0
gnoland config set rpc.laddr tcp://0.0.0.0:26657
gnoland config set telemetry.enabled true
gnoland config set telemetry.service_instance_id rpc000
gnoland config set telemetry.exporter_endpoint collector:4317
gnoland config set p2p.persistent_peers "$(cat /gnoroot/shared-data/node_p2p.id)@gnoland-val:26656"
gnoland start -genesis /gnoroot/shared-data/genesis.json
depends_on:
gnoland-val:
condition: service_healthy
restart: true

supernova:
build:
dockerfile: supernova.Dockerfile
args:
supernova_version: v1.2.1
image: ghcr.io/gnolang/supernova:1.3.1
command: >
-sub-accounts 10 -transactions 200 -url http://gnoland:26657
-sub-accounts 10 -transactions 100 -url http://gnoland-rpc:26657
-mnemonic "source bonus chronic canvas draft south burst lottery vacant surface solve popular case indicate oppose farm nothing bullet exhibit title speed wink action roast"
restart: always
-mode PACKAGE_DEPLOYMENT
restart: unless-stopped
networks:
- gnoland-net

@@ -61,5 +118,5 @@ volumes:
driver: local
grafana_data:
driver: local
gnoland:
gnoland-shared:
driver: local
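
The RPC node in the compose file above builds its `p2p.persistent_peers` value from the node ID that the validator writes to the shared volume, in the form `<node id>@gnoland-val:26656`. A minimal Go sketch of that string construction follows; the `persistentPeer` helper and the `abc123` ID are hypothetical and only mirror what the shell expression `$(cat /gnoroot/shared-data/node_p2p.id)@gnoland-val:26656` does.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// persistentPeer builds a "<node id>@<host>:<port>" peer entry from the raw
// contents of the shared node_p2p.id file. It is a hypothetical helper that
// mirrors the shell substitution used in the compose entrypoint.
func persistentPeer(rawID, host string, port int) (string, error) {
	// Shell command substitution drops trailing newlines; trimming
	// whitespace gives the same result for a single-line ID file.
	id := strings.TrimSpace(rawID)
	if id == "" {
		return "", errors.New("empty node id")
	}
	return fmt.Sprintf("%s@%s:%d", id, host, port), nil
}

func main() {
	// "abc123" is a placeholder; a real ID comes from
	// `gnoland secrets get node_id.id -raw` on the validator.
	peer, err := persistentPeer("abc123\n", "gnoland-val", 26656)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(peer)
}
```

The empty-ID check reflects why the compose file gates the RPC node on the validator's healthcheck: until `node_p2p.id` exists on the shared volume, there is no peer address to configure.
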
13 changes: 0 additions & 13 deletions misc/telemetry/gnoland/Dockerfile

This file was deleted.

19 changes: 0 additions & 19 deletions misc/telemetry/gnoland/setup.sh

This file was deleted.

@@ -5,4 +5,4 @@ providers:
folder: Gno
type: file
options:
path: /var/lib/grafana/dashboards
path: /etc/grafana/provisioning/dashboards