An OTP application that plugs into Riak TS to make it behave like a Graphite storage node.
Download and install Riak TS from http://docs.basho.com/riak/ts/latest/downloads/ . Tested with version 1.3.0.
Create a Riak TS table to store the graphite metrics:
CREATE TABLE metrics (
    family varchar not null,
    series varchar not null,
    time timestamp not null,
    metric double,
    PRIMARY KEY (
        (family, series, quantum(time, 1, 'd')),
        family, series, time
    )
)
The family field always has the value graphite. The series field contains the metric name, and the metric field contains the value.
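To illustrate how one Graphite data point could map onto this table, here is a minimal Python sketch. The function name `line_to_row` is hypothetical, and the conversion from seconds to milliseconds is an assumption (Riak TS timestamp columns hold millisecond epochs); it is not taken from the application's source.

```python
def line_to_row(line, family="graphite"):
    """Turn one Graphite plaintext line into a (family, series, time, metric) row."""
    name, value, timestamp = line.split()
    # Graphite sends epoch seconds; the conversion to milliseconds is an assumption.
    return (family, name, int(float(timestamp)) * 1000, float(value))


row = line_to_row("servers.web01.cpu.load 0.42 1465839830")
print(row)
```

The resulting tuple follows the column order of the table definition: family, series, time, metric.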
Build the application with rebar3:
$ rebar3 compile
Enable Riak Search (see [the documentation](https://docs.basho.com/riak/kv/2.1.4/configuring/search/#enabling-riak-search)). In riak.conf, set: search = on
Then restart Riak TS. Search should now be enabled, and you can create and set up an index:
Create search index:
PUT /search/index/metric_names_index
Associate index with bucket:
PUT /buckets/bucket/props -d '{"props":{"search_index":"metric_names_index"}}'
We also need set CRDTs, so create a bucket type for them:
riak-admin bucket-type create sets '{"props":{"datatype":"set"}}'
A bucket type must be activated before it can be used:
riak-admin bucket-type activate sets
Edit advanced.config to add the path where the sources were built and to start the application:
[
    {riak_kv, [
        {add_paths, ["/path/to/ebin"]}
    ]},
    {vm_args, [
        {'-s graphite_riakts_app', ""}
    ]}
].
Create and edit /etc/riak/graphite_riakts.conf (the path is currently hardcoded). The configuration fields and their default values are:
{ ranch_port, 2003}.
{ ranch_backlog_nb, 100}.
{ ranch_max_connections_nb, 100}.
{ ranch_acceptors_nb, 100}.
{ ranch_idle_timeout_ms, 5000}.
{ table_name, "metrics"}.
{ bucket_name, "metric_names"}.
{ riakts_ip, "127.0.0.1"}.
{ riakts_port, 8087}.
{ riakts_write_batch_size, 1000}.
{ riakkv_ip, "127.0.0.1"}.
{ riakkv_port, 8087}.
{ riaksearch_ip, "127.0.0.1"}.
{ riaksearch_port, 8087}.
{ riaksearch_indexing_batch_size, 100}.
{ riaksearch_batch_indexing_timeout_ms, 2000}.
Then restart Riak TS.
The application uses ranch (https://github.com/ninenines/ranch) to maintain a pool of TCP acceptors that receive Graphite line protocol data. Currently only the line protocol is supported. Graphite data is stored in the Riak TS table.
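As a rough illustration of what a client sends, here is a minimal Python sketch that formats data points in the Graphite line protocol (`name value timestamp`, one per line) and writes them to the listener. The helper names `format_line` and `send_metrics` are hypothetical; the host and port default to the ranch_port value listed in the configuration.

```python
import socket
import time


def format_line(name, value, timestamp=None):
    """Format one data point as a Graphite plaintext line."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{name} {value} {ts}\n"


def send_metrics(lines, host="127.0.0.1", port=2003, timeout=5.0):
    """Open a TCP connection, send all lines in one write, then close."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
```

With a node running, a call such as `send_metrics([format_line("servers.web01.cpu.load", 0.42)])` would submit a single data point stamped with the current time.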
Metric names are indexed by Riak Search. To keep Riak Search from overloading Riak TS, indexing is done in batches. New metric names are stored in a non-indexed Riak KV bucket, whose keys are added to a CRDT set. A gen_server polls this set for new keys, consumes them, fetches the list of metric names, and stores them in a Riak KV bucket that is indexed by Riak Search. It then waits for the last metric of the batch to be indexed before moving on to the next batch.
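The batching flow described above can be sketched in Python with in-memory stand-ins. Everything here is an assumption made for illustration: a plain set replaces the CRDT set, dictionaries replace the two Riak KV buckets, the function names are hypothetical, and the batch size is assumed to count keys rather than metric names.

```python
def take_batch(pending_keys, batch_size=100):
    """Remove and return up to batch_size keys from the pending set."""
    batch = sorted(pending_keys)[:batch_size]
    pending_keys.difference_update(batch)
    return batch


def index_pending(pending_keys, staging_bucket, indexed_bucket, batch_size=100):
    """Drain the pending set batch by batch; return the number of batches."""
    batches = 0
    while pending_keys:
        for key in take_batch(pending_keys, batch_size):
            # Each staged key holds a list of new metric names (assumed layout).
            for name in staging_bucket.pop(key):
                indexed_bucket[name] = True
        # The real application would wait here until Riak Search has indexed
        # the last name of the batch; this sketch does not model that wait.
        batches += 1
    return batches


pending = {"k1", "k2", "k3"}
staging = {"k1": ["a.b"], "k2": ["a.c", "a.d"], "k3": ["x.y"]}
indexed = {}
print(index_pending(pending, staging, indexed, batch_size=2), sorted(indexed))
```

Processing one bounded batch at a time is what limits the indexing load placed on the node at any given moment.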
When new Graphite metrics arrive, the graphite_riakts application checks whether each metric name is already known or is new (and so needs to be indexed). Instead of querying the Riak KV bucket for each metric name, the application maintains a local in-memory cache, using fogfish's [caching library](git://github.com/fogfish/cache.git). When the node starts, this cache is warmed up from Riak Search. The warmup happens during Riak TS startup, so the node is usable only once it has finished. This is on purpose, to avoid having slow nodes (still warming up their cache) in a cluster.
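The known-or-new check can be pictured with the following Python sketch. It is a simplification under stated assumptions: the class `NameCache` is hypothetical, a plain set stands in for the caching library, and eviction or expiry is not modelled.

```python
class NameCache:
    """In-memory set of known metric names, warmed up at start."""

    def __init__(self, warm_names=()):
        # In the application the warmup data would come from Riak Search.
        self._known = set(warm_names)

    def filter_new(self, names):
        """Return the names not seen before, in order, and remember them."""
        new = [n for n in dict.fromkeys(names) if n not in self._known]
        self._known.update(new)
        return new


cache = NameCache(["a.b", "a.c"])
print(cache.filter_new(["a.b", "x.y", "x.y"]))
print(cache.filter_new(["x.y"]))
```

Only the names returned by `filter_new` would need to be handed to the indexing pipeline; everything else is answered from memory without a Riak KV lookup.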
Implemented at the /metrics/find endpoint, which allows querying metric names:
curl 'http://127.0.0.1:8098/metrics/find?query=some.metric.*.path.*'
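The same query can be issued from Python using only the standard library. This is a minimal sketch: the helper names `find_url` and `find_metrics` are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def find_url(query, host="127.0.0.1", port=8098):
    """Build the /metrics/find URL, leaving '*' wildcards unescaped."""
    return f"http://{host}:{port}/metrics/find?" + urlencode({"query": query}, safe="*")


def find_metrics(query, host="127.0.0.1", port=8098, timeout=5.0):
    """Fetch the raw response body for a metric name query."""
    with urlopen(find_url(query, host, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `find_metrics("some.metric.*.path.*")` sends the same request as the curl command above.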
Not implemented yet
Not implemented yet
If you use the default configuration, the Riak TS, Riak KV and Riak Search IPs are the same (127.0.0.1), which means that everything runs inside the Riak TS instance. However, Riak Search is not officially supported in Riak TS; in particular, AAE issues can occur. It is possible to have one instance for Riak TS and another for Riak KV + Riak Search, and to run them either on the same node (you need to use a different token and port) or on different nodes.
Further testing may dismiss this issue, for instance if it's proven that Riak Search works fine with standard (non Riak TS) buckets, including AAE and index cleanup.
- expose memory cache configuration
- improve memory cache warmup: load recent metrics only, check cache is full, configurability
- HTTP APIs need to be implemented
- Pickle protocol support
- implement retention configuration
- implement expiration (deletion of data in Riak TS table, cleanup of KV/Search, and memory cache)
- implement downsampling
- better documentation