Support providing configuration from YAML files #123

Merged
merged 9 commits into from
Jul 2, 2024
20 changes: 17 additions & 3 deletions README.md
Expand Up @@ -16,9 +16,10 @@ An overview of the proxy architecture and logical flow can be viewed [here](http

## Quick Start

In order to run the proxy, you'll need to set some environment variables to configure it properly.
In order to run the proxy, you'll need to set some environment variables or pass a reference to a YAML configuration file.
Below you'll find a list with the most important variables along with their default values.
The required ones are marked with a comment.
The required ones are marked with a comment. Variable names in the YAML configuration file do not have the `ZDM_` prefix and
are lower-cased.

```shell
ZDM_ORIGIN_CONTACT_POINTS=10.0.0.1 #required
Expand All @@ -36,7 +37,7 @@ ZDM_READ_MODE=PRIMARY_ONLY
ZDM_LOG_LEVEL=INFO
```

The environment variables must be set and exported for the proxy to work.
The environment variables (or YAML configuration file) must be set for the proxy to work.
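
The naming rule described above (drop the `ZDM_` prefix, then lower-case the name) can be sketched in a few lines of Go. This is only an illustration: `yamlKey` is a hypothetical helper introduced here, not a function from the proxy code base.

```go
package main

import (
	"fmt"
	"strings"
)

// yamlKey converts an environment variable name such as "ZDM_LOG_LEVEL"
// into the corresponding YAML key ("log_level").
// Hypothetical helper for illustration only; not part of the proxy.
func yamlKey(envVar string) string {
	return strings.ToLower(strings.TrimPrefix(envVar, "ZDM_"))
}

func main() {
	names := []string{"ZDM_ORIGIN_CONTACT_POINTS", "ZDM_READ_MODE", "ZDM_LOG_LEVEL"}
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, yamlKey(name))
	}
}
```

For example, `ZDM_ORIGIN_CONTACT_POINTS` maps to `origin_contact_points`, which matches the key used in the YAML quick-start file.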

In order to get started quickly, in your local environment, grab a copy of the binary distribution in the
[Releases](https://github.com/datastax/zdm-proxy/releases) page. For the recommended installation in a production
Expand All @@ -55,6 +56,19 @@ export ZDM_TARGET_PASSWORD=cassandra \
./zdm-proxy-v2.0.0 # run the ZDM proxy executable
```

If you prefer to use a YAML configuration file, an equivalent setup would look like:

```shell
$ cat zdm-config.yml
origin_contact_points: 10.0.0.1
target_contact_points: 10.0.0.2
origin_username: cassandra
origin_password: cassandra
target_username: cassandra
target_password: cassandra
$ ./zdm-proxy-v2.0.0 --config=./zdm-config.yml # run the ZDM proxy executable
```

At this point, you should be able to connect some client such as [CQLSH](https://downloads.datastax.com/#cqlsh) to the proxy
and write data to it and the proxy will take care of forwarding the requests to both clusters concurrently.

Expand Down
2 changes: 1 addition & 1 deletion RELEASE_PROCESS.md
Expand Up @@ -8,7 +8,7 @@ All published container images can be found at [https://hub.docker.com/r/datasta

Before triggering the build and publish process for an official/stable release, three files need to be updated, the `RELEASE_NOTES`, `CHANGELOG` and `main.go`.

Please update the ZDM version displayed during component startup in `main.go`:
Please update the ZDM version displayed during component startup in `launch.go`:
```go
const ZdmVersionString = "2.0.0"
```
Expand Down
182 changes: 182 additions & 0 deletions docs/assets/zdm-config-reference.yml
@@ -0,0 +1,182 @@
# This variable determines which cluster is currently considered the primary cluster.
# At the start of the migration, the primary cluster is Origin, as it contains all the data.
# In Phase 4 of the migration, once all the existing data has been transferred and any validation/reconciliation
# step has been successfully executed, you can switch the primary cluster to be Target.
# Valid values: ORIGIN, TARGET.
primary_cluster: ORIGIN

# This variable determines how reads are handled by the ZDM Proxy. Valid values:
# PRIMARY_ONLY - reads are only sent synchronously to the primary cluster. This is the default behavior.
# DUAL_ASYNC_ON_SECONDARY - reads are sent synchronously to the primary cluster and also asynchronously
# to the secondary cluster. See Phase 3: Enable asynchronous dual reads.
read_mode: PRIMARY_ONLY

# Whether the ZDM Proxy should replace standard CQL function calls in write
# requests with a value computed at proxy level. Currently, only the replacement
# of now() is supported. Disabled by default. Enabling this will have a noticeable performance impact.
# replace_cql_functions: false

# Timeout (in ms) when performing the initialization (handshake) of a proxy-to-secondary cluster
# connection that will be used solely for asynchronous dual reads. If this timeout occurs, the asynchronous
# reads will not be sent. This has no impact on the handling of synchronous requests: the ZDM Proxy will
# continue to handle all synchronous reads and writes normally.
# async_handshake_timeout_ms: 4000

# Specifies logging level.
# log_level: INFO

# List of peer ZDM proxy instances. This configuration parameter should be *identical*
# (elements from the list placed in the same order) across all ZDM proxies.
# proxy_topology_addresses: 127.0.1.1, 127.0.1.2, 127.0.1.3

# Index of local ZDM proxy instance within "proxy_topology_addresses" list.
# Given "proxy_topology_addresses: 127.0.1.1, 127.0.1.2, 127.0.1.3", value of
# "proxy_topology_index" should equal "0" in the configuration file present on server
# 127.0.1.1, "1" on 127.0.1.2 and "2" on 127.0.1.3.
# proxy_topology_index: 0

# Number of tokens each proxy instance owns. The default value of 8 should work for
# the majority of use cases. To learn more about this concept, look into "virtual nodes" in Apache Cassandra.
# proxy_topology_num_tokens: 8
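
The relationship between "proxy_topology_addresses" and "proxy_topology_index" described above can be sketched in Go as follows. The helpers `splitAddresses` and `topologyIndex` are hypothetical names used only for this illustration; the proxy's actual parsing code may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAddresses parses a comma-separated address list, trimming whitespace
// and skipping empty entries. Hypothetical helper for illustration only.
func splitAddresses(list string) []string {
	var out []string
	for _, part := range strings.Split(list, ",") {
		if addr := strings.TrimSpace(part); addr != "" {
			out = append(out, addr)
		}
	}
	return out
}

// topologyIndex returns the zero-based position of local within the list,
// or -1 if the address is not present.
func topologyIndex(list, local string) int {
	for i, addr := range splitAddresses(list) {
		if addr == local {
			return i
		}
	}
	return -1
}

func main() {
	const addresses = "127.0.1.1, 127.0.1.2, 127.0.1.3"
	for _, host := range splitAddresses(addresses) {
		fmt.Printf("%s -> proxy_topology_index: %d\n", host, topologyIndex(addresses, host))
	}
}
```

Because the index is positional, the list has to be identical, in the same order, on every proxy instance for the indices to be consistent.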

# Comma separated list of origin cluster contact points.
# When this configuration is present, "origin_secure_connect_bundle_path"
# should be left blank.
origin_contact_points: 127.0.0.1

# Port used when connecting to nodes from origin cluster.
origin_port: 9042

# If origin cluster is DataStax Astra, path to secure connection bundle.
# Users do not need to list contact points ("origin_contact_points") when
# they leverage connection bundle mechanism.
# origin_secure_connect_bundle_path:

# Local data center for origin cluster.
# origin_local_datacenter:

# Origin cluster username.
origin_username: user1

# Origin cluster password.
origin_password: pass1

# Timeout (in ms) when attempting to establish a connection from the proxy to origin cluster.
# origin_connection_timeout_ms: 30000

# CA certificate used when verifying identity of origin nodes.
# origin_tls_server_ca_path:

# Public key used when establishing connectivity with origin cluster.
# origin_tls_client_cert_path:

# Private key used to secure communication with origin cluster.
# origin_tls_client_key_path:

# Comma separated list of target cluster contact points.
# When this configuration is present, "target_secure_connect_bundle_path"
# should be left blank.
target_contact_points: 127.0.0.2

# If target cluster is DataStax Astra, path to secure connection bundle.
# Users do not need to list contact points ("target_contact_points") when
# they leverage connection bundle mechanism.
# target_secure_connect_bundle_path:

# Local data center for target cluster.
# target_local_datacenter: DC1

# Port used when connecting to nodes from target cluster.
target_port: 9042

# Target cluster username.
target_username: user2

# Target cluster password.
target_password: pass2

# Timeout (in ms) when attempting to establish a connection from the proxy to target cluster.
# target_connection_timeout_ms: 30000

# CA certificate used when verifying identity of target nodes.
# target_tls_server_ca_path:

# Public key used when establishing connectivity with target cluster.
# target_tls_client_cert_path:

# Private key used to secure communication with target cluster.
# target_tls_client_key_path:

# Listen address of ZDM proxy.
proxy_listen_address: localhost

# Port number on which ZDM proxy is listening.
proxy_listen_port: 14002

# Global timeout (in ms) of a request at proxy level. This variable determines how long the
# ZDM Proxy will wait for one cluster (in case of reads) or both clusters (in case of writes)
# to reply to a request. If this timeout is reached, the ZDM Proxy will abandon that request
# and no longer consider it as pending, thus freeing up the corresponding internal resources.
# Note that, in this case, the ZDM Proxy will not return any result or error: when the client
# application’s own timeout is reached, the driver will time out the request on its side.
# proxy_request_timeout_ms: 10000

# Defines how many clients may connect to a single ZDM proxy instance. ZDM proxy closes
# connection if threshold is reached.
# proxy_max_client_connections: 1000

# In the CQL protocol every request has a unique id, named stream id. This variable allows
# you to tune the maximum pool size of the available stream ids managed by the ZDM Proxy
# per client connection. In the application client, the stream ids are managed internally
# by the driver, and in most drivers the max number is 2048 (the same default value used
# in the proxy). If you have a custom driver configuration with a higher value, you should
# change this property accordingly.
# proxy_max_stream_ids: 2048

# CA certificate used when verifying identity of connecting client applications.
# proxy_tls_ca_path:

# Public key used when establishing connectivity with client applications.
# proxy_tls_cert_path:

# Private key used by ZDM proxy to encrypt connection between itself and client applications
# proxy_tls_key_path:

# If true enforces mutual TLS between proxy and client applications
# proxy_tls_require_client_auth: false

# If true ZDM proxy exposes performance metrics in Prometheus format.
# metrics_enabled: true

# Network interface used to expose Prometheus metrics.
# metrics_address: localhost

# Port used to expose Prometheus metrics.
# metrics_port: 14001

# Prefix prepended to each metric name.
# metrics_prefix: zdm

# List of histogram buckets for measuring latency of origin cluster
# metrics_origin_latency_buckets_ms: 1, 4, 7, 10, 25, 40, 60, 80, 100, 150, 250, 500, 1000, 2500, 5000, 10000, 15000

# List of histogram buckets for measuring latency of target cluster
# metrics_target_latency_buckets_ms: 1, 4, 7, 10, 25, 40, 60, 80, 100, 150, 250, 500, 1000, 2500, 5000, 10000, 15000

# List of histogram buckets for measuring latency of asynchronous
# read requests routed to target cluster. See parameter "read_mode".
# metrics_async_read_latency_buckets_ms: 1, 4, 7, 10, 25, 40, 60, 80, 100, 150, 250, 500, 1000, 2500, 5000, 10000, 15000

# Frequency (in ms) with which heartbeats will be sent on cluster connections
# (i.e. all control and request connections to Origin and Target). Heartbeats
# keep idle connections alive.
# heartbeat_interval_ms: 30000

# Below properties define reconnection strategy for establishing control connection.
# heartbeat_retry_interval_min_ms: 250
# heartbeat_retry_interval_max_ms: 30000
# heartbeat_retry_backoff_factor: 2

# Control connection failure threshold. If threshold is exceeded,
# readiness probe of ZDM will report failure and pod will be recreated.
# heartbeat_failure_threshold: 1
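
To give a feel for the "heartbeat_retry_*" defaults listed above, here is a minimal Go sketch that prints the resulting retry delays. It assumes a plain exponential backoff without jitter; the proxy's real reconnection logic may behave differently, and `retryIntervals` is a hypothetical helper.

```go
package main

import "fmt"

// retryIntervals returns the first n delays (in ms) of an exponential backoff
// that starts at minMs, multiplies by factor after each attempt and is capped
// at maxMs. Assumption: no jitter is applied.
func retryIntervals(minMs, maxMs, factor float64, n int) []int {
	out := make([]int, 0, n)
	delay := minMs
	for i := 0; i < n; i++ {
		if delay > maxMs {
			delay = maxMs
		}
		out = append(out, int(delay))
		delay *= factor
	}
	return out
}

func main() {
	// Defaults from the reference file: min 250 ms, max 30000 ms, factor 2.
	fmt.Println(retryIntervals(250, 30000, 2, 9))
}
```

Under these assumptions the delay doubles from 250 ms until it reaches the 30000 ms cap on the eighth attempt and stays there.
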
3 changes: 2 additions & 1 deletion go.mod
Expand Up @@ -9,11 +9,13 @@ require (
github.com/google/uuid v1.1.1
github.com/jpillora/backoff v1.0.0
github.com/kelseyhightower/envconfig v1.4.0
github.com/mcuadros/go-defaults v1.2.0
github.com/prometheus/client_golang v1.3.0
github.com/prometheus/client_model v0.1.0
github.com/rs/zerolog v1.20.0
github.com/sirupsen/logrus v1.6.0
github.com/stretchr/testify v1.8.0
gopkg.in/yaml.v3 v3.0.1
)

require (
Expand All @@ -33,5 +35,4 @@ require (
github.com/prometheus/procfs v0.0.8 // indirect
golang.org/x/sys v0.3.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
7 changes: 6 additions & 1 deletion go.sum
Expand Up @@ -62,11 +62,15 @@ github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/matttproud/golang_protobuf_extensions v1.0.1 h1:4hp9jkHxhMHkqkrB3Ix0jegS5sx/RkqARlsWZ6pIwiU=
github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0=
github.com/mcuadros/go-defaults v1.2.0 h1:FODb8WSf0uGaY8elWJAkoLL0Ri6AlZ1bFlenk56oZtc=
github.com/mcuadros/go-defaults v1.2.0/go.mod h1:WEZtHEVIGYVDqkKSWBdWKUVdRyKlMfulPaGDWIVeCWY=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
github.com/modern-go/reflect2 v1.0.1/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e h1:fD57ERR4JtEqsWbfPhv4DMiApHyliiK5xCTNVSPiaAs=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
github.com/pierrec/lz4/v4 v4.0.3 h1:vNQKSVZNYUEAvRY9FaUXAF1XPbSOHJtDTiP41kzDz2E=
github.com/pierrec/lz4/v4 v4.0.3/go.mod h1:gZWDp/Ze/IJXGXf23ltt2EXimqmTUXEy0GFuRQyBid4=
github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
Expand Down Expand Up @@ -127,8 +131,9 @@ golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8T
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f h1:BLraFXnmrev5lT+xlilqcH8XK9/i0At2xKjWk4p6zsU=
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/inf.v0 v0.9.1 h1:73M5CoZyi3ZLMOyDlQh031Cx6N9NDJ2Vvfl76EDAgDc=
gopkg.in/inf.v0 v0.9.1/go.mod h1:cWUDdTG/fYaXco+Dcufb5Vnc6Gp2YChqWtbxRZE0mXw=
gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
Expand Down
19 changes: 18 additions & 1 deletion proxy/launch.go
Expand Up @@ -2,6 +2,8 @@ package main

import (
"context"
"flag"
"fmt"
"github.com/datastax/zdm-proxy/proxy/pkg/config"
"github.com/datastax/zdm-proxy/proxy/pkg/runner"
log "github.com/sirupsen/logrus"
Expand All @@ -10,6 +12,12 @@ import (
"syscall"
)

// TODO: to be managed externally
const ZdmVersionString = "2.2.0"

var displayVersion = flag.Bool("version", false, "display the ZDM proxy version and exit")
var configFile = flag.String("config", "", "specify path to ZDM configuration file")

func runSignalListener(cancelFunc context.CancelFunc) {
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
Expand All @@ -24,7 +32,16 @@ func runSignalListener(cancelFunc context.CancelFunc) {
}

func launchProxy(profilingSupported bool) {
conf, err := config.New().ParseEnvVars()
if *displayVersion {
fmt.Printf("ZDM proxy version %v\n", ZdmVersionString)
return
}

// Always record version information (very) early in the log
log.Infof("Starting ZDM proxy version %v", ZdmVersionString)

conf, err := config.New().LoadConfig(*configFile)

if err != nil {
log.Errorf("Error loading configuration: %v. Aborting startup.", err)
os.Exit(-1)
Expand Down
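
The flag handling in this hunk can be exercised in isolation with a small sketch. It uses only the standard `flag` package and mirrors the two flags declared above. How `LoadConfig` treats an empty path is not shown in this diff, so the fallback to environment variables in `configSource` is an assumption, and `options`, `parseArgs` and `configSource` are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// options mirrors the two command-line flags declared in launch.go.
type options struct {
	displayVersion bool
	configFile     string
}

// parseArgs parses args with a private FlagSet so it does not touch the
// process-wide flag.CommandLine. Hypothetical helper for illustration.
func parseArgs(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("zdm-proxy", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet in this sketch
	fs.BoolVar(&opts.displayVersion, "version", false, "display the ZDM proxy version and exit")
	fs.StringVar(&opts.configFile, "config", "", "specify path to ZDM configuration file")
	err := fs.Parse(args)
	return opts, err
}

// configSource describes where configuration would be read from.
// ASSUMPTION: an empty path means "use environment variables".
func configSource(opts options) string {
	if opts.configFile == "" {
		return "environment variables"
	}
	return "YAML file " + opts.configFile
}

func main() {
	opts, err := parseArgs([]string{"--config=./zdm-config.yml"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("configuration source:", configSource(opts))
}
```

Go's `flag` package accepts both `-config` and `--config`, so the `--config=./zdm-config.yml` form shown in the README works with the declaration above.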
17 changes: 0 additions & 17 deletions proxy/main.go
Expand Up @@ -7,27 +7,10 @@ package main

import (
"flag"
"fmt"
"os"

log "github.com/sirupsen/logrus"
)

// TODO: to be managed externally
const ZdmVersionString = "2.2.0"

var displayVersion = flag.Bool("version", false, "Display the ZDM proxy version and exit")

func main() {

flag.Parse()
if *displayVersion {
fmt.Printf("ZDM proxy version %v\n", ZdmVersionString)
os.Exit(0)
}

// Always record version information (very) early in the log
log.Infof("Starting ZDM proxy version %v", ZdmVersionString)

launchProxy(false)
}
1 change: 0 additions & 1 deletion proxy/main_profiling.go
Expand Up @@ -17,7 +17,6 @@ var cpuProfile = flag.String("cpu_profile", "", "write cpu profile to the specif
var memProfile = flag.String("mem_profile", "", "write memory profile to the specified file")

func main() {

flag.Parse()

// the cpu profiling is enabled at startup and is periodically collected while the proxy is running
Expand Down