
Changefeed Performance

Bob Vawter edited this page Mar 25, 2024 · 22 revisions

The overall throughput and latency of cdc-sink are driven by the performance of the source changefeed and the target's ability to accept writes. Several configuration knobs can be tuned; the ones below apply to the source cluster and the changefeed.

Notes for high-throughput clusters:

SET CLUSTER SETTING changefeed.mux_rangefeed.enabled = true; -- Fewer goroutines per range; might be default in 24.1?
SET CLUSTER SETTING changefeed.new_webhook_sink_enabled = true;
SET CLUSTER SETTING kv.rangefeed.closed_timestamp_refresh_interval = '3s';
SET CLUSTER SETTING kv.rangefeed.catchup_scan_concurrency = 64; -- default is 8
SET CLUSTER SETTING kv.rangefeed.concurrent_catchup_iterators = 64; -- default is 16

-- Minimum required WITH options for webhook delivery.
CREATE CHANGEFEED FOR TABLE YCSB.USERTABLE
  INTO 'webhook-https://127.0.0.1:30004/ycsb/public?insecure_tls_skip_verify=true'
  WITH updated, resolved='1s', min_checkpoint_frequency='1s',
       webhook_sink_config='{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';

In high-latency environments, consider the effects of the bandwidth-delay product and tune for larger transfer windows.
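
As a rough illustration of the bandwidth-delay product, the worked figures below assume a 1 Gbit/s path and an 80 ms round-trip time. Both numbers are hypothetical and were not measured on any cdc-sink deployment.

```latex
\[
\mathrm{BDP} = \text{bandwidth} \times \text{RTT}
  = \frac{10^{9}\ \text{bit/s}}{8\ \text{bit/byte}} \times 0.080\ \text{s}
  = 10^{7}\ \text{bytes} \approx 9.5\ \text{MiB}
\]
```

Under these assumptions, roughly 10 MB must be in flight on a connection to keep the path full, so a transfer window much smaller than that would likely cap throughput well below the link rate.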

Pre-splitting tables into a larger number of ranges also creates more opportunities for concurrent data transfer between CockroachDB and cdc-sink.
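
A minimal sketch of pre-splitting, reusing the YCSB.USERTABLE table from the changefeed example above. The split points are hypothetical and assume string primary keys that begin with 'user'; real split points should be chosen from the table's actual key distribution.

```sql
-- Hypothetical split points; choose values that match the table's real
-- primary-key distribution.
ALTER TABLE YCSB.USERTABLE SPLIT AT VALUES ('user2'), ('user4'), ('user6'), ('user8');

-- Ask the cluster to redistribute the resulting ranges across nodes.
ALTER TABLE YCSB.USERTABLE SCATTER;

-- Inspect the ranges that now back the table.
SHOW RANGES FROM TABLE YCSB.USERTABLE;
```

How many ranges are worthwhile depends on the cluster size and workload; the statements above only show the mechanism and do not recommend a number.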

Additional guidance is available at https://www.cockroachlabs.com/docs/stable/advanced-changefeed-configuration
