Skip to content

MYLogical

Wiki Maintenance Bot edited this page Oct 25, 2024 · 23 revisions

MySQL/MariaDB Replication

Another possibility is to connect to a MySQL/MariaDB database instance to consume a transaction-based replication feed using global transaction identifiers (GTIDs). This is primarily intended for migration use-cases, in which it is desirable to have a minimum- or zero-downtime migration from MySQL to CockroachDB. For an overview of MySQL replication with GTIDs, refer to MySQL Replication with Global Transaction Identifiers. For an overview of MariaDB replication refer to MariaDB Replication Overview.

Usage:
  replicator mylogical [flags]

Flags:
      --applyTimeout duration             the maximum amount of time to wait for an update to be applied (default 30s)
      --defaultGTIDSet string             default GTIDSet. Used if no state is persisted
      --dlqTableName ident                the name of a table in the target schema for storing dead-letter entries (default replicator_dlq)
      --fetchMetadata                     fetch column metadata explicitly, for older version of MySQL that don't support binlog_row_metadata
      --flushPeriod duration              flush queued mutations after this duration (default 1s)
      --flushSize int                     ideal batch size to determine when to flush mutations (default 1000)
      --metricsAddr string                a host:port on which to serve metrics and diagnostics
      --parallelism int                   the number of concurrent database transactions to use (default 16)
      --quiescentPeriod duration          how often to retry deferred mutations (default 10s)
      --replicationProcessID uint32       the replication process id to report to the source database (default 10)
      --retireOffset duration             delay removal of applied mutations (default 24h0m0s)
      --scanSize int                      the number of rows to retrieve from staging (default 10000)
      --schemaRefresh duration            controls how often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled. (default 1m0s)
      --sourceConn string                 the source database's connection string
      --stageMarkAppliedLimit int         limit the number of mutations to be marked applied in a single statement (default 100000)
      --stageSanityCheckPeriod duration   how often to validate staging table apply order (-1 to disable) (default 10m0s)
      --stageSanityCheckWindow duration   how far back to look when validating staging table apply order (default 1h0m0s)
      --stageUnappliedPeriod duration     how often to report the number of unapplied mutations in staging tables (-1 to disable) (default 1m0s)
      --stagingConn string                the staging database's connection string
      --stagingCreateSchema               automatically create the staging schema if it does not exist
      --stagingIdleTime duration          maximum lifetime of an idle connection (default 1m0s)
      --stagingJitterTime duration        the time over which to jitter database pool disconnections (default 15s)
      --stagingMaxLifetime duration       the maximum lifetime of a database connection (default 5m0s)
      --stagingMaxPoolSize int            the maximum number of staging database connections (default 128)
      --stagingSchema atom                a SQL database schema to store metadata in (default _replicator.public)
      --targetConn string                 the target database's connection string
      --targetIdleTime duration           maximum lifetime of an idle connection (default 1m0s)
      --targetJitterTime duration         the time over which to jitter database pool disconnections (default 15s)
      --targetMaxLifetime duration        the maximum lifetime of a database connection (default 5m0s)
      --targetMaxPoolSize int             the maximum number of target database connections (default 128)
      --targetSchema atom                 the SQL database schema in the target cluster to update
      --targetStatementCacheSize int      the maximum number of prepared statements to retain (default 128)
      --taskGracePeriod duration          how long to allow for task cleanup when recovering from errors (default 1m0s)
      --timestampLimit int                the maximum number of source timestamps to coalesce into a target transaction (default 1000)
      --userscript string                 the path to a configuration script, see userscript subcommand

Global Flags:
      --gracePeriod duration    allow background processes to exit (default 30s)
      --logDestination string   write logs to a file, instead of stdout
      --logFormat string        choose log output format [ fluent, text ] (default "text")
  -v, --verbose count           increase logging verbosity to debug; repeat for trace

The theory of operation is similar to the standard use case, the only difference is that Replicator connects to the source database to receive a replication feed, rather than act as the target for a webhook.

Staging Database Setup

Create the staging database _replicator in the staging cluster. The staging cluster is typically the same as the target cluster. It is recommended that you reduce the default GC time to five minutes, since these tables have a high volume of deletes.

ALTER DATABASE _replicator CONFIGURE ZONE USING gc.ttlseconds=300

MySQL/MariaDB Replication Setup

  • The MySQL 5.6 server should have the following settings:
      --gtid-mode=on
      --enforce-gtid-consistency=on
      --server-id=[a unique id]
      --log-bin=mysql-binlog
      --binlog-format=row
      --binlog-row-image=full
      --log-slave-updates=ON
  • The MySQL 5.7 server should have the following settings:
      --gtid-mode=on
      --enforce-gtid-consistency=on
      --binlog-row-image=full
      --server-id=[a unique id]
      --log-bin=log-bin
  • The MySQL 8 server should have the following settings:
      --gtid-mode=on
      --enforce-gtid-consistency=on
      --binlog-row-metadata=full
  • If server is MariaDB, it should have the following settings:
      --log-bin
      --server_id=[a unique id]
      --log-basename=master1
      --binlog-format=row
      --binlog-row-metadata=full
  • Verify the master status, on the MySQL/MariaDB server
    show master status;
  • Perform a backup of the database. Note: Starting with MariaDB 10.0.13, mysqldump automatically includes the GTID position as a comment in the backup file if either the --master-data or --dump-slave option is used.
   mysqldump -p db_name > backup-file.sql
  • Note the GTID state at the beginning of the backup, as reported in the backup file. For instance:
-- MySQL:
--
-- GTID state at the beginning of the backup
--

SET @@GLOBAL.GTID_PURGED=/*!80000 '+'*/ '6fa7e6ef-c49a-11ec-950a-0242ac120002:1-8';
-- MariaDB:
--
-- GTID to start replication from
--

SET GLOBAL gtid_slave_pos='0-1-1';
  • Import the database into Cockroach DB, following the instructions at Migrate from MySQL.
  • Run replicator mylogical with at least the --sourceConn, --targetConn , --defaultGTIDSet and --targetSchema. Set --defaultGTIDSet to the GTID state shown above.
Clone this wiki locally