CDC replication only for Bulk Syncs
CDC replication from MySQL is only available for Bulk Syncs.
When bulk-syncing data from MySQL into your data warehouse, we recommend (but do not require) that Polytomic use CDC (change data capture) replication. With CDC, Polytomic does not need to run full table scans to calculate changes since the last sync. Instead, it captures changes in real time without scanning your tables.
To enable CDC replication, configure your MySQL database as follows:
- The Polytomic MySQL user needs to be configured with replication privileges. This can be done with the following query (replace <username> with your MySQL Polytomic user):
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO <username>@'%';
For example, if your Polytomic MySQL user is polytomic, then your query would be:
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO polytomic@'%';
- Set the following on your database:
binlog_format: ROW
binlog_row_image: full
binlog_row_metadata: FULL
slave_parallel_type: LOGICAL_CLOCK
The exact way to set these will depend on your MySQL hosting platform. If you're hosting MySQL yourself, you'll need to edit your my.cnf file, whereas if you're on AWS RDS, you'll have to edit your parameter group as shown in this screenshot:
- Set a binary log (binlog) retention period of at least 1 day. We recommend 7 days.
- Be sure to check the Use replication for bulk syncs box in your Polytomic MySQL connection configuration:
- Click Save.
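Before running the first bulk sync, it can help to confirm that the database picked up the settings above. The queries below are a minimal sketch rather than an official Polytomic check. They assume the Polytomic user is named polytomic (as in the earlier example) and that your server exposes the variable names listed above. The log_bin check and the retention variable names are additions that do not appear in the steps above.

```sql
-- Assumption: the Polytomic user is 'polytomic'@'%', as in the example above.
-- The output should include SELECT, REPLICATION CLIENT and REPLICATION SLAVE ON *.*
SHOW GRANTS FOR 'polytomic'@'%';

-- Each value should match the settings listed above.
-- log_bin is not in that list; it is included only as an extra check
-- that binary logging is switched on at all.
SHOW VARIABLES WHERE Variable_name IN (
  'log_bin',
  'binlog_format',
  'binlog_row_image',
  'binlog_row_metadata',
  'slave_parallel_type'
);

-- Binary log retention. Which variable applies depends on your MySQL version:
-- binlog_expire_logs_seconds (MySQL 8.0) or expire_logs_days (older versions).
-- 1 day = 86400 seconds, 7 days = 604800 seconds.
SHOW VARIABLES WHERE Variable_name IN (
  'binlog_expire_logs_seconds',
  'expire_logs_days'
);
```

If a variable is missing from the output, your MySQL version may not support it or may use a different name, so check the documentation for your version. On AWS RDS, binlog retention is managed through the RDS stored procedures (for example, CALL mysql.rds_show_configuration;) rather than through these variables.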