CDC replication only for bulk syncs
CDC replication from PostgreSQL is only available for Bulk Syncs.
When bulk-syncing data from PostgreSQL into your data warehouse, it's preferred (though not required) for Polytomic to utilize CDC (change data capture) replication. This will avoid Polytomic running full table scans to figure out changes since the last sync. Rather, Polytomic will be able to capture changes in real-time without scanning your tables.
To enable this, Polytomic requires the following of your PostgreSQL instance:
- PostgreSQL 10 or later (Polytomic uses the
wal_level = logical.
wal_levelsetting must be set to
logical. You can determine the current setting by running this query:
my_database=> show wal_level; wal_level ----------- logical (1 row)
- Available replication slot.
Polytomic requires a replication slot for each bulk sync; the same slot will be used for all tables synced as part of that bulk sync configuration. The total number of replication slots is set using the
max_replication_slotsPostgreSQL configuration parameter.
You can inspect current replication slots using the
pg_replication_slots system table:
select * from pg_replication_slots;
- User with
REPLICATIONproperty signals that a PostgreSQL user (role) is able to manage replication slots. Polytomic will use this property to create a replication slot for each bulk sync configured. You can set it with the following query:
ALTER ROLE polytomic_user WITH REPLICATION;
A Publication for the tables you wish to replicate.
PostgreSQL uses a Publication to track changes to one or more tables. Only a user with the owner role for a table may add it to the publication. After the publication is created, however, any user with the
REPLICATIONproperty may use it.
You can create a publication for specific tables, or for all tables (including those created in the future). Note that creating a publication for all tables may result in increased storage consumption.
Your Publication name can be anything; you’ll enter it in Polytomic's PostgreSQL connection configuration.
-- create a publication for specific tables CREATE PUBLICATION polytomic FOR TABLE <table>, <table>, ...; -- create a publication for all current and future tables CREATE PUBLICATION polytomic FOR ALL TABLES;
Once the previous steps are done, be sure to turn on the Use logical replication for bulk syncs setting and set the publication name in your PostgreSQL connection configuration:
By default Polytomic will create a replication slot when you enable a bulk sync from your PostgreSQL database and remove it when the sync is disabled.
If you wish to create the slot yourself you may do so and specify it under Advanced Settings.
The replication slot must be created with the
select pg_create_logical_replication_slot('polytomic', 'pgoutput');
Updated 8 months ago