CDC replication
Stream changes from Postgres to other systems
CDC replication only for bulk syncs
CDC replication from PostgreSQL is only available for Bulk Syncs.
When bulk-syncing data from PostgreSQL into your data warehouse, it's preferred (though not required) for Polytomic to utilize CDC (change data capture) replication. This will avoid Polytomic running full table scans to figure out changes since the last sync. Rather, Polytomic will be able to capture changes in real-time without scanning your tables.
Requirements
To enable this, Polytomic requires the following of your PostgreSQL instance:
-
PostgreSQL 10 or later (Polytomic uses the
pgoutput
plugin) -
wal_level = logical
The PostgreSQLwal_level
setting must be set tological
. You can determine the current setting by running this query:my_database=> show wal_level; wal_level --- logical (1 row)
If you're on AWS Aurora, you need to set
rds.logical_replication
to1
in your database cluster parameter group then reboot your database (see AWS's documentation here). -
Set
wal_sender_timeout
to five minutes -
Available replication slot
Polytomic requires a replication slot for each bulk sync (you will be setting up one sync per database, unless you want different sync schedules for the same database). The same slot will be used for all tables synced as part of that bulk sync configuration. The total number of replication slots is set using themax_replication_slots
PostgreSQL configuration parameter.You can inspect current replication slots using the
pg_replication_slots
system table:SELECT * FROM pg_replication_slots;
-
A Publication for the tables you wish to replicate
PostgreSQL uses a Publication to track changes to one or more tables. Only a user with the owner role for a table may add it to the publication. However, after the publication is created, any user with theREPLICATION
property may use it.You can create a publication for specific tables, or for all tables (including those created in the future). Note that creating a publication for all tables may result in increased storage consumption.
Tables without primary keys
PostgreSQL requires a replica identity in order to replicate
DELETE
andUPDATE
operations for a table. By default this is the primary key for non-system tables.In order to add a table without a primary key to the publication, it must have a replica identity set.
Failure to set the replica identity will cause errors when attempting to update or delete rows, so if your table does not have a primary key or replica identity, don't add it to the Publication.
Your Publication name can be anything; you’ll enter it in Polytomic's PostgreSQL connection configuration.
-- create a publication for specific tables CREATE PUBLICATION polytomic FOR TABLE <table>, <table>, ...; -- create a publication for all current and future tables CREATE PUBLICATION polytomic FOR ALL TABLES;
-
User with the
REPLICATION
property set
TheREPLICATION
property signals that a PostgreSQL user (role) is able to manage replication slots. Polytomic will use this property to create a replication slot for each bulk sync configured.If you're on AWS RDS, you can set this with the following query:
GRANT rds_replication TO polytomic_user;
Otherwise, use this query instead:
ALTER ROLE polytomic_user WITH REPLICATION;
See Advanced Settings below for information on manually managing the replication slot.
Set Polytomic to read your logical replication log
Once the previous steps are done, be sure to turn on the Use logical replication for bulk syncs setting and set the publication name in your PostgreSQL connection configuration:

Advanced settings
Replication slots
Polytomic's default behaviour is to manage the replication slot for you: it will create a slot when you enable a bulk sync from your PostgreSQL database and remove it when the sync is disabled.
You can manage the slot yourself if you prefer. If you wish to create the slot yourself, you may do so outside of Polytomic then specify the slot name under Advanced Settings in your sync configuration.
The replication slot must be created with the pgoutput
plugin:
select pg_create_logical_replication_slot('polytomic', 'pgoutput');
Propagating deletes
Polytomic offers two options for propagating deletes during CDC replication:
- Soft deletes
- Hard deletes
By default, Polytomic will propagate deletes from your source as soft-deletes, by marking the __polytomic_deleted_at
column for each destination table with a datetime reflecting deletion time.
You can choose to replace this behaviour by propagating deletes as hard deletes instead. To do so, simply check the Propagate hard deletes instead of soft deletes checkbox in the Advanced settings option at the bottom of your sync configuration:

Tables without primary keys
Incremental CDC updates is not supported by default for tables without primary keys. When syncing from such tables, Polytomic offers a Mirror table without a unique identifier boolean option to decide behaviour:

When turned on, replicating the source table involves querying and copying it in one transaction. This may be infeasible if the table is large.
When turned off, replicating the source table results in the destination table being append-only, where each row update in the source table results in a new row in the destination table. This is recommended if the source table is large (common with event tables).
Polytomic ignores this setting when replicating tables with primary keys.
Updated 22 days ago