Google BigQuery
Source and destination
-
Go to https://console.cloud.google.com/apis/credentials/serviceaccountkey to create a new service account.
-
Select these BigQuery roles:
- BigQuery Job User
- BigQuery Data Viewer
- BigQuery Metadata Viewer
- BigQuery Data Editor (only if you plan on also syncing data to BigQuery).
- Select JSON for Key type.
-
Click Create and save the JSON key file.
-
In Polytomic, go to Connections → Add Connection → Google BigQuery.
-
Upload the JSON service account key from step 3 to Polytomic's Service account credentials field.
- Optional: you can fill in the Location field if you want to restrict Polytomic to creating data in a particular region (for example:
EU
). - Click Save.
Note: if you are trying to write "Record" field types in BigQuery, Polytomic may modify field names in order to comply with BigQuery field naming rules. The following rules apply:
- Only the first 300 characters are used.
- Field names consist of only upper/lowercase letters, numbers, and underscores.
-
,.
,_
.- All other characters are dropped.
Syncing to BigQuery
Bulk syncs
Setting partition keys
When using our Bulk Syncs feature to write data to BigQuery, you have the option of setting a partition key on each table. While optional and usually unneeded, doing so is recommended for large data sets (e.g. in-app event streams). Setting a partition key will increases your query speed on the resulting data.
You can set partition keys on each table by clicking on the 'more' button on each table in your bulk sync config as shown in this screenshot:
Propagating deletes to BigQuery
Tracking fields and deletes
When syncing from databases and data warehouses, deletes will not propagate for tables with tracking fields set, no matter what the setting below is set to.
As with all data warehouses, Polytomic offers two options for propagating deletes to BigQuery:
- Soft deletes
- Hard deletes
By default, Polytomic will propagate deletes from your source as soft-deletes, by marking the __polytomic_deleted_at
column for each table with a datetime reflecting deletion time.
You can choose to replace this behaviour by propagating deletes as hard deletes instead. To do so, simply check the Propagate hard deletes instead of soft deletes checkbox in the Advanced settings option at the bottom of your sync configuration:
Expiring tables after inactivity
Polytomic can automatically expire all BigQuery tables used in a bulk sync after a certain period of inactivity. To enforce this, turn on the Expire tables after inactivity advanced setting:
You will then be prompted to enter a number of days after which expiration will take place:
Note that every time the sync runs Polytomic will reset its internal day counter to zero and restart the expiration window countdown.
Updated 4 months ago