Google BigQuery

Source and destination

  1. Go to https://console.cloud.google.com/apis/credentials/serviceaccountkey to create a new service account.

  2. Select these BigQuery roles:

  • BigQuery Job User
  • BigQuery Data Viewer
  • BigQuery Metadata Viewer

If you plan on using BigQuery as a destination, you'll also need:

  • BigQuery Data Editor
  1. Select JSON for Key type.
  1. Click Create and save the JSON key file.

  2. In Polytomic, go to Connections → Add Connection → Google BigQuery.

  3. Paste the contents of the BigQuery JSON key file from step 3 in Polytomic's Service account credentials text box.

  1. Click Save.

Enabling Write Support

Polytomic can also sync your data into predefined tables in BigQuery. Follow these steps to enable this:

  1. Open the BigQuery Console and create a new dataset called polytomic (it must be named exactly this). For most people, the rest of the options can be left as default.
  1. Open the new dataset.
  1. Click the Create table button in the new dataset:
  1. Give the table a name that describes the data the table will hold, and then click the Add field button. Label the column and fill out the data type. Repeat this for as many fields as you'd like to sync from Polytomic.

Note: If you want to use Append mode, your schema must include a column called polytomic_job_timestamp with type Timestamp.

Note: If you are trying to write "Record" field types in BigQuery, Polytomic may modify field names in order to comply with BigQuery field naming rules. The following rules apply:

  • Only the first 300 characters are used;
  • field names consist of only upper/lowercase letters, numbers, and underscores;
  • -, ., are converted to _; and
  • all other characters are dropped.
  1. Your table should now be visible inside Polytomic.

Creating new tables

Polytomic can also create new tables in BigQuery. The user must have Data Editor (or better) permissions in order to create the required dataset and tables.

You should be able to create a new table using the Create new table.. target for BigQuery. A new table will be created after you save the sync for the first time.

Once the table is created, the table will never be dropped or altered by Polytomic. If you want to change the schema, either modify the schema by hand and reflect the updates in Polytomic, or get Polytomic to create another new table.

Before saving the sync, you'll be asked to input a table name, which will be used when creating the new table. The new columns will be named and typed based on the source column names.

For each selected field, Polytomic will create a table with the following column types:

Polytomic Type

BigQuery Type

Array

Bytes

Boolean

Boolean

Datetime

Timestamp

Number

Numeric

Object

Bytes

String

String

Unknown

Bytes

Example

Suppose we want to create a new table that contains user IDs, their organization ID, the user's name, and the user's last login time.

You can click "Preview" to see what the columns will be named during table creation. The model field names may be transformed slightly.

Note the following:

  • Polytomic will not create nested fields (type Record) or array fields (mode Repeated) at this time. While we support syncing to these fields, if you want to use them in your schema, you can create the table without these fields, and then augment the table after from the console.
  • Table name and column names may be augmented to fit naming conventions.
  • Duplicate field names will be made unambiguous by prepending the model name (e.g. the two id fields turn into 'user_id' and 'company_id' respectively).
  • A polytomic_job_timestamp field will always be included in the schema, regardless of sync mode.