Using Workload Identity Federation

Workload Identity Federation (WIF) lets you connect Polytomic to BigQuery without managing service account keys. Instead, your Google Cloud project trusts Polytomic's AWS identity directly, eliminating the need to create, download, rotate, and secure long-lived JSON key files.

Prerequisites

  • A Google Cloud project with the BigQuery API and IAM Service Account Credentials API enabled
  • Permission to create Workload Identity Pools in your GCP project (requires roles/iam.workloadIdentityPoolAdmin)
  • A GCP service account with the BigQuery roles your syncs require
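
If either API is not yet enabled, it can be turned on from the command line. The helper below is a minimal sketch: the function name enable_wif_apis is made up for illustration, and it assumes your gcloud account is allowed to enable services in the project you pass in.

```shell
# Enable the two APIs named in the prerequisites for one project.
# Usage: enable_wif_apis PROJECT_ID
# (enable_wif_apis is an illustrative name, not a gcloud or Polytomic command.)
enable_wif_apis() {
  local project_id="$1"
  gcloud services enable \
    bigquery.googleapis.com \
    iamcredentials.googleapis.com \
    --project="$project_id"
}
```

Call it as enable_wif_apis PROJECT_ID, replacing PROJECT_ID with your own project ID.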

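Overview

Setup takes six steps: create a Workload Identity Pool, add Polytomic's AWS account to it as a provider, allow the pool to impersonate a service account, confirm that service account's BigQuery roles, download the credential configuration file, and configure the connection in Polytomic.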

Step 1: Create a Workload Identity Pool

In the Google Cloud Console:

  1. Go to IAM & Admin > Workload Identity Federation.
  2. Click Create Pool.
  3. Name the pool (e.g., polytomic-pool) and optionally add a description.
  4. Click Continue.

Or via gcloud:

gcloud iam workload-identity-pools create polytomic-pool \
  --location="global" \
  --display-name="Polytomic"

Step 2: Add an AWS Provider to the Pool

  1. In the pool you just created, click Add Provider.
  2. Select AWS as the provider type.
  3. Enter Polytomic's AWS account ID: 568237466542

    Contact Polytomic support for an execution role ARN if you would like to restrict access further.

  4. Click Continue.
  5. Click Save.

Or via gcloud:

gcloud iam workload-identity-pools providers create-aws polytomic-aws \
  --location="global" \
  --workload-identity-pool="polytomic-pool" \
  --account-id="568237466542"
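
If Polytomic support gives you an execution role ARN, one way to restrict access is an attribute condition on the provider. The sketch below is an assumption rather than Polytomic's documented procedure: the function name and ROLE_NAME are hypothetical, and the condition relies on Google's default AWS attribute mapping, which exposes the caller's role as attribute.aws_role in assumed-role form.

```shell
# Restrict the polytomic-aws provider to a single AWS role (optional).
# Usage: restrict_provider_to_role ROLE_NAME
# ROLE_NAME is hypothetical; take it from the ARN that Polytomic support provides.
restrict_provider_to_role() {
  local role_name="$1"
  # Assumed format of attribute.aws_role under the default AWS mapping.
  local role_arn="arn:aws:sts::568237466542:assumed-role/${role_name}"
  gcloud iam workload-identity-pools providers update-aws polytomic-aws \
    --location="global" \
    --workload-identity-pool="polytomic-pool" \
    --attribute-condition="attribute.aws_role=='${role_arn}'"
}
```

Confirm the exact role ARN with Polytomic support before applying a condition, since a mismatch blocks every token exchange.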

Step 3: Grant Service Account Impersonation

The Workload Identity Pool needs permission to act as a GCP service account. This is the service account whose BigQuery permissions Polytomic will use.

  1. Go to IAM & Admin > Service Accounts.

  2. Select (or create) the service account you want Polytomic to use for BigQuery access.

  3. Click the Permissions tab, then Grant Access.

  4. In the New principals field, enter:

    principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/polytomic-pool/*
    

    Replace PROJECT_NUMBER with your GCP project number (found on the project dashboard).

  5. Assign the role Workload Identity User (roles/iam.workloadIdentityUser).

  6. Click Save.

Or via gcloud:

gcloud iam service-accounts add-iam-policy-binding \
  SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/polytomic-pool/*"
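
The principalSet string is easy to mistype, so it can help to build it from the project number. A minimal sketch, assuming the pool name polytomic-pool from Step 1; the function name principal_set is made up for illustration.

```shell
# Build the principalSet member string for a Workload Identity Pool.
# Usage: principal_set PROJECT_NUMBER [POOL_ID]
# (principal_set is an illustrative helper, not part of gcloud.)
principal_set() {
  local project_number="$1"
  local pool_id="${2:-polytomic-pool}"
  printf 'principalSet://iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s/*\n' \
    "$project_number" "$pool_id"
}

# Example (PROJECT_ID is a placeholder for your project ID):
#   PROJECT_NUMBER="$(gcloud projects describe PROJECT_ID --format='value(projectNumber)')"
#   principal_set "$PROJECT_NUMBER"
```

The printed value can be pasted into the New principals field or passed to --member.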

Step 4: Ensure the Service Account Has BigQuery Permissions

The service account needs the following BigQuery roles:

  • BigQuery Data Viewer (roles/bigquery.dataViewer): to read data
  • BigQuery Job User (roles/bigquery.jobUser): to run queries
  • BigQuery Data Editor (roles/bigquery.dataEditor): only if Polytomic writes to BigQuery

If you use the Extract option for bulk reads, also grant:

  • Storage Object Admin (roles/storage.objectAdmin) on the GCS bucket used for extraction
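
These roles can also be granted with gcloud. The helpers below are a sketch under a few assumptions: the function names are made up for illustration, the BigQuery roles are granted at the project level, and the bucket name you pass is the one you configure for extraction.

```shell
# Grant one or more project-level roles to a service account.
# Usage: grant_project_roles PROJECT_ID SA_EMAIL ROLE [ROLE...]
grant_project_roles() {
  local project_id="$1"
  local sa_email="$2"
  local role
  shift 2
  for role in "$@"; do
    gcloud projects add-iam-policy-binding "$project_id" \
      --member="serviceAccount:${sa_email}" \
      --role="$role"
  done
}

# Grant Storage Object Admin on a single bucket (only needed for Extract).
# Usage: grant_bucket_object_admin BUCKET_NAME SA_EMAIL
grant_bucket_object_admin() {
  gcloud storage buckets add-iam-policy-binding "gs://$1" \
    --member="serviceAccount:$2" \
    --role="roles/storage.objectAdmin"
}
```

For a read-only setup, grant_project_roles PROJECT_ID SA_EMAIL roles/bigquery.dataViewer roles/bigquery.jobUser covers the first two roles; add roles/bigquery.dataEditor only if Polytomic writes to BigQuery.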

Step 5: Download the Credential Configuration File

  1. In the Google Cloud Console, go to IAM & Admin > Workload Identity Federation.

  2. Select your pool, then select the AWS provider.

  3. Click Connected Service Accounts, then select your service account.

  4. Click Download Config and choose the format Credential Configuration File.

  5. Save the downloaded JSON file. It will look similar to:

    {
      "type": "external_account",
      "audience": "//iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/polytomic-pool/providers/polytomic-aws",
      "subject_token_type": "urn:ietf:params:aws:token-type:aws4_request",
      "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/SA_EMAIL:generateAccessToken",
      "token_url": "https://sts.googleapis.com/v1/token",
      "credential_source": {
        "environment_id": "aws1",
        "region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
        "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials",
        "regional_cred_verification_url": "https://sts.{region}.amazonaws.com?Action=GetCallerIdentity&Version=2011-06-15"
      }
    }
    

Important: This file does not contain secrets. It only describes how to perform the token exchange. The actual authentication happens at runtime using Polytomic's AWS IAM identity.
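
The same file can be generated with gcloud instead of the Console. A minimal sketch, assuming the pool and provider names from Steps 1 and 2; the function name and the default output file name polytomic-wif.json are made up for illustration.

```shell
# Generate the credential configuration file for the AWS provider.
# Usage: create_wif_config PROJECT_NUMBER SA_EMAIL [OUTPUT_FILE]
# (create_wif_config and polytomic-wif.json are illustrative names.)
create_wif_config() {
  local project_number="$1"
  local sa_email="$2"
  local output_file="${3:-polytomic-wif.json}"
  gcloud iam workload-identity-pools create-cred-config \
    "projects/${project_number}/locations/global/workloadIdentityPools/polytomic-pool/providers/polytomic-aws" \
    --service-account="$sa_email" \
    --aws \
    --output-file="$output_file"
}
```

SA_EMAIL here is the full service account address, as in the service_account_impersonation_url field of the JSON.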

Step 6: Configure the Connection in Polytomic

  1. In Polytomic, go to Connections and click Add Connection.
  2. Select Google BigQuery.
  3. Set Authentication method to Workload Identity Federation.
  4. Upload the credential configuration JSON file from Step 5.
  5. Enter your Google Cloud project ID (the project containing your BigQuery datasets).
  6. Optionally set a Location if your datasets are in a specific region.
  7. Click Save and verify the connection test passes.

Troubleshooting

"Permission denied" when testing the connection

  • Verify the service account has roles/bigquery.jobUser and roles/bigquery.dataViewer on the project.
  • Check that the Workload Identity Pool principal has roles/iam.workloadIdentityUser on the service account.
  • Confirm the attribute condition (if set) matches Polytomic's actual IAM role ARN.
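
The first two checks can be run from the command line. A sketch, assuming the BigQuery roles were granted at the project level; both function names are made up for illustration.

```shell
# List the project-level roles bound to a service account.
# Usage: list_sa_project_roles PROJECT_ID SA_EMAIL
list_sa_project_roles() {
  local project_id="$1"
  local sa_email="$2"
  gcloud projects get-iam-policy "$project_id" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:${sa_email}" \
    --format="value(bindings.role)"
}

# Show the IAM policy on the service account itself, which is where the
# roles/iam.workloadIdentityUser binding for the pool should appear.
# Usage: show_sa_policy SA_EMAIL
show_sa_policy() {
  gcloud iam service-accounts get-iam-policy "$1"
}
```

The first helper should list roles/bigquery.jobUser and roles/bigquery.dataViewer; the second should show the pool's principalSet under roles/iam.workloadIdentityUser.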

"Invalid grant" or token exchange errors

  • Ensure the AWS account ID in the provider matches Polytomic's account ID exactly.
  • Verify the credential configuration file was downloaded for the correct provider and service account pair.
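
A quick local check can confirm that a downloaded file targets the expected provider and service account before you upload it. This is a rough sketch that uses grep rather than a JSON parser; the function name check_wif_config is made up for illustration, and the defaults assume the pool and provider names used in this guide.

```shell
# Rough sanity check of a credential configuration file.
# Usage: check_wif_config FILE SA_EMAIL [POOL_ID] [PROVIDER_ID]
# Prints "ok" and returns 0 only if all three checks pass.
check_wif_config() {
  local file="$1"
  local sa_email="$2"
  local pool_id="${3:-polytomic-pool}"
  local provider_id="${4:-polytomic-aws}"
  # The trailing quote anchors the match at the end of the audience value.
  local audience="workloadIdentityPools/${pool_id}/providers/${provider_id}\""

  if ! grep -Eq '"type"[[:space:]]*:[[:space:]]*"external_account"' "$file"; then
    echo "type is not external_account" >&2
    return 1
  fi
  if ! grep -Fq "$audience" "$file"; then
    echo "audience does not reference ${pool_id}/${provider_id}" >&2
    return 1
  fi
  if ! grep -Fq "serviceAccounts/${sa_email}:generateAccessToken" "$file"; then
    echo "impersonation URL does not reference ${sa_email}" >&2
    return 1
  fi
  echo "ok"
}
```

A failure message on stderr names the field that does not match, which usually means the file was downloaded for a different provider or service account.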

Bulk sync extraction fails

  • Ensure the service account has roles/storage.objectAdmin on the configured GCS bucket.
  • The GCS bucket must be in the same project as the service account, or the service account must be granted access to the bucket in the other project.