Parquet

Source and destination

Polytomic supports syncing both to and from Parquet files on the major cloud storage providers.

Create a connection to your Parquet files

Before syncing to or from your Parquet files, ensure that you've created a Polytomic connection to one of the above cloud storage systems.

Syncing to Parquet

You can sync to Parquet files from any of the integrations supported by Polytomic, whether from data warehouses, databases, SaaS applications like Salesforce and NetSuite, spreadsheets, or arbitrary APIs.

In general, you should use Bulk Syncs to sync to Parquet files. If you are syncing from a custom SQL query, then you should use Model Syncs.

Syncing from Parquet

Polytomic can sync Parquet files to any system: not just data warehouses, databases, and other storage systems, but also SaaS applications like Salesforce and arbitrary APIs.

You can create a Polytomic data model on Parquet files sitting in cloud storage. This data model can be treated like any other data model: you can enrich it and sync it to any of your systems.

Syncing fields from a single file

To sync fields from a single Parquet file, create a data model on your cloud storage system. Then, in the Build model using section, select Single file. You will be prompted for the filename. Once you enter it, you'll see the familiar Polytomic model field list to choose from.
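Before entering a filename, it can help to confirm that the file is actually Parquet. The sketch below is not part of Polytomic. It is a local check that relies only on the Parquet format's convention that a standard file begins and ends with the 4-byte marker `PAR1`. The function name `looks_like_parquet` is hypothetical, and the check assumes you have a local copy of the file.

```python
PARQUET_MAGIC = b"PAR1"  # marker at the start and end of a standard Parquet file


def looks_like_parquet(path):
    """Return True if the file at `path` has Parquet magic bytes at both ends.

    This is a cheap sanity check only: it does not validate the footer
    metadata or the column data. Files written with an encrypted footer
    use a different marker and will be reported as False.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)  # jump to the end to learn the file size
        size = f.tell()
        # Smallest possible layout: magic (4) + footer length (4) + magic (4).
        if size < 12:
            return False
        f.seek(0)
        head = f.read(4)
        f.seek(-4, 2)  # last four bytes of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A file that fails this check (for example, a CSV saved with a `.parquet` extension) is unlikely to produce a usable field list.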

Syncing fields from a multi-file archive

If your source data is spread across a multi-file Parquet archive, select Multiple files in the Files selector. Polytomic will then list model fields to choose from, collected from all the files in your Parquet archive.
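To illustrate what "collected from all the files" can mean, here is a minimal sketch that merges per-file field lists into one list. This page does not say how Polytomic combines fields, so the union-by-name behavior is an assumption, and `collect_fields`, the file names, and the schemas are all hypothetical.

```python
def collect_fields(file_schemas):
    """Union field names across files, keeping first-seen order.

    `file_schemas` maps a file name to a list of (field_name, type_name)
    pairs. Returns a dict mapping each field name to the sorted list of
    type names observed for it across all files.
    """
    seen = {}
    for schema in file_schemas.values():
        for name, type_name in schema:
            # Record every type seen for a field so conflicts are visible.
            seen.setdefault(name, set()).add(type_name)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical two-file archive where the second file has an extra column.
archive = {
    "part-0000.parquet": [("id", "int64"), ("email", "string")],
    "part-0001.parquet": [("id", "int64"), ("email", "string"), ("plan", "string")],
}
merged = collect_fields(archive)
```

Under this assumption, a field that appears in only some files (here, `plan`) still shows up in the merged list. A field with more than one recorded type would indicate a schema mismatch worth resolving before syncing.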