Snowflake
The Snowflake Data Pool lets you ingest data from a Snowflake table or view into Propel's Serverless ClickHouse.
Consider using Propel on top of Snowflake when:
- You require sub-second query performance for dashboards or reports.
- You need to support high-concurrency and high-availability data workloads, such as customer-facing or mission-critical applications.
- You require fast data access through an API for web and mobile apps.
- You are building B2B SaaS or consumer applications that require multi-tenant access controls.
Get startedā
Follow our step-by-step Snowflake setup guide to connect your Snowflake data warehouse to Propel.
Architecture overviewā
Snowflake Data Pools connect to the specified Snowflake table, view, or Dynamic Table and automatically synchronize the data from your Snowflake into your Data Pool in Propel.
Featuresā
Snowflake Data Pools support the following features:
Feature name | Supported | Notes |
---|---|---|
Syncs inserts, updates, and deletes. | ā | See How Propel syncs inserts, updates, and deletes. |
Re-sync | ā | See Re-sync section. |
Configurable sync interval | ā | See How Propel syncs section. It can be configured to occur at intervals ranging from every minute to every 24 hours. |
Sync Pausing / Resuming | ā | |
Real-time updates | ā | See the Real-time updates section. |
Real-time deletes | ā | See the Real-time deletes section. |
Batch Delete API | ā | See Batch Delete API. |
Batch Update API | ā | See Batch Update API. |
API configurable | ā | See Management API docs. |
Terraform configurable | ā | See Propel Terraform docs. |
How Propel syncs inserts, updates, and deletesā
Propel uses a Snowflake stream, a form of change data capture, to synchronize records from Snowflake to your Data Pool. How records are synced depends on the table engine you select when you create your Data Pool.
- MergeTree Data Pools (append-only data): Syncs inserts and ignores deletes and updates.
- ReplacingMergeTree Data Pools (mutable records): Syncs inserts, updates records with the same sorting key, and deletes records with the same sorting key.
Read our guide on Selecting table engine and sorting key for information.
Data syncing can be configured to occur at intervals ranging from every minute to every 24 hours. The frequency at which you sync data into Propel should depend on your data freshness requirements and your upstream pipeline. For instance, if you promised your customers a data freshness of 3 hours and you have a dbt job that runs every hour, it would not make sense to run syncs every minute since there will be no new data to sync. Running syncs every hour would be sufficient.
Propel wakes up the Snowflake warehouse every time a sync occurs. Please keep in mind the associated costs when selecting the sync interval. We recommend setting up the minimum auto-suspend period, which is 60 seconds.
Supported data structuresā
Below are the Snowflake data structures that Propel supports.
Name | Description |
---|---|
Table | A Snowflake table. Must have change tracking enabled. |
View | A Snowflake view. The underlying tables must have change tracking enabled. |
Dynamic Tables | A Snowflake dynamic table. |
Data requirementsā
To use any of the supported data structures with Propel, they must meet the following requirements:
- The table or view will need change tracking enabled so Propel can detect the inserts, updates, and deletes. Dynamic tables have change tracking enabled by default. Check our How to enable change tracking guide for step-by-step instructions.
Use the TIMESTAMP_LTZ
type for date time columns as it specifies the timezone.
Re-syncing dataā
Snowflake Data Pools that use the ReplacingMergeTree table engine support the re-syncing of all your data. This is a safe operation, meaning that Propel will continue to serve data to your application during a re-sync, will not create duplicates, and will not persist any deleted data in Snowflake.
You should consider re-syncing your Data Pool when the Snowflake stream is deleted, failed, or stale, causing the Data Pool sync to fail. This happens when the table is re-created, or the stream is manually deleted. It also happens when you run a dbt full-refresh
operation that re-creates the table.
During a re-sync, Propel will recreate the Snowflake stream on the table or view.
Resyncing is also available via the API with the reSyncSnowflakeDataPool
mutation.
Data Typesā
The table below describes default data type mappings from Snowflake to Propel types. When creating a Snowflake Data Pool, you can modify these default mappings. For instance, if you know that a column originally typed as a NUMBER contains a UNIX timestamp, you can convert it to a TIMESTAMP by changing the default mapping.
Snowflake Type | Propel Type | Notes |
---|---|---|
DATETIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ, TIMESTAMP(p) | TIMESTAMP | Timestamps without a timezone will be synced as UTC. |
DATE | DATE | |
BOOLEAN | BOOLEAN | |
NUMBER(pā¤9, s=0) | INT32 | |
NUMBER(pā¤18, s=0) | INT64 | |
NUMBER(pā¤9, s>0) | FLOAT | |
NUMBER(pā¤18, s>0), NUMBER(p>18, s) | DOUBLE | |
NUMBER(p, s), DECIMAL(p, s), NUMERIC(p, s) | Depends on precision and scale | |
NUMBER(38, s>0), INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT | DOUBLE | Snowflake maps INT types to NUMBER(38, 0), which Propel represents as a DOUBLE as it can have numbers larger than INT64. See Snowflake numeric types. |
FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL | DOUBLE | |
VARCHAR, CHAR, CHARACTER, STRING, TEXT, BINARY, VARBINARY | STRING | |
TIME | STRING | |
VARIANT, OBJECT, ARRAY | JSON | |
GEOGRAPHY, GEOMETRY | Not supported |
Key Guidesā
- How to set up a Snowflake Data Pool
- How to enable change tracking on a Snowflake table
- Selecting the right timestamp column for your Data Pool
Frequently Asked Questionsā
How long does the data take to sync from Snowflake to Propel?
Data is synchronized based on your configured sync interval. You can sync as often as every minute or up to 24 hours. Once a sync is initiated, the data will be available in Propel and served via the API in 1-3 minutes.
Does Propel read through to Snowflake on every query?
No. Propel only queries Snowflake to sync newly inserted, updated, or deleted records.
Management APIā
Below is the relevant API documentation for the Snowflake Data Pool.
Queriesā
Mutationsā
- Create Data Pool
- Modify Data Pool
- Delete Data Pool by ID
- Delete Data Pool by unique name
- Create Snowflake Data Source
- Modify Snowflake Data Source
- Delete Data Source by ID
- Delete Data Source by unique name
Limitsā
No limits at this point.
Related blog postsā
- How to build a Snowflake API
- How to Deduplicate Events in Snowflake with dbt
- How to set up development and production environments in Snowflake
- How to get your data from an AWS RDS database into Snowflake
- How Does Snowflake Storage Work? (Databases & Schemas)
- What Is the Difference Between a Database and a Warehouse in Snowflake?
- How to build Snowflake data apps with GraphQL
- What Is a Multi-Cluster Virtual Warehouse in Snowflake Data Platform?
- How to build in-product analytics with Snowflake and GraphQL
- What Are Warehouses in Snowflake Data Analytics Platform?