Ingest data from Kafka topics.
Ingest real-time data from self-hosted Kafka, Confluent Cloud, AWS MSK, or Redpanda to Propel.
Step-by-step instructions to connect your Kafka cluster to Propel.
Kafka Data Pools connect to specified Kafka topics to ingest data into Propel in real time.
Kafka Data Pools support the following features:
Feature name | Supported | Notes |
---|---|---|
Real-time ingestion | ✅ | See How the Kafka Data Pool works. |
Deduplication | ✅ | See the deduplication section. |
Batch Delete API | ✅ | See Batch Delete API. |
Batch Update API | ✅ | See Batch Update API. |
API configurable | ✅ | See the [API](/docs/management-api) docs. |
Terraform configurable | ✅ | See Terraform docs. |
The Kafka Data Pool connects to the specified Kafka topic and reads messages in real time, starting from the earliest available offset.
These messages are then ingested into the Data Pool. Once in the Data Pool, you can query them via SQL, the Query APIs, or transform them with Materialized Views.
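For context, here is a minimal sketch of the producing side, using the confluent-kafka Python client. The broker address and the `orders` topic are hypothetical; any JSON messages published this way would be consumed by the Data Pool from the earliest available offset:

```python
import json
from confluent_kafka import Producer  # pip install confluent-kafka

# Hypothetical broker address; replace with your cluster's bootstrap servers.
producer = Producer({"bootstrap.servers": "localhost:9092"})

# Propel ingests the JSON body as-is, so no schema registration is needed.
event = {"order_id": "ord_123", "amount": 42.5, "status": "shipped"}
producer.produce("orders", key="ord_123", value=json.dumps(event))

# Block until the broker acknowledges the message.
producer.flush()
```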
The Kafka Data Pool stores the message body in the `_propel_payload` column and the Kafka- and ingestion-related metadata in other columns. This flexible approach allows JSON Kafka messages to be ingested without pre-defined schemas.
Column | Type | Description |
---|---|---|
`_timestamp` | TIMESTAMP | The timestamp of the message. |
`_topic` | STRING | The Kafka topic. |
`_key` | STRING | The key of the message. |
`_offset` | INT64 | The offset of the message. |
`_partition` | INT64 | The partition of the Kafka topic. |
`_propel_payload` | JSON | The raw message payload in JSON. |
`_propel_received_at` | TIMESTAMP | When the message was read by Propel. |
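To make the mapping concrete, here is an illustrative sketch (not Propel's actual implementation) of how each column corresponds to an attribute of a consumed Kafka message, using the confluent-kafka `Message` accessors:

```python
import json
from datetime import datetime, timezone

def to_data_pool_row(msg):
    """Illustrative only: map a confluent_kafka.Message to the Data Pool columns."""
    ts_type, ts_ms = msg.timestamp()  # broker- or producer-assigned timestamp, in ms
    return {
        "_timestamp": datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc),
        "_topic": msg.topic(),
        "_key": msg.key().decode() if msg.key() else None,
        "_offset": msg.offset(),
        "_partition": msg.partition(),
        "_propel_payload": json.loads(msg.value()),
        "_propel_received_at": datetime.now(tz=timezone.utc),
    }
```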
The Kafka Data Pool automatically manages the deduplication of messages. This happens when messages are either sent twice by the producer or read twice due to intermittent connectivity between Propel and the Kafka stream. The uniqueness of a message is determined by the combination of `_topic`, `_partition`, and `_offset`.
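As a sketch of the idea, deduplication can be expressed as keeping a set of `(topic, partition, offset)` tuples and dropping any message whose tuple has been seen before. Propel's own mechanism is internal, so this is purely illustrative:

```python
seen = set()

def is_duplicate(topic: str, partition: int, offset: int) -> bool:
    """Return True if this (topic, partition, offset) combination was already ingested."""
    key = (topic, partition, offset)
    if key in seen:
        return True
    seen.add(key)
    return False

# A redelivered message carries the same coordinates, so it is dropped:
assert not is_duplicate("orders", 0, 42)  # first delivery: ingested
assert is_duplicate("orders", 0, 42)      # redelivery: deduplicated
```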
The Kafka Data Pool supports the ingestion of JSON messages, which are stored in the `_propel_payload` column.
Once your data is in a Kafka Data Pool, you can use Materialized Views to transform it, for example, by flattening the JSON in `_propel_payload` into typed columns.
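As an illustration of the kind of transformation a Materialized View performs, the sketch below flattens a nested JSON payload into flat, dot-separated column names. The payload shape is hypothetical:

```python
def flatten(payload: dict, prefix: str = "") -> dict:
    """Flatten nested JSON into dot-separated column names, e.g. customer.id."""
    row = {}
    for key, value in payload.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}."))
        else:
            row[name] = value
    return row

# Hypothetical payload from _propel_payload:
payload = {"order_id": "ord_123", "customer": {"id": "cus_9", "country": "US"}}
print(flatten(payload))
# {'order_id': 'ord_123', 'customer.id': 'cus_9', 'customer.country': 'US'}
```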