Ingest real-time data from self-hosted Kafka, Confluent Cloud, Amazon MSK, or Redpanda into Propel.

Get started with Kafka

Step-by-step instructions to connect your Kafka cluster to Propel.

Architecture

The Kafka Data Pool connects to the specified Kafka topics and ingests their data into Propel in real time.

Features

Kafka Data Pools support the following features:

| Feature name | Supported | Notes |
| --- | --- | --- |
| Real-time ingestion | ✅ | See How the Kafka Data Pool works. |
| Deduplication | ✅ | See the Message deduplication section. |
| Batch Delete API | ✅ | See Batch Delete API. |
| Batch Update API | ✅ | See Batch Update API. |
| API configurable | ✅ | See the [Management API](/docs/management-api) docs. |
| Terraform configurable | ✅ | See Terraform docs. |

How does the Kafka Data Pool work?

The Kafka Data Pool connects to the specified Kafka topic and reads messages in real time, starting from the earliest available offset.

These messages are then ingested into the Data Pool. Once in the Data Pool, you can query them via SQL, the Query APIs, or transform them with Materialized Views.
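
Conceptually, the ingestion behaves like a Kafka consumer configured with auto.offset.reset set to earliest. Here is a minimal Python sketch of that consumption model, assuming a local broker and a hypothetical events topic (Propel runs its own managed consumer; this is only an illustration):

```python
from confluent_kafka import Consumer

# Hypothetical connection details -- Propel manages its own consumer;
# this sketch only illustrates the "start from the earliest offset" behavior.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # assumption: local broker
    "group.id": "propel-illustration",      # assumption: illustrative group id
    "auto.offset.reset": "earliest",        # consume from the earliest available offset
})
consumer.subscribe(["events"])              # assumption: topic name

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            raise RuntimeError(msg.error())
        # Propel records this metadata in the _topic, _partition, and _offset columns.
        print(msg.topic(), msg.partition(), msg.offset(), msg.value())
finally:
    consumer.close()
```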

Schemaless ingestion

The Kafka Data Pool stores the message body in the _propel_payload column and the Kafka and ingestion-related metadata in the columns described below.

This flexible approach allows JSON Kafka messages to be ingested without needing pre-defined schemas.

| Column | Type | Description |
| --- | --- | --- |
| _timestamp | TIMESTAMP | The timestamp of the message. |
| _topic | STRING | The Kafka topic. |
| _key | STRING | The key of the message. |
| _offset | INT64 | The offset of the message. |
| _partition | INT64 | The partition of the Kafka topic. |
| _propel_payload | JSON | The raw message payload in JSON. |
| _propel_received_at | TIMESTAMP | When the message was read by Propel. |
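
To make the mapping concrete, here is a hypothetical JSON order event and the rough shape of the row it would produce, shown as a Python dict with made-up values:

```python
import json

# Hypothetical message: an order event produced to the "events" topic.
message_value = {"order_id": "o-123", "amount": 42.5}

# Illustrative shape of the resulting Data Pool row (all values are made up):
row = {
    "_timestamp": "2024-01-15T10:30:00Z",            # message timestamp
    "_topic": "events",                              # source topic
    "_key": "o-123",                                 # message key
    "_offset": 1042,                                 # offset within the partition
    "_partition": 0,                                 # partition number
    "_propel_payload": json.dumps(message_value),    # raw JSON payload
    "_propel_received_at": "2024-01-15T10:30:01Z",   # when Propel read the message
}
```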

Message deduplication

The Kafka Data Pool automatically deduplicates messages. Duplicates can occur when a producer sends the same message twice, or when a message is read twice due to intermittent connectivity between Propel and the Kafka stream. A message's uniqueness is determined by the combination of _topic, _partition, and _offset.
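
Propel handles this automatically, but the uniqueness key is easy to illustrate: deduplicating on the (_topic, _partition, _offset) tuple means the same message delivered twice is ingested only once. A minimal Python sketch of that idea:

```python
# Illustration of the deduplication key: a message is unique
# per (_topic, _partition, _offset). Propel does this for you.
seen: set[tuple[str, int, int]] = set()

def is_duplicate(topic: str, partition: int, offset: int) -> bool:
    """Return True if this (topic, partition, offset) was already ingested."""
    key = (topic, partition, offset)
    if key in seen:
        return True
    seen.add(key)
    return False

# The same message delivered twice is ingested only once.
assert is_duplicate("events", 0, 1042) is False
assert is_duplicate("events", 0, 1042) is True
```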

Supported formats

The Kafka Data Pool supports ingesting JSON messages, which are stored in the _propel_payload column.

If you need AVRO support, please contact us.
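
A minimal producer sketch, assuming a local broker and a hypothetical events topic, showing how a message body is serialized as JSON so it can be ingested into _propel_payload:

```python
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # assumption: local broker

event = {"order_id": "o-123", "amount": 42.5}  # hypothetical event

# Serialize the message body as JSON -- the format the Kafka Data Pool ingests.
producer.produce("events", key="o-123", value=json.dumps(event))
producer.flush()
```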

Transforming data

Once your data is in a Kafka Data Pool, you can use Materialized Views to: