
Data Streaming

How to source data from AWS DynamoDB to Confluent using the DynamoDB CDC Source Connector

This one-minute video walks through an animated architecture diagram of the integration between Amazon DynamoDB and Confluent Cloud using the all-new, fully managed DynamoDB CDC Source connector. The resulting real-time data pipeline doesn’t require you to write or maintain code.

What is Watermark Alignment? | Apache Flink in Action

Watermark alignment is a relatively new feature in Apache Flink. It helps when you need to temporally join streams with mismatched event frequencies, e.g., when one stream’s updates are much more frequent than those of the stream(s) it is joined with. In this video we’ll break the feature down and explain how it can help you better manage your Apache Flink integration.
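
As a rough illustration of the idea, the sketch below models watermark alignment in plain Python. It is a conceptual model only, not Flink's API: the function name `sources_to_pause`, the source names, and the `max_drift` parameter are all hypothetical. It assumes that a source should stop reading once its watermark runs more than an allowed drift ahead of the slowest source in its group.

```python
from typing import Dict, Set


def sources_to_pause(watermarks: Dict[str, int], max_drift: int) -> Set[str]:
    """Return the sources whose watermark is too far ahead of the slowest one.

    watermarks: current event-time watermark (ms) per source (hypothetical names).
    max_drift:  how far ahead (ms) a source may run before it should pause.
    """
    slowest = min(watermarks.values())
    # A source is paused when it is more than max_drift ahead of the slowest source.
    return {name for name, wm in watermarks.items() if wm > slowest + max_drift}


# One fast stream and one slow stream (values are made up for illustration).
current = {"clicks": 60_000, "orders": 20_000}
paused_tight = sources_to_pause(current, max_drift=30_000)  # "clicks" is 40 s ahead
paused_loose = sources_to_pause(current, max_drift=50_000)  # within the allowed drift
```

With a 30-second drift the fast source is held back until the slow one catches up; with a 50-second drift both keep reading.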

Build Scalable AI-Enabled Applications with Confluent and AWS

In this video, Confluent and AWS discuss the challenges enterprises face in deploying generative AI and show how Confluent Cloud and Amazon Bedrock empower organizations to build scalable, AI-enabled applications. We'll explore how Confluent's comprehensive data streaming platform enables you to stream, connect, and govern data at scale, creating real-time, contextualized, and trustworthy applications that differentiate generative AI.

How to Set Idle Timeouts | Apache Flink in Action

This video covers setting an idle timeout on a watermark generator when joining data in Apache Flink. This is useful when you have two streams, one with frequent updates and one with infrequent updates, and you need to join them without waiting for a fresh watermark from the infrequent one.
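
To make the effect of an idle timeout concrete, here is a minimal Python sketch. It is a conceptual model, not Flink code: `InputState`, `combined_watermark`, and the timing values are hypothetical. It assumes that a two-input operator advances to the minimum watermark of its inputs, and that an input with no events for at least the idle timeout is ignored when taking that minimum.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InputState:
    watermark: int      # latest event-time watermark seen on this input (ms)
    last_event_at: int  # processing-time clock reading of the latest event (ms)


def combined_watermark(
    inputs: List[InputState], now: int, idle_timeout: Optional[int] = None
) -> Optional[int]:
    """Minimum watermark over non-idle inputs; None if every input is idle.

    With idle_timeout=None no input is ever treated as idle.
    """
    active = [
        s.watermark
        for s in inputs
        if idle_timeout is None or now - s.last_event_at < idle_timeout
    ]
    return min(active) if active else None


# A busy stream and a quiet stream (values are made up for illustration).
frequent = InputState(watermark=10_000, last_event_at=9_900)
infrequent = InputState(watermark=2_000, last_event_at=1_000)

held_back = combined_watermark([frequent, infrequent], now=10_000)
advanced = combined_watermark([frequent, infrequent], now=10_000, idle_timeout=5_000)
```

Without a timeout the join is held at the quiet stream's watermark; with a 5-second timeout the quiet stream is marked idle and the watermark follows the busy stream. In Flink's DataStream API the corresponding setting is `WatermarkStrategy.withIdleness(Duration)`.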

Confluent Cloud for Apache Flink | Interactive Tables for Flink SQL Workspaces

When developing or debugging a stream processing pipeline with Flink SQL, it’s common to inspect each processing step's output to ensure data is being transformed properly. However, comprehending the resulting data stream's structure, distribution, and characteristics entails executing multiple ad-hoc SQL queries, which can be time-consuming and tedious. Additionally, isolating specific subsets of the stream for analysis or debugging often involves even more queries, adding to the complexity and time required.