Lenses

New Apache Kafka to AWS S3 Connector

Many in the community have been asking us to build a new Kafka to S3 connector for some time, so we're pleased to announce it's now available. It's been designed to deliver a number of benefits over existing S3 connectors. Like our other Stream Reactors, the connector extends the standard Kafka Connect configuration with a parameter that takes a SQL-like command (Lenses Kafka Connect Query Language, or "KCQL"). This command defines how data is mapped from the source (in this case Kafka) to the target (S3).
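
A minimal sink configuration might look something like the sketch below. The connector class, property names and topic are assumptions for illustration, so check the Stream Reactor documentation for your version; the point is that a single KCQL statement carries the whole Kafka-to-S3 mapping:

    name=kafka-to-s3
    # Connector class and property names are illustrative; verify against the Stream Reactor docs
    connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
    tasks.max=1
    topics=payments
    # One KCQL statement maps the Kafka topic to an S3 bucket/prefix and an output format
    connect.s3.kcql=INSERT INTO my-bucket:payments SELECT * FROM payments STOREAS `JSON`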

Deploy turn-key DataOps for AWS MSK

Running your own Kafka is starting to feel like wading through oatmeal, and we're not the only ones thinking that: the majority of organizations we speak to have moved, or are in the process of moving, their Kafka to a managed service. If you're already an AWS shop, Managed Streaming for Apache Kafka (MSK) is a no-brainer. It's the same Kafka we know and love, integrated with other AWS services such as IAM, CloudWatch, CloudTrail, KMS, VPC and more.

On the importance of load testing Kafka

Socrates preached, “To know thyself is the beginning of wisdom.” This ancient Greek maxim applies to your modern Apache Kafka project: developers, go forth and load test your real-time application to understand its capacity and limitations before deployment. Failure to do so can cost you time and money (e.g. Robinhood’s outage on a historic trading day). Load testing your real-time applications has three main objectives.
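
As a simple starting point, the performance-testing tools that ship with Apache Kafka let you push a known load at a topic and measure throughput and latency. The broker address and topic below are placeholders:

    # Produce 1,000,000 records of 1 KB each, as fast as the cluster will accept them
    bin/kafka-producer-perf-test.sh \
      --topic load-test \
      --num-records 1000000 \
      --record-size 1024 \
      --throughput -1 \
      --producer-props bootstrap.servers=localhost:9092 acks=all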

Get your GitOps for real-time apps on Apache Kafka & Kubernetes

Infrastructure as code has been an important DevOps practice for years. If you're running an Apache Kafka data infrastructure on Kubernetes, chances are you've already nailed defining your infrastructure this way, and you're likely using operators as part of your CI/CD toolchain to automate your deployments.
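
With an operator such as Strimzi, for example, even individual topics can live in Git as declarative manifests and be reconciled into the cluster by the operator. This is a sketch; the cluster and topic names are illustrative:

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaTopic
    metadata:
      name: orders
      labels:
        # Tells the Strimzi topic operator which Kafka cluster this topic belongs to
        strimzi.io/cluster: my-cluster
    spec:
      partitions: 6
      replicas: 3
      config:
        retention.ms: 604800000  # keep data for 7 days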

Archive data from Kafka to S3 with the new Kafka Connect connector

The new open-source Apache Kafka Connect sink connector for S3 gives you full control over how you sink data to S3, helping you save money on long-term storage costs in Kafka. The connector can flush data out in a number of different formats, including Avro, JSON, Parquet and binary, and can partition the data it writes to S3 based on Kafka partitions, metadata fields and value fields.
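
In KCQL terms, the format, partitioning and flush choices all end up in the same statement. The sketch below writes an "orders" topic to my-bucket/orders as Parquet, partitions the object paths by a value field and flushes every 5,000 records; the topic, bucket and field names are illustrative and the exact clause names should be verified against the connector documentation for your release:

    INSERT INTO my-bucket:orders
    SELECT * FROM orders
    PARTITIONBY customer_region
    STOREAS `PARQUET`
    WITH_FLUSH_COUNT = 5000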