Kafka

Architecting Apache Kafka for GDPR compliance

Once upon a time (2017), in an office far far away, you may have been cornered in a conversation with someone from Legal about GDPR. It could have gone something like this: “You there, Data Engineer.” “Yep, that’s me.” “What PII do we have residing in this Apache Kafka database?” You probably mumbled something about Kafka not being a database. “And who can read/write the data?”

Picking up the pieces of your monolith breakdown

A decade ago, all developers could talk about was breaking down the monolith and moving to event-driven architectures. This was especially true in the financial services industry, where teams wanted to become more nimble and accelerate application delivery. They used messaging systems to decouple their applications, and Apache Kafka in particular has gone from being a data integration technology to the leading messaging system for microservices.

Kafka to Splunk: Data mesh for security & IT

Splunk is a technology that made processing huge volumes of complex data accessible to security and IT teams. Despite its strengths for monitoring and investigation, Splunk is a bit of a one-way street: once data is in Splunk, it's not that easy to stream it elsewhere in great volume. Nor is Splunk necessarily the best technology for all IT and security use cases. Or the cheapest.

How to configure clients to connect to Apache Kafka Clusters securely - Part 3: PAM authentication

In the previous posts in this series, we have discussed Kerberos and LDAP authentication for Kafka. In this post, we will look into how to configure a Kafka cluster to use a PAM backend instead of an LDAP one. The examples shown here will highlight the authentication-related properties in bold font to differentiate them from other required security properties, as in the example below. TLS is assumed to be enabled for the Apache Kafka cluster, as it should be for every secure cluster.

AWS re:Invent: Apache Kafka takeaways

If you've ever been to AWS re:Invent in Vegas, you'll know it's a crazy ride. This year we missed out (at least we have cleaner consciences and healthier wallets). But the high quality of the content hasn't changed. We've been binging on sessions ‘til the bitter end (it officially ended Friday). So for our community, here is a summary of a few talks related to Apache Kafka.

Kafka Total Cost of Ownership: What are you missing?

“We’ve seen two years’ worth of digital transformation in two months,” said Microsoft’s Satya Nadella. Due to COVID-19, digital transformation roadmaps have been deleted, redrafted, doubled down on and accelerated by up to a decade. Traditional companies are moving by osmosis towards streaming technologies such as Apache Kafka to kick off new digital services. But how much should it cost to experience 2030 in 2021?

Kafka infrastructure, monitoring, data - Which is your priority?

At the heart of Kafka is real-time data. With data at the center of any Kafka environment, it should be the area that gets the most attention, but typically it gets the least. This happens because most organizations split their Kafka efforts into three areas: infrastructure, monitoring, and data operations.

Considerations when moving your Apache Kafka to the cloud

Are you running your organization's Apache Kafka on-premises? If you are and you’re still reading this article, it’s more than likely that Kafka is or will be a keystone of your data infrastructure. But it’s also likely your teams are tired of the cost and complexity required to scale it, meaning your honeymoon with Kafka is coming to an end. So what does the imminent migration mean?