You’ve been handed the not-so-easy task of scoping a managed Kafka service for your team. How do you start the shortlist? Post something on Reddit? Skim-read a gazillion review blogs? Crash Google Chrome opening a thousand tabs to compare feature lists? If you’re going to run a Kafka POC with two or three vendors, or you’re trying to find the best Kafka for your business, how can you narrow down your selection? Let’s get to it.
Once upon a time (2017), in an office far far away, you may have been cornered in a conversation with someone from Legal about GDPR. It could have gone something like this: “You there, Data Engineer.” “Yep, that’s me.” “What PII do we have residing in this Apache Kafka database?” You probably mumbled something about Kafka not being a database. “And who can read/write the data?”
A decade ago, all developers could talk about was breaking down the monolith and adopting event-driven architectures. This was especially true in the financial services industry, where teams wanted to become more nimble and accelerate their application delivery. They leveraged messaging systems to decouple their applications, and Apache Kafka in particular has transitioned from being a data integration technology to the leading messaging system for microservices.
2021 is the year data engineers shift from analytics support to governing data flows. As data becomes an organization's most valuable commodity, it is the organizations with the governance to confidently analyze and act on it that will develop the next big applications.
Splunk is a technology that made processing huge volumes of complex data accessible to security and IT teams. Despite its strengths for monitoring and investigation, Splunk is a bit of a one-way street. Once data is in Splunk, it's not that easy to stream it elsewhere in great volume. And that doesn’t make it the best technology for all IT and security use cases. Or the cheapest.
In the previous posts in this series, we have discussed Kerberos and LDAP authentication for Kafka. In this post, we will look into how to configure a Kafka cluster to use a PAM backend instead of an LDAP one. The examples shown here will highlight the authentication-related properties in bold font to differentiate them from other required security properties, as in the example below. TLS is assumed to be enabled for the Apache Kafka cluster, as it should be for every secure cluster.
If you've ever been to AWS re:Invent in Vegas, you'll know it's a crazy ride. This year we missed out (at least we have cleaner consciences and healthier wallets). But the high quality of content hasn't changed. We've been binging on sessions ‘til the bitter end (it officially ended Friday). So for our community, here is a summary of a few talks related to Apache Kafka.
“We’ve seen two years’ worth of digital transformation in two months,” said Microsoft’s Satya Nadella. Due to COVID-19, digital transformation roadmaps have been deleted, redrafted, doubled down on and accelerated by up to a decade. Traditional companies are moving by osmosis towards streaming technologies such as Apache Kafka to kick off new digital services. But how much should it cost to experience 2030 in 2021?