Open Data Lakehouse powered by Iceberg for all your Data Warehouse needs
Since we announced the general availability of Apache Iceberg in Cloudera Data Platform (CDP), we are excited to see customers testing their analytic workloads on Iceberg.
Since we announced the general availability of Apache Iceberg in Cloudera Data Platform (CDP), we are excited to see customers testing their analytic workloads on Iceberg.
The evolution of healthcare has come a long way since local physicians made house calls and homespun remedies were formulated using items from the kitchen spice rack. Today’s healthcare is driven as much by the promise of emerging technologies centered on data processing and advanced analytics as by developing new and specialized drugs.
The best description of untrusted data I’ve ever heard is, “We all attend the QBR – Sales, Marketing, Finance – and present quarterly results, except the Sales reports and numbers don’t match Marketing numbers and neither match Finance reports. We argue about where the numbers came from, then after 45 minutes of digging for common ground, we chuck our shovels and abandon the call in disgust.” How would you go about fixing that situation?
Cloudera SQL Stream Builder (SSB) gives the power of a unified stream processing engine to non-technical users so they can integrate, aggregate, query, and analyze both streaming and batch data sources in a single SQL interface. This allows business users to define events of interest for which they need to continuously monitor and respond quickly. There are many ways to distribute the results of SSB’s continuous queries to embed actionable insights into business processes.
Over the past handful of years, systems architecture has evolved from monolithic approaches to applications and platforms that leverage containers, schedulers, lambda functions, and more across heterogeneous infrastructures. Cloudera Data Platform (CDP) is no different: it’s a hybrid data platform that meets organizations’ needs to get to grips with complex data anywhere, turning it into actionable insight quickly and easily.
As the use of ChatGPT becomes more prevalent, I frequently encounter customers and data users citing ChatGPT’s responses in their discussions. I love the enthusiasm surrounding ChatGPT and the eagerness to learn about modern data architectures such as data lakehouses, data meshes, and data fabrics. ChatGPT is an excellent resource for gaining high-level insights and building awareness of any technology. However, caution is necessary when delving deeper into a particular technology.
In this post, I will demonstrate how to use the Cloudera Data Platform (CDP) and its streaming solutions to set up reliable data exchange in modern applications between high-scale microservices, and ensure that the internal state will stay consistent even under the highest load.
We are thrilled to announce that the new DataFlow Designer is now generally available to all CDP Public Cloud customers. Data leaders will be able to simplify and accelerate the development and deployment of data pipelines, saving time and money by enabling true self service.
We just announced the general availability of Cloudera DataFlow Designer, bringing self-service data flow development to all CDP Public Cloud customers. In our previous DataFlow Designer blog post, we introduced you to the new user interface and highlighted its key capabilities. In this blog post we will put these capabilities in context and dive deeper into how the built-in, end-to-end data flow life cycle enables self-service data pipeline development.
Recently, we announced enhanced multi-function analytics support in Cloudera Data Platform (CDP) with Apache Iceberg. Iceberg is a high-performance open table format for huge analytic data sets. It allows multiple data processing engines, such as Flink, NiFi, Spark, Hive, and Impala to access and analyze data in simple, familiar SQL tables.