April 2023

Running Ray in Cloudera Machine Learning to Power Compute-Hungry LLMs

Lost in the talk about OpenAI is the tremendous amount of compute needed to train and fine-tune large language models (LLMs) such as GPT and generative AI applications such as ChatGPT. Each iteration requires more compute, and the limits imposed by Moore’s Law quickly push that task from single compute instances to distributed compute. To accomplish this, OpenAI has used Ray to power the distributed compute platform that trains each release of the GPT models.

Building Cloud Native Data Apps on Premises

Data is core to decision making today and organizations often turn to the cloud to build modern data apps for faster access to valuable insights. With cloud operating models, decision making can be accelerated, leading to competitive advantages and increased revenue. Can you achieve similar outcomes with your on-premises data platform? You absolutely can.

Ingest your data with Cloudera Streaming & DataFlow

Cloudera Data in Motion is designed to enable businesses to respond to critical events in real time and to streamline their data capture, processing, and distribution while maintaining security and governance. It offers an open architecture for maximum flexibility and control over resources, addressing the challenges of data in motion.

Using Dead Letter Queues with SQL Stream Builder

Cloudera SQL Stream Builder gives non-technical users the power of a unified stream processing engine so they can integrate, aggregate, query, and analyze both streaming and batch data sources in a single SQL interface. This allows business users to define the events of interest that they need to monitor continuously and respond to quickly. A dead letter queue (DLQ) can be used if there are deserialization errors when events are consumed from a Kafka topic.
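The dead letter queue idea can be illustrated with a minimal Python sketch. It is not SQL Stream Builder's implementation or API: plain in-memory lists stand in for the Kafka topic and the DLQ, and the function name `route_events`, the record fields, and the sample payloads are all hypothetical, chosen only for illustration. Events that deserialize as JSON go to the main output; those that fail are set aside with the error instead of stopping the consumer.

```python
import json


def route_events(raw_events):
    """Split raw messages into parsed events and dead-lettered records.

    Hypothetical helper: lists stand in for a Kafka topic and a DLQ topic.
    """
    parsed, dead_letters = [], []
    for offset, raw in enumerate(raw_events):
        try:
            # Deserialization step: bytes -> text -> JSON object.
            parsed.append(json.loads(raw.decode("utf-8")))
        except (UnicodeDecodeError, json.JSONDecodeError) as err:
            # Keep the original bytes and the reason so the record
            # can be inspected or replayed later.
            dead_letters.append({"offset": offset, "raw": raw, "error": str(err)})
    return parsed, dead_letters


# Made-up sample messages: two valid, one truncated, one not valid UTF-8.
raw_events = [
    b'{"id": 1, "amount": 10.5}',
    b'{"id": 2, "amount": ',
    b"\xff\xfe",
    b'{"id": 3, "amount": 7}',
]

parsed, dead_letters = route_events(raw_events)
print("parsed ids:", [event["id"] for event in parsed])
print("dead-lettered offsets:", [record["offset"] for record in dead_letters])
```

Keeping the raw bytes alongside the error message is a design choice in this sketch: it lets a bad record be examined or reprocessed later without losing the original payload.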

Discovering Data Monetization Opportunities in Financial Services

Data has become an essential driver for new monetization initiatives in the financial services industry. The vast amount of data collected from customers, transactions, and market movements, among other sources, offers tremendous potential for financial institutions to extract valuable insights that can inform business decisions, improve customer service, and create new revenue streams.

Industry Impact | The Hybrid Data Platform for Insurance

In the age of connected everything, insurers face new challenges and opportunities as they strive to deliver personalized insurance coverage while minimizing costs and preventing fraud. With the Cloudera Data Platform, insurers can unlock the power of real-time data and analytics to make insurance more precise, more personalized, and more profitable. By building a 360-degree view of each customer, streamlining claims and services, and unlocking usage-based insurance with IoT sensor data, insurers can manage risks and create opportunities to transform for today and stay ahead tomorrow.

Data Management with Cloudera CDP | Gartner Show Floor Showdown

At Gartner's Data and Analytics Summit in Orlando, Florida, Director of Product Management David Dichmann presented the Cloudera Data Platform (CDP) for Data Management. Using flood data provided by Gartner together with additional data assets, we demonstrated how Cloudera's hybrid, open, portable, and secure data platform could assist data practitioners in developing an early warning detection service for potential coastal flooding in the state of Florida.