
Getting Started with Cloudera Data Platform Operational Database (COD)

Operational Database is a relational and non-relational database built on Apache HBase, designed to support OLTP applications that use big data. The operational database in Cloudera Data Platform has the following components: Atlas provides open metadata management and governance capabilities to build a catalog of all assets, and to classify and govern those assets. The SDX layer of CDP leverages the full spectrum of Atlas to automatically track and control all data assets.

Addressing the Three Scalability Challenges in Modern Data Platforms

In legacy analytical systems such as enterprise data warehouses, the scalability challenges of a system were primarily associated with computational scalability, i.e., the ability of a data platform to handle larger volumes of data in an agile and cost-efficient way.

Make Your Models Matter: What It Takes to Maximize Business Value from Your Machine Learning Initiatives

We are excited by the endless possibilities of machine learning (ML). We recognize that experimentation is an important component of any enterprise machine learning practice. But we also know that experimentation alone doesn’t yield business value. Organizations need to usher their ML models out of the lab (i.e., the proof-of-concept phase) and into deployment, otherwise known as being “in production”.

New Applied ML Prototypes Now Available in Cloudera Machine Learning

It’s no secret that data scientists have a difficult job. It feels like a lifetime ago that everyone was talking about data science as the sexiest job of the 21st century. Heck, it was so long ago that people were still meeting in person! Today, the sexy is starting to lose its shine. There’s recognition that it’s nearly impossible to find the unicorn data scientist who was the apple of every CEO’s eye in 2012.

NiFi as a Function in DataFlow Service

With the general availability of Cloudera DataFlow for the Public Cloud (CDF-PC), our customers can now self-serve deployments of Apache NiFi data flows on Kubernetes clusters in a cost-effective way, with auto-scaling, resource isolation, and monitoring with KPI-based alerting. You can find more information in this release announcement blog post and in this technical deep dive blog post. Any customer who wants to run NiFi flows efficiently at scale should now consider adopting CDF-PC.

The Rise of Unstructured Data

The word “data” is ubiquitous in narratives of the modern world. And data, the thing itself, is vital to the functioning of that world. This blog discusses quantifications, types, and implications of data. If you’ve ever wondered how much data there is in the world, what types there are and what that means for AI and businesses, then keep reading!

Defining Simplicity for Enterprise Software as "a 10 Year Old Can Demo it"

Arjun (my son) sat next to me at my desk. He was a bit nervous, but we had practiced three times before he was ‘on stage’ in front of hundreds of people and the Zoom meeting turned to him. My ten-year-old began to demonstrate how to deploy an Operational Database in AWS, showcasing how auto-scaling worked and how to set up replication. The sales team and my colleagues were quite impressed with him, and I am very proud of him.

Introducing Cloudera DataFlow for the Public Cloud

With the rise of streaming data (or, data-in-motion), companies must figure out how to deliver high-scale data ingestion, transformation, and management. In this session, you’ll see how Cloudera Data Platform’s (CDP) new DataFlow service provides real-time data movement capabilities to address hybrid cloud use cases.

Sentry to Ranger - A Concise Guide

Cloudera Data Platform (CDP) brings many improvements to customers by merging technologies from the two legacy platforms, Cloudera Enterprise Data Hub (CDH) and Hortonworks Data Platform (HDP). CDP includes new functionalities as well as superior alternatives to some previously existing functionalities in security and governance. One such major change for CDH users is the replacement of Sentry with Ranger for authorization and access control.