
Latest News

The challenges you'll face deploying machine learning models (and how to solve them)

In 2019, organizations invested $28.5 billion in machine learning application development (Statista). Yet only 35% of organizations report having analytical models fully deployed in production (IDC). Taken together, those two statistics make it clear that there is a breadth of challenges to overcome before your models are deployed and running.

One billion files in Ozone

Apache Hadoop Ozone is a distributed key-value store that can manage small and large files alike. Ozone was designed to address the scale limitations of HDFS with respect to small files. HDFS is designed to store large files: the recommended maximum is 300 million files per Namenode, and it doesn’t scale well beyond this limit.

Operational Database Availability

This blog post is part of a series on Cloudera’s Operational Database (OpDB) in CDP. Each post goes into more detail about new features and capabilities. Start from the beginning of the series with Operational Database in CDP. This blog post gives you an overview of the high availability configuration capabilities of Cloudera’s OpDB, which is cluster-based software that comes configured for High Availability (HA) out of the box.

Augment EMR Workloads with CDP

The first thing that comes to mind when talking about synergy is how 2+2=5. Being the writer that he is, Mark Twain described it a lot more eloquently as “the bonus that is achieved when things work together harmoniously”. There is a multitude of product and business examples to illustrate the point and I particularly like how car manufacturers can bring together relatively small engines to do big things.

Decision Making in Uncertain Times

Leaders know that making good, fast decisions is challenging under the best of circumstances. But the trickiest decisions are those we call “big bets”: unfamiliar, high-stakes decisions. In a crisis of uncertainty such as the COVID-19 pandemic, which arrived at overwhelming speed and enormous scale, organizations face a potentially paralyzing volume of these big-bet decisions.

Machine learning in production: Human error is inevitable. Here's how to prepare.

You did it. You have machine learning capabilities up and running in your organization. Success! What started as a few nascent experiments (and maybe a few failures) is now a set of carefully constructed models racing along in full production, with the ability to scale into the hundreds or thousands of production models in sight. Assembling your expert team of data scientists and custodians seems like a distant memory. Now you’re looking ahead to the future: growth, innovation, revenue!

Augmented Analytics - How Associative and AI Technologies Are Changing the Face of Analytics

It’s hard to believe that we are now over 30 years into data warehousing. In that time, we have seen major changes in the tools that help users report on and analyse data. In the last twenty years, we have seen the evolution from reporting to ad hoc analysis and advanced analytics. Today, BI/Analytics is a mature market, with self-service BI and visual analysis standard in most organisations and self-service data preparation also widely deployed.

For Business Agility, Focus on Data - Not on Data Management

Effectively managing data in an edge-to-cloud world is becoming increasingly complex. Enterprises need data management simplicity and agility to maximize the benefits they can get from their data. The enterprise that will succeed will shift resources away from mundane data management tasks to focus on using data to innovate and add business value.