BI

Your Next Decision Could Change Lives: Why We Need Data Skills and Analytics

The year was 1993. The place, a little town in Sweden. A serial killer was on the loose. He randomly shot at people standing at bus stops or sitting in their cars, killing one and wounding many others. The residents of Malmö lived in fear. Window blinds were shut, playgrounds were deserted. The police didn’t know where to start.

The challenges you'll face deploying machine learning models (and how to solve them)

In 2019, organizations invested $28.5 billion in machine learning application development (Statista). Yet only 35% of organizations report having analytical models fully deployed in production (IDC). When you connect those two statistics, it’s clear that there is a breadth of challenges that must be overcome to get your models deployed and running.

One billion files in Ozone

Apache Hadoop Ozone is a distributed key-value store that can manage small and large files alike. Ozone was designed to address the scale limitations of HDFS with respect to small files. HDFS is designed to store large files; the recommended limit is 300 million files per Namenode, and HDFS doesn’t scale well beyond it.

Pentaho 9.0 Teaser: Multicluster Enhancements

Many organizations want to run any workload from any location without the burden of rearchitecting or refactoring applications. Often, they’ll want to leverage their existing on-premises Hadoop investments and provide a seamless experience to data consumers when they migrate to the cloud to take advantage of the usability, scalability and elasticity of cloud-native solutions. Watch this video to learn more about the Pentaho 9.0 multicluster enhancements.

Operational Database Availability

This blog post is part of a series on Cloudera’s Operational Database (OpDB) in CDP. Each post goes into more detail about new features and capabilities. Start from the beginning of the series with Operational Database in CDP. This post gives you an overview of the high availability configuration capabilities of Cloudera’s OpDB, which is cluster-based software that comes configured for High Availability (HA) out of the box.

Augment EMR Workloads with CDP

The first thing that comes to mind when talking about synergy is how 2+2=5. Being the writer that he is, Mark Twain described it a lot more eloquently as “the bonus that is achieved when things work together harmoniously”. There is a multitude of product and business examples to illustrate the point and I particularly like how car manufacturers can bring together relatively small engines to do big things.

Decision Making in Uncertain Times

Leaders know that making good, fast decisions is challenging under the best of circumstances. But the trickiest decisions are those we call “big bets” – unfamiliar and high-stakes decisions. In a crisis of uncertainty such as the COVID-19 pandemic, which arrived at overwhelming speed and enormous scale, organizations face a potentially paralyzing volume of these big-bet decisions.

Augmented Analytics - How Associative and AI Technologies Are Changing the Face of Analytics

It’s hard to believe that we are now over 30 years into data warehousing. In that time, we have seen major changes in the tools that help users report on and analyse data. In the last twenty years, we have seen the evolution from reporting to ad hoc analysis and advanced analytics. Today, BI/Analytics is a mature market, with self-service BI and visual analysis standard in most organisations and self-service data preparation also widely deployed.