
Cloudera

HDFS Snapshot Best Practices

The snapshots feature of the Apache Hadoop Distributed Filesystem (HDFS) lets you capture point-in-time copies of the file system and protect important data against corruption and against user or application errors. This feature is available in all versions of Cloudera Data Platform (CDP), Cloudera Distribution for Hadoop (CDH), and Hortonworks Data Platform (HDP).

The Art of Data Leadership | A discussion with Chief Digital Officer, Ray Kunik

Our Chief Data & Analytics Officer, Shayde Christian, sits down for a buzzworthy conversation with Chief Digital Officer Raymond L. Kunik Jr. to discuss the “other” CDO role, the science behind work-life integration, the impact and applications of AI, and its correlation with a pretty sweet hobby.

Why Reinvent the Wheel? The Challenges of DIY Open Source Analytics Platforms

To reduce technology spend, organizations that use open source projects for advanced analytics often consider one of two options: building and maintaining their own runtime with the required data processing engines, or staying on older, now obsolete versions of legacy Cloudera runtimes (CDH or HDP).

Boosting Object Storage Performance with Ozone Manager

Ozone is an Apache Software Foundation project to build a distributed storage platform that caters to the demanding performance needs of analytical workloads, content distribution, and object storage use cases. The Ozone Manager is a critical component of Ozone. It is a replicated, highly available service responsible for managing the metadata for all objects stored in Ozone. As Ozone scales to exabytes of data, it is important to ensure that Ozone Manager can perform at scale.

Applied Machine Learning Prototypes | The Future of Machine Learning

Applied Machine Learning Prototypes, or AMPs, are pre-built applications that can be used as a starting point for your next machine learning project. These prototypes are designed to save time and resources by providing a tested and reliable solution to common machine learning problems. Cloudera + Dell + AMD.