
Latest News

Admission Control Architecture for Cloudera Data Platform

Apache Impala is a massively parallel, in-memory SQL engine supported by Cloudera and designed for analytics and ad hoc queries against data stored in Apache Hive, Apache HBase, and Apache Kudu tables. Because it supports powerful queries and high levels of concurrency, Impala can use significant amounts of cluster resources. In multi-tenant environments, this can inadvertently impact adjacent services such as YARN, HBase, and even HDFS.

Talend iPaaS momentum grows. Talend recognized in the 2021 Gartner Magic Quadrant for Enterprise iPaaS

As organizations continue to embrace cloud-based computing as the cornerstone of their digital transformation, the integration platform as a service (iPaaS) has become a critical component of their integration environments. An iPaaS solution simplifies the integration of data, applications, and systems, whether in the cloud or on-premises, through unified support for API, application, data, and B2B integration styles.

How Cloudera DataFlow Enables Successful Data Mesh Architectures

In this blog, I will demonstrate the value of Cloudera DataFlow (CDF), the edge-to-cloud streaming data platform available on the Cloudera Data Platform (CDP), as a data integration and democratization fabric. Within the context of a data mesh architecture, I will present industry settings and use cases where this architecture is relevant and highlight the value it delivers across business and technology areas.

The Great Data Revolution Is Here, and Qlik Customers Are at the Heart of It

Data – the amount we create, how we create it, how it is accessed (by both people and artificial intelligence/machines), and how we use it to inform, propel, and influence everyone and everything – is one of the biggest challenges and opportunities we face in our lifetime. And it’s driving enormous change.

Struggling to Manage your Multi-Tenant Environments? Use Chargeback!

If your organization is using multi-tenant big data clusters (and everyone should be), do you know the usage and cost efficiency of cluster resources by tenant? A chargeback or showback model allows IT to determine costs and resource usage by the actual analytic users in the multi-tenant cluster, instead of attributing those to the platform (“overhead”) or the IT department. This allows you to know the individual costs per tenant and set limits in order to control overall costs.
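As a rough illustration of the chargeback idea, the sketch below splits a cluster's total cost across tenants in proportion to their recorded usage. It is not taken from any Cloudera tool: the `chargeback` function, the tenant names, the CPU-hours usage metric, and the cost figure are all hypothetical, chosen only to show the arithmetic.

```python
from collections import defaultdict


def chargeback(usage_records, total_cost):
    """Split total_cost across tenants in proportion to usage.

    usage_records: iterable of (tenant, cpu_hours) pairs (hypothetical metric).
    total_cost: the overall cluster cost for the billing period.
    """
    per_tenant = defaultdict(float)
    for tenant, cpu_hours in usage_records:
        per_tenant[tenant] += cpu_hours  # aggregate usage per tenant

    total_usage = sum(per_tenant.values())
    if total_usage == 0:
        # Nothing was used, so nothing is charged back.
        return {tenant: 0.0 for tenant in per_tenant}

    # Each tenant pays its proportional share of the total cost.
    return {
        tenant: total_cost * usage / total_usage
        for tenant, usage in per_tenant.items()
    }


# Hypothetical usage data for one billing period.
records = [("marketing", 30.0), ("finance", 50.0), ("marketing", 20.0)]
costs = chargeback(records, total_cost=1000.0)
```

A real chargeback model would likely weight several resources (memory, storage, I/O) rather than a single metric; the proportional split here is only the simplest possible allocation rule.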

An Introduction to Ranger RMS

Cloudera Data Platform (CDP) has supported access controls on tables and columns, as well as on files and directories, via Apache Ranger since its first release. It is common to have different workloads using the same data: some require authorization at the table level (Apache Hive queries) and others at the level of the underlying files (Apache Spark jobs). Unfortunately, in such instances you would have to create and maintain separate, corresponding Ranger policies for both Hive and HDFS.

Our reflections on the 2021 Gartner Magic Quadrant for Data Quality Solutions

Success for any business starts with data that is easily discoverable, understandable, and of value to the people who need it. We call this type of data “healthy data.” You should look at a wide set of measures and metrics to determine whether data is healthy or not, but at the core of all healthy data is a high level of quality.

Four Pillars of an Agile Data Infrastructure

Forbes Insights defines the modernized data center as being built to change just as much as it is built to last. One of the key pillars for a modernized data center is an agile data infrastructure. The Forbes Insights briefing explains, “This means it’s not wedded to any specific deployment method or solution set.

Closing the Gap Between Data and Action - All in One Cloud

A favorite moment of mine is when I get to share Qlik’s vision for Active Intelligence with a customer for the first time. It usually goes like this: genuine excitement about the possibility – taking informed action in the moment from real-time data…invariably followed by many questions – where do I begin? What do I need? What about the tech stack I have already acquired?

Migrate to CDP Private Cloud Base - A Step by Step Guide

Our recent blog discussed the four paths to get from legacy platforms to CDP Private Cloud Base. In this blog and accompanying video, we take a deep dive into the mechanics of running an in-place upgrade from CDH5 or CDH6 to CDP Private Cloud Base. The overall upgrade follows a seven-step process illustrated below. In the video below we walk through a complete end-to-end upgrade of CDH to CDP Private Cloud Base.