April 2021

Managing Python dependencies for Spark workloads in Cloudera Data Engineering

Apr 30, 2021 By Vijay Karthikeyan In Cloudera

Apache Spark is now widely used in many enterprises for building high-performance ETL and Machine Learning pipelines. For users already familiar with Python, PySpark provides a Python API for Apache Spark. When users work with PySpark, they often use existing and/or custom Python packages in their programs to extend and complement Apache Spark’s functionality. Apache Spark provides several options to manage these dependencies.

Read Post

Cloudera


Future of Data Meetup: Exploring Data and Creating Interactive Dashboards in the Cloud

Apr 29, 2021 By Cloudera In Cloudera

In this meetup, we’re going to once again put ourselves in the shoes of an electric car manufacturer that is deploying a recently developed electric motor in its new cars. We’re going to show how to explore data that was previously collected from various sources and stored in Apache Hive within a data warehouse, with the goal of tracking down a specific set of potentially defective parts. We’ll then take the results of this data exploration and create an interactive dashboard that presents them in a visually appealing way, using a BI tool that’s integrated right into the same data warehouse.

View Video

Cloudera


Fast Forward Live: Few-Shot Text Classification

Apr 29, 2021 By Cloudera In Cloudera

Join us for this month's Machine Learning research discussion with Cloudera Fast Forward Labs. We will discuss few-shot text classification, including a live demo and Q&A. This is an applied research report by Cloudera Fast Forward. We write reports about emerging technologies. Accompanying each report are working prototypes or code that exhibit the capabilities of the algorithm, along with detailed technical advice on its practical application.

View Video

Cloudera

Analytics
BI


The New Releases of Apache NiFi in Public Cloud and Private Cloud

Apr 29, 2021 By Pierre Villard In Cloudera

Cloudera released a lot of things around Apache NiFi recently! We just released Cloudera Flow Management (CFM) 2.1.1 that provides Apache NiFi on top of Cloudera Data Platform (CDP) 7.1.6. This major release provides the latest and greatest of Apache NiFi as it includes Apache NiFi 1.13.2 and additional improvements, bug fixes, components, etc. Cloudera also released CDP 7.2.9 on all three major cloud platforms, and it also brings Flow Management on DataHub with Apache NiFi 1.13.2 and more.

Read Post

Cloudera


Cable Companies Are Growing Up

Apr 27, 2021 By Anthony Behan In Cloudera

Cable and Satellite companies in the US have emerged from a decade of acquisitions, consolidation and shakeout and are beginning to assert themselves as full service providers in the communications and media space. With Comcast just announcing its new suite of cellphone plans this month, and Charter, Altice and Dish ramping up their offerings, the Big Three in wireless – AT&T, Verizon and T-Mobile/Sprint – are looking over their shoulders.

Read Post

Cloudera


Converting HBase ACLs to Ranger policies

Apr 26, 2021 By Norbert Kalmar In Cloudera

CDP uses Apache Ranger for data security management. If you wish to use Ranger for centralized security administration, HBase ACLs need to be migrated to policies. This can be done via the Ranger web UI, accessible from Cloudera Manager. But first, let’s take a quick look at HBase’s method for access control.

Read Post

Cloudera


Cloudera Data Platform (CDP) Private Cloud on Red Hat OpenShift

Apr 23, 2021 By Cloudera In Cloudera

Learn how Cloudera and Red Hat help enterprise companies securely manage the complete data lifecycle, putting data to work faster and reducing time to value. Cloudera Data Platform (CDP) Private Cloud on Red Hat® OpenShift® aggregates and visualizes data to derive actionable insights in a secure, hybrid, and open-source environment.

View Video

Cloudera

Analytics
BI


HDFS Data Encryption at Rest on Cloudera Data Platform

Apr 23, 2021 By Arun Kumar Natva In Cloudera

Encryption of data at rest is a highly desirable, and sometimes mandatory, requirement for data platforms in a range of industry verticals, including healthcare, financial, and government organizations. The capability increases security and protects sensitive data from various kinds of attack, whether internal or external to the platform.

Read Post

Cloudera


Apache Ozone and Dense Data Nodes

Apr 22, 2021 By Karthik Krishnamoorthy In Cloudera

Today’s enterprise data analytics teams are constantly looking to get the best out of their platforms. Storage plays one of the most important roles in the data platform strategy: it provides the basis on which all compute engines and applications are built. Businesses are also looking to move to a scale-out storage model that provides dense storage along with reliability, scalability, and performance.

Read Post

Cloudera


Future of Data Meetup: Nice to Meet You, NiFi!

Apr 21, 2021 By Cloudera In Cloudera

You asked for it, and we are delivering the third in our “Hello:” series of introductory “Big Data” topics. Our next meetup covers using Apache NiFi. Lots of people want to be a data scientist... but what good is machine learning, artificial intelligence or advanced analytics if you don’t have data? Getting data is incredibly important, but getting data in real time or near real time helps you deliver near-real-time insight.

View Video

Cloudera

Analytics
BI


Drinking our own champagne - Cloudera upgrades to CDP Private Cloud

Apr 21, 2021 By Alan Jackoway In Cloudera

Like most of our customers, Cloudera’s internal operations rely heavily on data. For more than a decade, Cloudera has built internal tools and data analysis primarily on a single production CDH cluster. This cluster runs workloads for every department – from real-time user interfaces for Support to providing recommendations in the Cloudera Data Platform (CDP) Upgrade Advisor to analyzing our business and closing our books.

Read Post

Cloudera


What is Streaming Analytics?

Apr 20, 2021 By Laura Chu In Cloudera

What is Streaming Analytics? Streaming Analytics is a type of data analysis that processes data streams for real-time analytics. It continuously processes data from multiple streams and performs anything from simple calculations to complex event processing to deliver sophisticated use cases. The primary purpose is to present the most up-to-date operational events so that the user can stay on top of business needs and take action as changes happen in real time.

Read Post

Cloudera


Deep Learning with Nvidia GPUs in Cloudera Machine Learning

Apr 19, 2021 By Brian Law In Cloudera

In our previous blog post in this series, we explored the benefits of using GPUs for data science workflows, and demonstrated how to set up sessions in Cloudera Machine Learning (CML) to access NVIDIA GPUs for accelerating Machine Learning Projects.

Read Post

Cloudera


What's new in CDP Private Cloud Base 7.1.6?

Apr 15, 2021 By Karthik Krishnamoorthy In Cloudera

According to IDG, when customers consider updating to the latest release of a product, they expect new features, enhanced security, and better performance, but increasingly want a more streamlined upgrade process. With each new release of CDP Private Cloud, this is exactly what we strive to deliver. Along with a host of new features and capabilities, we are improving the upgrade process to be as painless as possible.

Read Post

Cloudera


Cloudera Data Engineering - Integration steps to leverage spark on Kubernetes

Apr 14, 2021 By Harsh Shah In Cloudera

Cloudera Data Engineering is a serverless service for Cloudera Data Platform (CDP) that allows you to submit jobs to auto-scaling virtual clusters. CDE enables you to spend more time on your applications, and less time on infrastructure. CDE allows you to create, manage, and schedule Apache Spark jobs without the overhead of creating and maintaining Spark clusters.

Read Post

Cloudera


No Data Loss and No Service Interruption - HDF to CFM Rolling Migration

Apr 14, 2021 By Andrew Lim In Cloudera

The blog “Migrating Apache NiFi Flows from HDF to CFM with Zero Downtime” detailed how many common NiFi dataflows can be easily migrated when the Hortonworks DataFlow and Cloudera Flow Management clusters are running side-by-side. But what if you lack the resources to run multiple NiFi clusters concurrently? Not a problem.

Read Post

Cloudera


5 Success Stories That Show the Value of Enterprise Data Cloud

Apr 13, 2021 By Hana Jeddy In Cloudera

What’s the fastest and easiest path towards powerful cloud-native analytics that are secure and cost-efficient? In our humble opinion, we believe that’s Cloudera Data Platform (CDP). And sure, we’re a little biased—but only because we’ve seen firsthand how CDP helps our customers realize the full benefits of public cloud.

Read Post

Cloudera


10 Steps to Achieve Enterprise Machine Learning Success

Apr 13, 2021 By Santiago Giraldo In Cloudera

You’ve probably heard it more than once: Machine learning (ML) can take your digital transformation to another level. It’s a pie-in-the-sky statement that sounds great, right? And while you’d be forgiven for thinking that it might sound too good to be true, operational ML is, in fact, achievable and sustainable. You can get the very kind of ML you need to increase revenue and lower costs. To help teams work smarter and do things faster.

Read Post

Cloudera


The Key to Unlocking IT Modernization's Power? Enterprise level Transformation

Apr 12, 2021 By Rob Carey In Cloudera

The United States Veterans Administration (VA) over the last decade underwent a massive enterprise-wide IT transformation, eliminating its fragmented shadow IT and adopting a centralized system capable of supporting the agency’s 400,000 employees and more effectively utilizing its $240 billion-plus annual budget. The result: a more reliable and modern IT environment that improves access, availability, and user experience, ultimately supporting the VA mission more effectively.

Read Post

Cloudera


Enabling NVIDIA GPUs to accelerate model development in Cloudera Machine Learning

Apr 10, 2021 By Peter Ableda In Cloudera

When working on complex or rigorous enterprise machine learning projects, Data Scientists and Machine Learning Engineers experience varying degrees of processing lag when training models at scale. While model training on small data typically takes minutes, doing the same on large volumes of data can take hours or even weeks. To overcome this, practitioners often turn to NVIDIA GPUs to accelerate machine learning and deep learning workloads.

Read Post

Cloudera


Next Stop - Predicting on Data with Cloudera Machine Learning

Apr 9, 2021 By Robert Hryniewicz In Cloudera

This blog series follows the manufacturing and operations data lifecycle stages of an electric car manufacturer – typically experienced in large, data-driven manufacturing companies. The first blog introduced a mock vehicle manufacturing company, The Electric Car Company (ECC) and focused on Data Collection. The second blog dealt with creating and managing Data Enrichment pipelines. The third video in the series highlighted Reporting and Data Visualization.

Read Post

Cloudera


Seven Common Challenges Fueling Data Warehouse Modernisation

Apr 9, 2021 By Daniel Hand In Cloudera

Enterprise data warehouse platform owners face a number of common challenges. In this article, we look at seven challenges, explore the impacts to platform and business owners and highlight how a modern data warehouse can address them.

Read Post

Cloudera


Building Automated ML Pipelines in Cloudera Machine Learning

Apr 8, 2021 By Cloudera In Cloudera

In this video, we'll walk through an example of how you can use Cloudera Machine Learning to run Python code that creates specific Machine Learning models. We’ll then go through some features within Cloudera Machine Learning, such as job scheduling and model deployments, to see how you can do some more advanced machine development operations!

View Video

Cloudera


What's new in CDP Public Cloud?

Apr 8, 2021 By Cloudera In Cloudera

Join the CDP Public Cloud team for a live chat about what's new in CDP Public Cloud - we'll chat about some of our favorite new features, including our recent Google Cloud launch.

View Video

Cloudera


Enabling kubectl for CDE

Apr 7, 2021 By Cloudera In Cloudera

The kubectl tool provides direct administrative access to the Kubernetes cluster underlying a CDE service, which is useful for troubleshooting, among other things. This video will demonstrate how to set up kubectl access. To enable kubectl, we will need a couple of prerequisites. We will need the kubeconfig file from the CDE service. We will also need to get and authorize the IAM user, and then make sure that everything is set up correctly, both for kubectl and for other tools like k9s.

View Video

Cloudera


Cloudera Machine Learning Overview

Apr 7, 2021 By Cloudera In Cloudera

A complete overview of Cloudera Machine Learning (CML) on Cloudera Data Platform. This video covers all CML features for data science workflows.

View Video

Cloudera


The Journey to Understanding your Insurance Customers

Apr 7, 2021 By Monique Hesseling In Cloudera

Insurance carriers have a unique opportunity: They have access to powerful technologies and a wealth of information that can help them to better understand their customers and provide an enhanced customer experience.

Read Post

Cloudera


Cloudera Honored With 5-Star Rating in the 2021 CRN Partner Program Guide

Apr 5, 2021 By Debbie D'Souza In Cloudera

Cloudera is being acknowledged by CRN®, a brand of The Channel Company, in its 2021 Partner Program Guide. This annual guide provides a conclusive list of the most distinguished partner programs from leading technology companies that provide products and services through the IT Channel. The 5-Star rating is awarded to an exclusive group of companies that offer solution providers the best of the best, going above and beyond in their partner programs.

Read Post

Cloudera


Hybrid Cloud and Strategic Data Use Accelerate State, Army Missions

Apr 2, 2021 By Nasheb Ismaily In Cloudera

Some of the most forward-operational elements of the United States federal government are making strides in leveraging data through hybrid cloud environments—and they’re constantly evaluating progress and recalibrating their approaches along the way. At agencies including the Army and the State Department, work is well underway to find ways of employing emerging technologies that build on cloud services and data optimization to realize new levels of effectiveness.

Read Post

Cloudera


Fast Forward Live: Representation Learning & Image Analysis

Apr 1, 2021 By Cloudera In Cloudera

Good representations of data (e.g., text, images) are critical for solving many tasks (e.g., search or recommendations). But what exactly are representations, how can they be built and why are deep learning models useful? In this livestream, we will discuss these questions from a software engineering perspective and walk through a live example!

View Video

Cloudera

Analytics
BI

