October 2021

Live with Cloudera: Running NiFi flows in a Hybrid Data Cloud

Oct 29, 2021 By Cloudera In Cloudera

When should you run Apache NiFi flows on-prem, and when is it better to run them in the public cloud? Join this live session to hear arguments for both. You will also have the opportunity to take part in a live Q&A and see a demo of how to run your NiFi flows in a Hybrid Data Cloud.

Analytics
BI

High Availability (Multi-AZ) for CDP Operational Database

Oct 29, 2021 By Andor Molnar In Cloudera

CDP Operational Database (COD) is an autonomous transactional database powered by Apache HBase and Apache Phoenix. It is one of the main Data Services that runs on Cloudera Data Platform (CDP) Public Cloud. You can access COD right from your CDP console. With COD, application developers can now leverage the power of HBase and Phoenix without the overhead often associated with deployment and management.

Live with Cloudera: Flink Forward in Review

Oct 28, 2021 By Cloudera In Cloudera

What happened at Flink Forward Global 2021 this week? Tune in for a live discussion of the important moments with Kenny Gorman, Márton Balassi, and Erik Beebe. You might also catch a live demo. Don't miss it!

Analytics
BI

Commercial Lines Insurance- the End of the Line for All Data

Oct 28, 2021 By Monique Hesseling In Cloudera

I’ve had the pleasure of participating in a few Commercial Lines insurance industry events recently, and as a former Commercial Lines insurer myself, I am thrilled with the progress the industry is making using data and analytics. However, I do not think Commercial Lines insurance gets the credit it deserves for the industry-leading role it has played in analytics. Commercial Lines truly is an “uber industry” with respect to data.

The Ultimate Map to finding Halloween candy surplus

Oct 26, 2021 By Jacob Bengtson In Cloudera

As Halloween night quickly approaches, there is only one question on every kid’s mind: how can I maximize my candy haul this year with the best possible candy? This kind of question lends itself perfectly to data science approaches that enable quick and intuitive analysis of data across multiple sources.

Cloudera Machine Learning Workspace Provisioning Pre-Flight Checks

Oct 25, 2021 By Peter Ableda In Cloudera

There are many good uses of data. With data, we can monitor our business, either as a whole or by specific business unit. We can segment customers by vertical or by whether they run in the public or private cloud. We can understand customers better and see their usage patterns and main consumption drivers. We can find customer pain points, see where they get stuck, and understand how different bugs affect them.

New Features in Cloudera Streams Messaging Public Cloud 7.2.12

Oct 25, 2021 By Joseph Niemiec In Cloudera

With the launch of Cloudera Public Cloud 7.2.12, the Streams Messaging for Data Hub deployments have gained some interesting new features! Starting with this release, Streams Messaging templates support scaling with automatic rebalancing, allowing you to grow or shrink your Apache Kafka cluster based on demand.

How to Automate Apache NiFi Data Flow Deployments in the Public Cloud

Oct 22, 2021 By Michael Kohs In Cloudera

With the latest release of Cloudera DataFlow for the Public Cloud (CDF-PC), we added new CLI capabilities that allow you to automate data flow deployments, making it easier than ever to incorporate Apache NiFi flow deployments into your CI/CD pipelines. This blog post walks you through the data flow development lifecycle and shows how you can use APIs in CDP Public Cloud to fully automate your flow deployments.

How to Gain Greater Confidence in your Climate Risk Models

Oct 20, 2021 By Joe Rodriguez In Cloudera

We are just over one week away from the UN Climate Change Conference of the Parties (COP26), which convenes in Glasgow. As governments gather to push forward climate and renewable energy initiatives aligned with the Paris Agreement and the UN Framework Convention on Climate Change, financial institutions and asset managers will monitor the event with keen interest.

Developing a Basic Web Application using an Operational DB on CDP

Oct 19, 2021 By Cloudera In Cloudera

In this video, you'll see a simple demo of how you can build a web application on top of a Cloudera Operational Database. We'll leverage the Apache Phoenix integration to easily write SQL statements against our database and use the Python Flask library to power the back-end API calls. The web application will be hosted within Cloudera Machine Learning, showcasing some of the benefits of having your data within a hybrid data platform.

Analytics
BI

Introducing Self-Service, No-Code Airflow Authoring UI in Cloudera Data Engineering

Oct 19, 2021 By Shaun Ahmadian In Cloudera

Airflow has been adopted by many Cloudera Data Platform (CDP) customers in the public cloud as the next-generation orchestration service to set up and operationalize complex data pipelines. Today, customers have deployed hundreds of Airflow DAGs in production, performing various data transformation and preparation tasks with differing levels of complexity.

Apache Ozone - A High Performance Object Store for CDP Private Cloud

Oct 15, 2021 By Rakesh Radhakrishan In Cloudera

As organizations wrangle with the explosive growth in data volume they face today, the efficiency and scalability of storage become pivotal to operating a successful data platform that drives business insight and value. Apache Ozone is a distributed, scalable, and high-performance object store, available with Cloudera Data Platform Private Cloud.

Harness the Power of AND

Oct 14, 2021 By Cloudera In Cloudera

AND has the power to move you. And the hybrid data cloud will take you there. Any cloud, with any analytics, and any data across your entire business.

Analytics
BI

Announcing CDP Public Cloud Regional Control Plane in Australia and Europe

Oct 14, 2021 By David Moxey In Cloudera

We’re excited to announce CDP Public Cloud Regional Control Plane in Australia and Europe. This addition will extend CDP Hybrid capabilities to customers in industries with strict data protection requirements by allowing them to govern their data entirely in-region.

Your Parents Still Don't Know What a Hashtag Is. Let's Teach Them the Basics of Machine Learning and Streaming Data

Oct 13, 2021 By Cloudera Contributors In Cloudera

Quite often, the digital natives of the family — you — have to explain to the analog fans of the family what PDFs are, how to use a hashtag, a phone camera, or a remote. Imagine if you had to explain what machine learning is and how to use it. There’s no need to panic. Cloudera produced a series of ebooks — Production Machine Learning For Dummies, Apache NiFi For Dummies, and Apache Flink For Dummies (coming soon) — to help simplify even the most complex tech topics.

How to Turn your Data Center into a True Private Cloud

Oct 13, 2021 By Wim Stoop In Cloudera

According to Domo, on average, every human created at least 1.7 MB of data per second in 2020. That’s a lot of data. For enterprises, the net result is an intricate data management challenge that will not get any less complex anytime soon. Enterprises need to find a way of getting insights from this vast treasure trove of data into the hands of the people who need them. For relatively low amounts of data, public cloud is a possible path for some organizations.

What is new in Cloudera Streaming Analytics 1.5?

Oct 12, 2021 By Marton Balassi In Cloudera

At the end of May, we released the second version of Cloudera SQL Stream Builder (SSB) as part of Cloudera Streaming Analytics (CSA). Among other features, the 1.4 version of CSA surfaced the expressivity of Flink SQL in SQL Stream Builder by adding DDL and Catalog support, and it greatly improved the integration with other Cloudera Data Platform components, for example by enabling stream enrichment from Hive and Kudu.

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak Nabu

Oct 11, 2021 By Shaun Ahmadian In Cloudera

Modak, a leading provider of modern data engineering solutions, is now a certified solution partner with Cloudera. Customers can seamlessly automate migration from on-prem deployments to CDP, Cloudera’s cloud-based enterprise platform, and dynamically auto-scale cloud services through the integration of Cloudera Data Engineering (CDE) with Modak Nabu™.

Admission Control Architecture for Cloudera Data Platform

Oct 8, 2021 By Niel Dunnage In Cloudera

Apache Impala is a massively parallel in-memory SQL engine, supported by Cloudera, designed for analytics and ad hoc queries against data stored in Apache Hive, Apache HBase, and Apache Kudu tables. Because it supports powerful queries and high levels of concurrency, Impala can use significant amounts of cluster resources. In multi-tenant environments, this can inadvertently impact adjacent services such as YARN, HBase, and even HDFS.

Processing DICOM Files With Spark on CDP Hybrid Cloud

Oct 7, 2021 By Cloudera In Cloudera

In this video, you will see how you can use PySpark to process medical images from an MRI and convert them from DICOM format to PNG. The data is read from and written to AWS S3, and we leverage the numpy and pydicom libraries to do the data transformation. We are using data from the "RSNA-MICCAI Brain Tumor Radiogenomic Classification" Kaggle competition, but this approach can be used for general-purpose DICOM processing.

How Cloudera DataFlow Enables Successful Data Mesh Architectures

Oct 7, 2021 By Andreas Skouloudis In Cloudera

In this blog, I will demonstrate the value of Cloudera DataFlow (CDF), the edge-to-cloud streaming data platform available on the Cloudera Data Platform (CDP), as a data integration and democratization fabric. Within the context of a data mesh architecture, I will present industry settings and use cases where this architecture is relevant and highlight the value it delivers across business and technology areas.

Struggling to Manage your Multi-Tenant Environments? Use Chargeback!

Oct 5, 2021 By Shirish Deshmukh In Cloudera

If your organization is using multi-tenant big data clusters (and everyone should be), do you know the usage and cost efficiency of resources in the cluster by tenant? A chargeback or showback model allows IT to determine costs and resource usage by the actual analytic users in the multi-tenant cluster, instead of attributing those to the platform (“overhead”) or the IT department. This allows you to know the individual costs per tenant and set limits in order to control overall costs.

An Introduction to Ranger RMS

Oct 5, 2021 By Kiran Anand In Cloudera

Cloudera Data Platform (CDP) has supported access controls on tables and columns, as well as on files and directories, via Apache Ranger since its first release. It is common to have different workloads using the same data: some require authorization at the table level (Apache Hive queries) and others at the level of the underlying files (Apache Spark jobs). Unfortunately, in such instances you would have to create and maintain separate, corresponding Ranger policies for both Hive and HDFS.
