Cloudera

Developing a Basic Web Application using an Operational DB on CDP

In this video, you'll see a simple demo of how to build a web application on top of a Cloudera Operational Database. We'll use the Apache Phoenix integration to write SQL statements against the database and the Python Flask library to power the back-end API calls. The web application will be hosted within Cloudera Machine Learning, showcasing some of the benefits of having your data within a hybrid data platform.
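As a rough illustration of the pattern described above, here is a minimal sketch of a Flask back end that talks to Phoenix through the `phoenixdb` Python driver. It is not the code from the video: the `ITEMS` table, its `ID` and `NAME` columns, the `/items` routes, and the `build_upsert` and `create_app` helpers are all hypothetical, and the Phoenix Query Server URL has to come from your own environment.

```python
def build_upsert(table, columns):
    """Return a parameterized Phoenix UPSERT statement.

    Phoenix writes rows with UPSERT rather than INSERT, and the
    phoenixdb driver uses '?' placeholders for parameters.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    return f"UPSERT INTO {table} ({column_list}) VALUES ({placeholders})"


def create_app(phoenix_url):
    """Build a Flask app backed by a Phoenix Query Server.

    The third-party imports live inside the function so that the
    helper above can be used without Flask or phoenixdb installed.
    """
    import phoenixdb
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    def connect():
        # autocommit=True so each UPSERT is written immediately.
        return phoenixdb.connect(phoenix_url, autocommit=True)

    @app.route("/items", methods=["GET"])
    def list_items():
        conn = connect()
        try:
            cursor = conn.cursor()
            # ITEMS, ID, and NAME are hypothetical names.
            cursor.execute("SELECT ID, NAME FROM ITEMS")
            rows = [{"id": r[0], "name": r[1]} for r in cursor.fetchall()]
        finally:
            conn.close()
        return jsonify(rows)

    @app.route("/items", methods=["POST"])
    def add_item():
        payload = request.get_json()
        conn = connect()
        try:
            sql = build_upsert("ITEMS", ["ID", "NAME"])
            conn.cursor().execute(sql, (payload["id"], payload["name"]))
        finally:
            conn.close()
        return jsonify({"status": "ok"}), 201

    return app
```

Calling `create_app(...)` with the URL of your Phoenix Query Server and then `app.run()` would start the API. Keeping the SQL parameterized, rather than formatting values into the string, leaves quoting to the driver.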

Introducing Self-Service, No-Code Airflow Authoring UI in Cloudera Data Engineering

Airflow has been adopted by many Cloudera Data Platform (CDP) customers in the public cloud as the next-generation orchestration service to set up and operationalize complex data pipelines. Today, customers have deployed hundreds of Airflow DAGs in production, performing data transformation and preparation tasks of varying complexity.

Apache Ozone - A High Performance Object Store for CDP Private Cloud

As organizations wrestle with today's explosive growth in data volume, the efficiency and scalability of storage become pivotal to operating a successful data platform that drives business insight and value. Apache Ozone is a distributed, scalable, high-performance object store, available with Cloudera Data Platform Private Cloud.

Announcing CDP Public Cloud Regional Control Plane in Australia and Europe

We’re excited to announce CDP Public Cloud Regional Control Plane in Australia and Europe. This addition will extend CDP Hybrid capabilities to customers in industries with strict data protection requirements by allowing them to govern their data entirely in-region.

Your Parents Still Don't Know What a Hashtag Is. Let's Teach Them the Basics of Machine Learning and Streaming Data

Quite often, the digital natives of the family — you — have to explain to the analog fans of the family what PDFs are, how to use a hashtag, a phone camera, or a remote. Imagine if you had to explain what machine learning is and how to use it. There’s no need to panic. Cloudera produced a series of ebooks — Production Machine Learning For Dummies, Apache NiFi For Dummies, and Apache Flink For Dummies (coming soon) — to help simplify even the most complex tech topics.

How to Turn your Data Center into a True Private Cloud

According to Domo, on average, every human created at least 1.7 MB of data per second in 2020. That’s a lot of data. For enterprises, the net result is an intricate data management challenge that will not get simpler anytime soon. Enterprises need a way to get insights from this vast trove of data into the hands of the people who need them. For relatively small amounts of data, public cloud is a possible path for some organizations.

What is new in Cloudera Streaming Analytics 1.5?

At the end of May, we released the second version of Cloudera SQL Stream Builder (SSB) as part of Cloudera Streaming Analytics (CSA). Among other features, the 1.4 version of CSA surfaced the expressivity of Flink SQL in SQL Stream Builder by adding DDL and Catalog support, and it greatly improved integration with other Cloudera Data Platform components, for example by enabling stream enrichment from Hive and Kudu.

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak Nabu

Modak, a leading provider of modern data engineering solutions, is now a certified solution partner with Cloudera. Customers can seamlessly automate migration from on-prem deployments to CDP, Cloudera’s cloud-based enterprise platform, and dynamically auto-scale cloud services through the integration of Cloudera Data Engineering (CDE) with Modak Nabu™.

Admission Control Architecture for Cloudera Data Platform

Apache Impala is a massively parallel, in-memory SQL engine supported by Cloudera and designed for analytics and ad hoc queries against data stored in Apache Hive, Apache HBase, and Apache Kudu tables. Because it supports powerful queries and high levels of concurrency, Impala can use significant amounts of cluster resources. In multi-tenant environments, this can inadvertently impact adjacent services such as YARN, HBase, and even HDFS.