Cloudera

Getting Started with Cloudera Stream Processing Community Edition

Cloudera has a strong track record of providing a comprehensive solution for stream processing. Cloudera Stream Processing (CSP), powered by Apache Flink and Apache Kafka, provides a complete stream management and stateful processing solution. In CSP, Kafka serves as the storage streaming substrate, and Flink as the core in-stream processing engine that supports SQL and REST interfaces.

The future of data architecture is hybrid: choosing your hybrid-first data strategy starts at Cloudera Now 2022

With all of the buzz around cloud computing, many companies have overlooked the importance of hybrid data. Many large enterprises went all-in on cloud without considering the costs and potential risks associated with a cloud-only approach. The truth is, the future of data architecture is all about hybrid.

An Introduction to Disaster Recovery with the Cloudera Data Platform

The previous decade has seen explosive growth in the integration of data and data-driven insight into a company’s ability to operate effectively, yielding an ever-growing competitive advantage to those that do it well. Our customers have become accustomed to the speed of decision making that comes from that insight. Data is integral to both long-term strategy and day-to-day, or even minute-to-minute, operation.

How to Use Apache Iceberg in CDP's Open Lakehouse

In June 2022, Cloudera announced the general availability of Apache Iceberg in the Cloudera Data Platform (CDP). Iceberg is a 100% open table format, developed through the Apache Software Foundation, which helps users avoid vendor lock-in and implement an open lakehouse. The general availability covers Iceberg running within some of the key data services in CDP, including Cloudera Data Warehouse (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML).

Introducing Applied Machine Learning Prototypes

Applied Machine Learning Prototypes (AMPs) are open source projects that will fundamentally change the way data scientists build, deploy, and monitor ML models. These fully developed prototypes are built around common industry use cases, such as Churn Prediction Monitoring and Anomaly Detection, and can be customized to give you a significant head start. Available in Cloudera Machine Learning, AMPs are tested, trusted, and research-backed by Fast Forward Labs.

Monitoring in Edge Flow Manager | Observability with Grafana

This video explains how Edge Flow Manager (EFM) integrates with Prometheus and Grafana. After Prometheus is installed and configured to scrape metrics, EFM must also be configured to expose them. Once the time series are in place, Grafana is installed and configured to visualize the exposed metrics. Several EFM-specific Grafana dashboards are publicly available and can be downloaded and imported into Grafana. When everything is configured correctly, agent-specific dashboards can be accessed from the EFM UI.

Applying Fine Grained Security to Apache Spark

Apache Spark, with its rich data APIs, has been the processing engine of choice in a wide range of applications, from data engineering to machine learning, but its security integration has been a pain point. Many enterprise customers need finer granularity of control, in particular at the column and row level (commonly known as Fine-Grained Access Control, or FGAC).

Fine-Tune Fair to Capacity Scheduler in Weight Mode

Cloudera Data Platform (CDP) unifies the technologies from Cloudera Enterprise Data Hub (CDH) and Hortonworks Data Platform (HDP). As part of that unification process, Cloudera merged the YARN Scheduler functionality from the legacy platforms, creating a Capacity Scheduler that better serves all customers. In merging this scheduler functionality, Cloudera significantly reduced the time and effort needed to migrate from CDH and HDP.

Industry Impact | Data-Driven Digital Transformation

Data is more than ones and zeroes. If you can put it to work, data has the power to transform your entire company, even your entire industry. With more than 2000 customers in over 85 countries, Cloudera is helping companies across industries generate more revenue, build new products and understand their customers at scale and speed.