
Latest Posts

Streaming Market Data with Flink SQL Part I: Streaming VWAP

Speed matters in financial markets. Whether the goal is to maximize alpha or minimize exposure, financial technologists invest heavily in having the most up-to-date insights on the state of the market and where it is going. Event-driven and streaming architectures enable complex processing on market events as they happen, making them a natural fit for financial market applications.

Driving Agility and Scalability through Smart Data

Last year presented business and organizational challenges not seen in a century, and, troublingly, those challenges distributed pains and gains unequally across industry segments. While brick-and-mortar retail was crushed a year ago by mandated store closures, digital commerce retailers realized ten years of digital sales penetration in only three months.

Managing Python dependencies for Spark workloads in Cloudera Data Engineering

Apache Spark is now widely used in many enterprises for building high-performance ETL and machine learning pipelines. For users already familiar with Python, PySpark provides a Python API for Apache Spark. When working with PySpark, users often rely on existing and/or custom Python packages to extend and complement Apache Spark’s functionality. Apache Spark provides several options to manage these dependencies.

The New Releases of Apache NiFi in Public Cloud and Private Cloud

Cloudera has recently shipped several releases around Apache NiFi! We just released Cloudera Flow Management (CFM) 2.1.1, which provides Apache NiFi on top of Cloudera Data Platform (CDP) 7.1.6. This major release includes Apache NiFi 1.13.2 along with additional improvements, bug fixes, and components. Cloudera also released CDP 7.2.9 on all three major cloud platforms, bringing Flow Management on DataHub with Apache NiFi 1.13.2 and more.

Cable Companies Are Growing Up

Cable and satellite companies in the US have emerged from a decade of acquisitions, consolidation and shakeout and are beginning to assert themselves as full-service providers in the communications and media space. With Comcast just announcing its new suite of cellphone plans this month, and Charter, Altice and Dish ramping up their offerings, the Big Three in wireless – AT&T, Verizon and T-Mobile/Sprint – are looking over their shoulders.

Converting HBase ACLs to Ranger policies

CDP uses Apache Ranger for data security management. To use Ranger for centralized security administration, HBase ACLs need to be migrated to Ranger policies. This can be done via the Ranger web UI, accessible from Cloudera Manager. But first, let’s take a quick look at the HBase method for access control.

HDFS Data Encryption at Rest on Cloudera Data Platform

Encryption of data at rest is a highly desirable, and sometimes mandatory, requirement for data platforms in a range of industry verticals, including healthcare, financial, and government organizations. The capability increases security and protects sensitive data from attacks, whether they originate inside or outside the platform.

Apache Ozone and Dense Data Nodes

Today’s enterprise data analytics teams are constantly looking to get the best out of their platforms. Storage plays one of the most important roles in a data platform strategy: it provides the basis on which all compute engines and applications are built. Businesses are also looking to move to a scale-out storage model that provides dense storage along with reliability, scalability, and performance.

Drinking our own champagne - Cloudera upgrades to CDP Private Cloud

Like most of our customers, Cloudera’s internal operations rely heavily on data. For more than a decade, Cloudera has built internal tools and data analysis primarily on a single production CDH cluster. This cluster runs workloads for every department – from real-time user interfaces for Support to providing recommendations in the Cloudera Data Platform (CDP) Upgrade Advisor to analyzing our business and closing our books.