Systems | Development | Analytics | API | Testing

Latest Posts

NVIDIA RAPIDS in Cloudera Machine Learning

In the previous blog post in this series, we walked through the steps for leveraging Deep Learning in your Cloudera Machine Learning (CML) projects. This year, we expanded our partnership with NVIDIA, enabling your data teams to dramatically speed up compute processes for data engineering and data science workloads with no code changes using RAPIDS AI.

Streaming Market Data with Flink SQL Part II: Intraday Value-at-Risk

This article is the second in a multipart series to showcase the power and expressibility of FlinkSQL applied to market data. In case you missed it, part I starts with a simple case of calculating streaming VWAP. Code and data for this series are available on github. Speed matters in financial markets. Whether the goal is to maximize alpha or minimize exposure, financial technologists invest heavily in having the most up-to-date insights on the state of the market and where it is going.

The value of CDP Public Cloud over legacy Hadoop-on-IaaS implementations

Prior the introduction of CDP Public Cloud, many organizations that wanted to leverage CDH, HDP or any other on-prem Hadoop runtime in the public cloud had to deploy the platform in a lift-and-shift fashion, commonly known as “Hadoop-on-IaaS” or simply the IaaS model.

Key considerations when making a decision on a Cloud Data Warehouse

Making a decision on a cloud data warehouse is a big deal. Beyond there being a number of choices each with very different strengths, the parameters for your decision have also changed. Modernizing your data warehousing experience with the cloud means moving from dedicated, on-premises hardware focused on traditional relational analytics on structured data to a modern platform.

Accelerate Moving to CDP with Workload Manager

Since my last blog, What you need to know to begin your journey to CDP, we received many requests for a tool from Cloudera to analyze the workloads and help upgrade or migrate to Cloudera Data Platform (CDP). The good news is Cloudera has a tried and tested tool, Workload Manager (WM) that meets your needs. WM saves time and reduces risks during upgrades or migrations.

5 Factors to Consider When Choosing a Stream Processing Engine

Are you using the right stream processing engine for the job at hand? You might think you are—and you very well might be!—but have you really examined the stream processing engines out there in a side-by-side comparison to make sure? Our Choose the Right Stream Processing Engine for Your Data Needs whitepaper makes those comparisons for you, so you can quickly and confidently determine which engine best meets your key business requirements.

Automating CDP Private Cloud Installations with Ansible

The introduction of CDP Public Cloud has dramatically reduced the time in which you can be up and running with Cloudera’s latest technologies, be it with containerised Data Warehouse, Machine Learning, Operational Database or Data Engineering experiences or the multi-purpose VM-based Data Hub style of deployment.

cdpcurl: Low-Level CDP API Access

Cloudera Data Platform (CDP) provides an API that enables you to access CDP functionality from a script, or to integrate CDP features with an application. In practice you can use the CDP API to script repetitive tasks, manage CDP resources, or even create custom applications. You can learn more about the API in its official documentation. There are multiple ways to access the API, including through a dedicated CLI, through a Java SDK, and through a low-level tool called cdpcurl.

Quantifying the value of multi-cloud deployment strategies with CDP Public Cloud

In this article, I will be focusing on the contribution that a multi-cloud strategy has towards these value drivers, and address a question that I regularly get from clients: Is there a quantifiable benefit to a multi-cloud deployment? That question is typically being asked when I explain the ability to leverage container technology that offers a consistent deployment environment across multiple clouds and form factors (public, private, or hybrid cloud).