Systems | Development | Analytics | API | Testing

Latest Posts

Cloudera Data Warehouse Demonstrates Best-in-Class Cloud-Native Price-Performance

Cloud data warehouses allow users to run analytic workloads with greater agility, better isolation and scale, and lower administrative overhead than ever before. With the ability to quickly provision on-demand and the lower fixed and administrative costs, the costs of operating a cloud data warehouse are driven mostly by the price-performance of the specific data warehouse platform.

Brick and Mortar Stores are Now Built Brick by Brick with Digital Insights

In my last three blogs (Get to Know Your Retail Customer: Accelerating Customer Insight and Relevance; Improving your Customer-Centric Merchandising with Location-based in-Store Merchandising; and Maximizing Supply Chain Agility through the “Last Mile” Commitment) I painted a picture that showed an ever-changing landscape in retail, considering that consumers are more in control than ever, mobile (at least somewhat digitally mobile considering the pandemic) and socially connected.

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 2: Querying/ Loading Data

In this installment, we’ll discuss how to do Get/Scan Operations and utilize PySpark SQL. Afterward, we’ll talk about Bulk Operations and then some troubleshooting errors you may come across while trying this yourself. Read the first blog here. Get/Scan Operations In this example, let’s load the table ‘tblEmployee’ that we made in the “Put Operations” in Part 1. I used the same exact catalog in order to load the table. Executing table.show() will give you:

Apache NiFi - the data movement enabler in a hybrid cloud environment

Cloudera provides its customers with a set of consistent solutions running on-premises and in the cloud to ensure customers are successful in their data journey for all of their use cases, regardless of where they are deployed. Cloudera DataFlow provides Apache NiFi in both the Cloudera Data Platform Private Cloud Base (on-premises) and Public Cloud (AWS, Azure, and Google Cloud) products in this hybrid cloud strategy.

Enabling Self-Service Business Insights with Cloudera Data Warehouse

Requests to Central IT for data warehousing services can take weeks or months to deliver. Central IT teams at large organizations face a proliferation of IT projects arising from the complexities of markets and from the needs of internal lines of business (LoBs). At the same time, Central IT must juggle cost and risk.

Top 5 Questions about Apache NiFi

Over the last few weeks, I delivered four live NiFi demo sessions, showing how to use NiFi connectors and processors to connect to various systems, with 1000 attendees in different geographic regions. I want to thank you all for joining and attending these events! Interactive demo sessions and live Q&A are what we all need these days when working remotely from home is now a norm. If you have not seen my live demo session, you can catch up by watching it here.

Building a Machine Learning Application With Cloudera Data Science Workbench And Operational Database, Part 1: The Set-Up & Basics

Introduction Python is used extensively among Data Engineers and Data Scientists to solve all sorts of problems from ETL/ELT pipelines to building machine learning models. Apache HBase is an effective data storage system for many workflows but accessing this data specifically through Python can be a struggle. For data professionals that want to make use of data stored in HBase the recent upstream project “hbase-connectors” can be used with PySpark for basic operations.

Maximizing Supply Chain Agility through the "Last Mile" Commitment

In my last two blogs (Get to Know Your Retail Customer: Accelerating Customer Insight and Relevance, and Improving your Customer-Centric Merchandising with Location-based in-Store Merchandising) we looked at the benefits to retail in building personalized interactions by accessing both structured and unstructured data from website clicks, email and SMS opens, in-store point sale systems and past purchased behaviors.

An A-Z Data Adventure on Cloudera's Data Platform

In this blog we will take you through a persona-based data adventure, with short demos attached, to show you the A-Z data worker workflow expedited and made easier through self-service, seamless integration, and cloud-native technologies. You will learn all the parts of Cloudera’s Data Platform that together will accelerate your everyday Data Worker tasks.

How ASEAN Retailers Can Become insight driven with a Hybrid Cloud data strategy

There has been an e-commerce explosion this year as consumers seek safety and convenience from the comfort of their own homes using digital tools to purchase everything from durable hard goods to fashion accessories to daily living consumables like food perishables, cleaning products and even school supplies.