
7 Tips to Improve ETL Performance

Consider for a moment, if you will, plastic patio furniture. Plastic Fantastic is a global manufacturer with several factories, warehouses, and plenty of stores. One can only imagine the sheer amount of data resulting from sales, production, suppliers, and finances. Everything that happens to these chairs, tables, and cupboards, from purchase onward and in all corners of the world, is measured.

How to Turn on Change Data Capture (CDC)

Every day, 2.5 quintillion bytes of data are produced, and that number keeps growing. With such astronomical volumes of data, businesses have to understand and interpret data faster than ever before. However, businesses with millions of data entry points must first transfer that data before they can properly store and interpret it.

Future of Data Meetup: CDP on Azure - Industrial Strength Data Engineering

Data engineering is undergoing a huge evolution that requires faster and more reliable data pipelines. Apache Spark and Python are foundational components of this new architecture, enabling data engineers to develop these pipelines quickly. They also introduce challenges when moving to production. Come join us to ask questions and learn. We will also have a raffle of Cloudera swag.

React and Respond in the Business Moment With Qlik Application Automation

Unless you’ve hidden under a rock for the past decade, you can’t have failed to notice that data in today’s enterprise is very much alive. It’s always moving, constantly changing, and we’re continually using it to create new business value. However, while data fluidity and visibility have blossomed, the opportunity to use that data to drive business actions seems to have withered in comparison.

Group vs Fine-Grained Access Control in Cloudera Data Platform Public Cloud

Cloudera Data Platform (CDP) provides a Shared Data Experience (SDX) for centralized data access control and audit in the Enterprise Data Cloud. The Ranger Authorization Service (RAZ) is a new service added to help provide fine-grained access control (FGAC) for cloud storage. We covered the value this new capability provides in a previous blog.

Understanding Microsoft ETL with Azure Data Factory

Migrating analytics workloads to the public cloud has been one of the most significant big data trends in recent years, and it shows no sign of slowing down any time soon. According to a study by IT research company Forrester: [...] Within three years, however, Forrester predicts that the fates will have reversed: [...] Of course, before data can be processed in the public cloud, it has to get there in the first place via data migration.

Customer Data Platform (CDP) vs. Reverse ETL

Reverse ETL and customer data platforms (CDPs) are two big data trends that have been receiving a great deal of attention. While both CDPs and reverse ETL can help you make smarter data-driven decisions, there are also several crucial points of distinction. In this article, we’ll answer the question: what’s the difference between reverse ETL and a customer data platform?

Why You Need a Feature Store

Feature stores arrived in 2021 as an essential piece of technology for operationalizing AI. Despite the enthusiasm for feature stores in high-tech companies, they are still absent from most legacy ML platforms and remain relatively unknown in many enterprise companies. We discussed how feature stores are critical to the data-first approach of next-gen ML platforms in our previous blog, but they are important enough to get their own treatment in a full article.