Our reflections on the 2021 Gartner Magic Quadrant for Data Integration Tools

“The data integration tool market is seeing renewed momentum, driven by requirements for hybrid and multi-cloud data integration, augmented data management, and data fabric designs.” That is Gartner’s assessment in its latest Magic Quadrant for Data Integration Tools report, and it makes perfect sense: data is the lifeblood of an organization.

Optimizing Cloudera Data Engineering Autoscaling Performance

The shift to the cloud has been accelerating, and with it, a push to modernize the data pipelines that fuel key applications. That is why cloud-native solutions that take advantage of capabilities such as disaggregated storage and compute, elasticity, and containerization are more important than ever. At Cloudera, we introduced Cloudera Data Engineering (CDE) as part of our Enterprise Data Cloud product, Cloudera Data Platform (CDP), to meet these challenges.

Migrating Data Pipelines from Enterprise Schedulers to Airflow

At Airflow Summit 2021, Unravel’s co-founder and CTO, Shivnath Babu, and Senior Software Engineer Hari Nyer delivered a talk titled “Lessons Learned while Migrating Data Pipelines from Enterprise Schedulers to Airflow.” This story, along with the slides and videos included in it, comes from the presentation.

Automated Competition Scraping with Apify and Keboola

Whether you saw or missed our webinar, we thought it would be useful to provide a step-by-step guide on how to set up quick competition monitoring (or any other web scraping and data processing automation) with Apify and Keboola. Thank you, Apify and Revolt.bi, for the collaboration! So what can you do with automated competition data processing? In this article, we’ll use the example of daily monitoring of Amazon’s best-sellers list.

Cost of ELK

Do you know how much your ELK stack costs? Managing and analyzing your data is a critical part of your business. However, the true cost of an ELK stack can be hard to calculate, and the truth is you may be spending a lot more than you think. Elasticsearch wasn't designed to work efficiently at the scale required by today's data volumes, especially given the growth of log data. As your data grows, your ELK stack becomes more expensive to scale and maintain, leaving you with the headache and the tab. Well, ChaosSearch has the answer.

How to load Salesforce data into BigQuery using a code-free approach powered by Cloud Data Fusion

Organizations are increasingly investing in modern cloud warehouses and data lake solutions to augment analytics environments and improve business decisions. The business value of such repositories increases as customer relationship data is loaded and additional insights are generated.

BigQuery Admin reference guide: Recap

Over the past few weeks, we have been publishing videos and blogs that walk through the fundamentals of architecting and administering your BigQuery data warehouse. Throughout this series, we have focused on teaching foundational concepts and applying best practices observed directly from customers. Below, you can find links to each week’s content. Query Processing: Ever wonder what happens when you click “run” on a new BigQuery query?

How to Operationalize your Data Warehouse with Reverse ETL

Organizations are losing out on data-driven decision-making opportunities when data stays in the data warehouse. While business intelligence solutions can surface insights from these data sets, those insights often reach team members too late to be used for daily business operations. Reverse ETL empowers organizations to increase the value of their data warehouses by operationalizing the data they hold. Learn how this can transform the way companies use data and insights.