The data lakehouse is a promising new technology that combines aspects of data warehouses and data lakes.
Choosing a cloud data warehouse is a big decision. Not only are there a number of choices, each with very different strengths, but the parameters for your decision have also changed. Modernizing your data warehousing experience with the cloud means moving from dedicated, on-premises hardware focused on traditional relational analytics on structured data to a modern platform.
Enterprise data warehouse platform owners face a number of common challenges. In this article, we look at seven challenges, explore their impact on platform and business owners, and highlight how a modern data warehouse can address them.
Reverse ETL is an emerging piece of the modern data stack that enables you to productionize your analytics.
We are thrilled to announce that Google has been named a Leader in The Forrester Wave™: Cloud Data Warehouse, Q1 2021 report. For more than a decade, BigQuery, our petabyte-scale cloud data warehouse, has been in a class of its own. We're excited to share this recognition and we want to thank our strong community of customers and partners for voicing their opinion. We believe this report validates the alignment of our strategy with our customers’ analytics needs.
One of the most effective ways to improve performance and minimize cost in database systems today is to avoid unnecessary work, such as data reads from the storage layer (e.g., disks, remote storage), transfers over the network, or even data materialization during query execution. Since its early days, Apache Hive has improved distributed query execution by pushing down column filter predicates to storage handlers like HBase or columnar data format readers such as Apache ORC.
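To make the idea of predicate pushdown concrete, here is a minimal sketch in Python. It is a toy model, not Hive's or ORC's actual API: the `RowGroup` class and both scan functions are hypothetical names invented for illustration. It assumes a columnar reader that keeps min/max statistics per block of rows, so a filter such as `value > threshold` can rule out whole blocks before any of their data is read.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Hypothetical block of column values with min/max statistics."""
    values: List[int]

    @property
    def min_value(self) -> int:
        return min(self.values)

    @property
    def max_value(self) -> int:
        return max(self.values)


def scan_without_pushdown(groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Read every row group, then filter in the query engine."""
    groups_read = 0
    result: List[int] = []
    for group in groups:
        groups_read += 1  # every group is fetched from storage
        result.extend(v for v in group.values if v > threshold)
    return result, groups_read


def scan_with_pushdown(groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Let the reader skip groups whose statistics rule out any match."""
    groups_read = 0
    result: List[int] = []
    for group in groups:
        if group.max_value <= threshold:
            continue  # no value can satisfy `v > threshold`; skip the read
        groups_read += 1
        result.extend(v for v in group.values if v > threshold)
    return result, groups_read


# Illustrative data only.
groups = [RowGroup([1, 2, 3]), RowGroup([10, 20, 30]), RowGroup([4, 5, 6]), RowGroup([25, 40])]
rows_plain, reads_plain = scan_without_pushdown(groups, threshold=15)
rows_pushed, reads_pushed = scan_with_pushdown(groups, threshold=15)
print(rows_plain == rows_pushed, reads_plain, reads_pushed)
```

Both scans return the same rows, but the pushdown version touches only the row groups whose statistics allow a match. In a real system the skipped groups would translate into avoided disk or network reads.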
Amazon Redshift is great for real-time querying, but it's not so great for handling your ETL pipeline. Fortunately, Xplenty has a highly workable solution. Xplenty can be used to offload ETL from Redshift, saving resources and allowing each platform to do what it does best: Xplenty for batch processing and Redshift for real-time querying. Redshift is Amazon’s data warehouse-as-a-service, a scalable columnar DB based on PostgreSQL.