Analytics

3x better performance with CDP Data Warehouse compared to EMR in TPC-DS benchmark

In a previous blog post on CDW performance, we compared Azure HDInsight to CDW. In this blog post, we compare Cloudera Data Warehouse (CDW) on Cloudera Data Platform (CDP) using Apache Hive-LLAP to EMR 6.0 (also powered by Apache Hive-LLAP) on Amazon using the TPC-DS 2.9 benchmark. Amazon recently announced their latest EMR version 6.1.0 with support for ACID transactions. This benchmark is run on EMR version 6.0 as we couldn’t get queries to run successfully on version 6.1.0.

Data Lake Export Public Preview Is Now Available on Snowflake

Public preview of the data lake export feature is now available. Snowflake announced a private preview of data lake export at the Snowflake virtual summit in June 2020. Data lake export is one of the key features of the data lake workload in the Snowflake Data Cloud. The feature makes Snowflake data accessible to the external data lake, and it enables customers to take advantage of Snowflake’s reliable and performant processing capabilities.

10X Engineering Leadership Series: 21 Playbooks to Lead in the Online Era

Managing online teams has become the new normal! In an online world, how do you give effective feedback, have a difficult conversation, increase team accountability, communicate to stakeholders effectively, and so on? At Unravel, we are a fast-growing AI startup with a globally distributed engineering team across the US, EMEA, and India. Even before the pandemic this year, the global nature of our team had prepared us to lead outcomes effectively across online engineering teams.

Structured vs Unstructured Data: A Short Guide

Data is the oil that fuels the growth of modern enterprises. But unless you have the tools to unlock the potential of data, you might be left stuck on the tracks as your competitors speed ahead. With the rise of Big Data, the nature of the data that we work with has changed drastically. Data scientists like to refer to the ‘3 Vs’ of Big Data: volume, velocity, and variety. The 3 Vs reshaped the data landscape as we knew it.

Modernizing Data in a Cloud-Enabled World | Part 2 | Snowflake Inc.

Deloitte's partnership with Snowflake showcases how Snowflake's new cloud platform modernizes data, helping companies migrate information and innovate within the cloud. Frank Farrall, AI Ecosystems & Snowflake Alliance Leader at Deloitte, details how his organization uses Snowflake's cloud technology and artificial intelligence to solve problems and innovate quickly. Rise of the Data Cloud is brought to you by Snowflake.

How to configure clients to connect to Apache Kafka Clusters securely - Part 2: LDAP

In the previous post, we talked about Kerberos authentication and explained how to configure a Kafka client to authenticate using Kerberos credentials. In this post we will look into how to configure a Kafka client to authenticate using LDAP, instead of Kerberos. We will not cover the server-side configuration in this article but will add some references to it when required to make the examples clearer.
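
As a rough illustration of the client side, the sketch below renders the properties a Kafka client would typically need for SASL/PLAIN over TLS, which is the usual mechanism when the broker checks credentials against LDAP. It assumes the broker is already configured for LDAP validation; the helper name, user name, password, and truststore path are hypothetical placeholders, not values from the post.

```python
def ldap_client_properties(username, password, truststore_path):
    """Render Kafka client properties for SASL/PLAIN over TLS.

    Assumption: the broker validates PLAIN credentials against LDAP
    on the server side; the client only sends a username and password.
    """
    # JAAS entry for Kafka's built-in PLAIN login module.
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    props = {
        # TLS is required so the password is not sent in clear text.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas,
        # Truststore holding the CA that signed the broker certificate.
        "ssl.truststore.location": truststore_path,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Hypothetical values, for illustration only.
print(ldap_client_properties("alice", "example-password", "/path/to/truststore.jks"))
```

The rendered text can be saved as a client properties file and passed to the Kafka command-line tools or loaded by a client application. In practice the password should come from a secret store rather than being hard-coded.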

Cost Conscious Data Warehousing with Cloudera Data Platform

Have you been burned by the unexpected costs of a cloud data warehouse? If so, you know about the failed economics of some cloud-native solutions on the market today. If not, before adopting a cloud data warehouse, consider the true costs of a cloud-native data warehouse. Data warehouses have been broadly adopted to provide timely reports and valuable insights. However, traditional deployments are notoriously cumbersome and cost-prohibitive at large scales.

Extending Snowflake's External Functions with Serverless: Adding Driving Times from Mapbox to SQL

Data engineers love to use SQL to solve all kinds of data problems. For this and more, Snowflake is a perfect partner. Snowflake’s support for standard SQL and several SQL variations, combined with JavaScript stored procedures, has helped me solve complex data challenges. But sometimes you might have the need for custom code.