Cost Conscious Data Warehousing with Cloudera Data Platform

Have you been burned by the unexpected costs of a cloud data warehouse? If so, you know about the failed economics of some cloud-native solutions on the market today. If not, consider the true costs of a cloud-native data warehouse before adopting one. Data warehouses have been broadly adopted to provide timely reports and valuable insights. However, traditional deployments are notoriously cumbersome and cost-prohibitive at large scale.

Extending Snowflake's External Functions with Serverless: Adding Driving Times from Mapbox to SQL

Data engineers love to use SQL to solve all kinds of data problems. For this and more, Snowflake is a perfect partner. Snowflake’s support for standard SQL and several SQL variations, combined with JavaScript stored procedures, has helped me solve complex data challenges. But sometimes you need custom code.

Kafka Is Not a Database

It's important to understand the uses and abuses of streaming infrastructure. Apache Kafka is a message broker that has rapidly grown in popularity in the last few years. Message brokers have been around for a long time; they're a type of datastore specialized for "buffering" messages between producer and consumer systems. Kafka has become popular because it's open-source and capable of scaling to very large numbers of messages.

CaliberMind Onboards Customer Data With Fivetran

With automated data integration, CaliberMind uncovers data insights for customers. As a Customer Data Platform (CDP), CaliberMind delivers data-driven insights to its customers. To do so, it must connect to its customers’ data sources, extract, process and transform the data, run it through specially designed analytic models, and, finally, present data back to the customer as insights. CaliberMind uses Fivetran to offload the task of ingesting data from its customers’ applications.

Outlier Detection: The Different Types of Outliers

Time series anomaly detection is a tool that detects unusual behavior, whether it's harmful or advantageous for the business. In either case, quick outlier detection and outlier analysis can enable you to adjust your course quickly, before you lose customers, revenue, or an opportunity. The first step is knowing what types of outliers you’re up against. Chief Data Scientist Ira Cohen, co-founder of Autonomous Business Monitoring platform Anodot, covers the three main categories of outliers and how you'll see them arise in a business context.

Solutions Analyst: The Career for Innovative All-Rounders

Every business wants to stay agile. They invest in analytics to learn about their customers and their internal state, and they use these insights to make bold and innovative decisions. But then they run into a common problem: how to put those decisions into action. This is where a solutions analyst comes in. These multi-talented creative thinkers will look at the current state of play and identify the smartest path forward.

Federated Learning, Machine Learning, Decentralized Data

Two years ago we wrote a research report about Federated Learning. We’re pleased to make the report available to everyone, for free. You can read it online here: Federated Learning. Federated Learning is a paradigm in which machine learning models are trained on decentralized data. Instead of being collected on a single server or data lake, the data remains in place (on smartphones, industrial sensing equipment, and other edge devices), and models are trained on-device.
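To make the idea concrete, here is a minimal sketch of one common approach, federated averaging, written in plain Python. It is not taken from the report: the one-parameter model (y ≈ w * x), the toy device datasets, and the names `local_update` and `federated_average` are all assumptions chosen for illustration. Each device trains on its own data, and only the resulting weight, never the raw data, is sent back to be averaged.

```python
from typing import List, Tuple

# Hypothetical type: the (x, y) pairs held privately on one device.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for y ≈ w * x using only this device's data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w: float, devices: List[Dataset]) -> float:
    """One round: every device trains locally; the server averages the
    returned weights, weighting each device by its number of examples."""
    total = sum(len(d) for d in devices)
    updates = [(local_update(w, d), len(d)) for d in devices]
    return sum(w_i * n for w_i, n in updates) / total


# Toy data (an assumption): three devices, each holding samples of y = 2x.
devices: List[Dataset] = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]

w = 0.0  # shared model weight, initialised by the server
for _ in range(20):
    w = federated_average(w, devices)
```

After the rounds complete, `w` should sit close to 2, the slope shared by all three toy datasets. Real deployments add concerns this sketch ignores, such as devices dropping out, uneven data distributions, and protecting the transmitted updates.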

How Cloudera Supports Government Data Encryption Standards

As part of our ongoing commitment to supporting government regulations and standards in our enterprise solutions, including data protection, Cloudera recently introduced a version of our Cloudera Data Platform, Private Cloud Base product (7.1.5 release) that can be configured to use FIPS-compliant cryptography.