Analytics

What are ETL tools?

Thinking of building out an ETL process or refining your current one? Read on to learn how ETL tools free up time to focus on building data models. ETL stands for extract-transform-load and commonly refers to the process of data integration. Extract pulls data from a particular data source. Transform converts that data into a processable format. Load, the final step, writes the data into the designated target.
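The three stages described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not how any particular ETL tool works: the inline CSV sample, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from an API, file, or database.
RAW_CSV = """id,name,amount
1, alice ,10.50
2,BOB,3
3,carol,
"""


def extract(text):
    """Extract: pull raw rows out of the source (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, cast types, and drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append((int(row["id"]), row["name"].strip().title(), float(amount)))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the target (here, a SQLite table)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order."""
    load(transform(extract(text)), conn)


if __name__ == "__main__":
    target = sqlite3.connect(":memory:")
    run_etl(RAW_CSV, target)
    print(target.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall())
```

Keeping each stage as its own function means a source or target can be swapped without touching the transformation logic.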

Achieve Pin-Point Historical Analysis of Your Salesforce Data

Want to look at how data has changed over time? Simply enable history mode, a Fivetran feature that data analysts can turn on for specific tables to analyze historical data. The feature achieves Type 2 Slowly Changing Dimensions (Type 2 SCD), meaning a new timestamped row is added for every change made to a column. We launched history mode for Salesforce in May and have been delighted with the response.
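The Type 2 SCD idea mentioned above, where every change adds a new timestamped row instead of overwriting the old one, can be sketched as follows. This is an illustration of the general technique only, not Fivetran's implementation: the `VersionedRow` class, the `valid_from` and `valid_to` field names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class VersionedRow:
    record_id: str
    values: dict
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None marks the current version


def apply_change(history, record_id, new_values, changed_at):
    """Close the current version of a record and append a new timestamped row."""
    current = next(
        (r for r in history if r.record_id == record_id and r.valid_to is None),
        None,
    )
    if current is not None:
        if current.values == new_values:
            return history  # nothing changed, so no new row
        current.valid_to = changed_at  # end-date the old version
    history.append(VersionedRow(record_id, dict(new_values), changed_at))
    return history


def as_of(history, record_id, when):
    """Return the values a record had at a given point in time, or None."""
    for row in history:
        if (
            row.record_id == record_id
            and row.valid_from <= when
            and (row.valid_to is None or when < row.valid_to)
        ):
            return row.values
    return None


if __name__ == "__main__":
    history = []
    apply_change(history, "001", {"stage": "Prospecting"}, datetime(2020, 1, 1))
    apply_change(history, "001", {"stage": "Closed Won"}, datetime(2020, 3, 1))
    print(as_of(history, "001", datetime(2020, 2, 1)))
```

Because old versions are end-dated rather than deleted, a point-in-time question such as "what stage was this opportunity in on February 1?" becomes a simple range lookup.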

Moving Big Data and Streaming Data Workloads to AWS

Cloud migration may be the biggest challenge, and the biggest opportunity, facing IT departments today, especially if you use big data and streaming data technologies such as Cloudera, Hadoop, Spark, and Kafka. In this 55-minute webinar, Unravel Data product marketer Floyd Smith and Solutions Engineering Director Chris Santiago describe how to move workloads to AWS EMR, Databricks, and other destinations on AWS, fast and at the lowest possible cost.

Fivetran vs. MuleSoft vs. Xplenty: An ETL Comparison

The key differences between Fivetran, MuleSoft, and Xplenty: Hiring a data scientist or engineer can cost up to $140,000 per year, something many businesses can't afford. Still, organizations need to pull data from different locations into a data lake or warehouse for business insights. An Extract, Transform, and Load (ETL) platform makes this process easier, but few organizations have the technical or coding know-how to make it happen.

How leading organizations govern their data to find success

With the increased focus on delivering value to customers, it is imperative to build a next-generation customer hub that delivers high-quality, governed data. In this video, we will share best practices for implementing a comprehensive data governance approach. Learn how to leverage the capabilities of the Talend Data Fabric to deploy a forward-looking data management architecture that detects and retrieves metadata from across databases and applications, builds data lineage, and adds traceability.

How to configure clients to connect to Apache Kafka Clusters securely - Part 1: Kerberos

This is the first installment in a short series of blog posts about security in Apache Kafka. In this article we will explain how to configure clients to authenticate with clusters using different authentication mechanisms.

Hive vs. SQL: Which One Performs Data Analysis Better?

Key differences between Hive and SQL: Big data requires powerful tools. Successful organizations query, manage, and analyze thousands of data sets from hundreds of data sources. This is where tools like Hive and SQL come in. Although very different, both are used to query and work with big data. But which tool is right for your organization? In this review, we compare Hive vs. SQL on features, prices, support, user scores, and more.