
Keboola

Keboola is now officially Powered by Snowflake

Over the years, Keboola and Snowflake have each seen their share of successes and achievements. Now, we can proudly announce that Keboola has joined the Powered by Snowflake program. Founded around the same time, the two companies have been working hand in hand for years.

Why Doesn't the Modern Data Stack Result in a Modern Data Experience?

The data landscape is exploding with tools. As data professionals, we have specialized tools for everything at our fingertips: from niche databases (graph, geo, you name it) to tools for SQL-driven transformations (looking at you, dbt). Yet a lot of data work is still about provisioning, selecting, administering, and maintaining those tools, which is just a pain. As Pavel Dolezal, CEO and co-founder of Keboola, put it: the answer is in how the Modern Data Architecture is built.

6 Best Data Integration Tools of 2022

Data integration is the data engineering process of combining data from all the different sources in a company (CRM, SaaS apps like Salesforce, APIs, …) into a single unified view. The process includes data extraction, cleansing, ingestion, validation, modeling, and exposing ready-to-consume data sets to other users and applications for business intelligence or other data-driven activities.
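
As a minimal sketch of that "single unified view" step (hypothetical tables and column names, assuming pandas), the idea is to pull records from each source, apply basic cleansing, and merge everything on a shared key:

```python
import pandas as pd

# Hypothetical extracts from two sources; in practice these would come
# from a CRM export and a Salesforce API call.
crm = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "email": ["a@example.com", "b@example.com", "c@example.com"],
})
salesforce = pd.DataFrame({
    "customer_id": [1, 2],
    "lifetime_value": [1200.0, 340.5],
})

# Basic cleansing/validation: drop duplicates and rows missing the join key.
crm = crm.drop_duplicates("customer_id").dropna(subset=["customer_id"])

# The "single unified view": one table combining both sources.
unified = crm.merge(salesforce, on="customer_id", how="left")
print(unified)
```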

7 Best Data Pipeline Tools 2022

The data pipeline is at the heart of your company’s operations. It allows you to take control of your raw data and use it to generate revenue-driving insights. However, managing all the different data pipeline operations (data extraction, transformation, loading into databases, orchestration, monitoring, and more) can be a little daunting. Here, we present the 7 best data pipeline tools of 2022, with pros, cons, and who each is most suitable for. 1. Keboola 2. Stitch 3. Segment 4.

Introduction to Automated Data Analytics (With Examples)

Is repetitive and menial work keeping your data scientists, analysts, and engineers from delivering their best work? Consider automating your data analytics to free them from routine tasks so they can dedicate their time to more meaningful, creative work that requires human attention. In this blog, we walk through what automated data analytics is and how to put it into practice, with examples. Now let’s dive in.

A Guide to Principal Component Analysis (PCA) for Machine Learning

Principal Component Analysis (PCA) is one of the most commonly used unsupervised machine learning algorithms across a variety of applications: exploratory data analysis, dimensionality reduction, information compression, data de-noising, and plenty more. In this blog, we will go through it step by step. Before we delve into its inner workings, let’s first get a better understanding of what PCA does. Imagine we have a 2-dimensional dataset.
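
To make the 2-dimensional example concrete, here is a minimal PCA sketch on synthetic data (purely illustrative, assuming scikit-learn and NumPy): it projects correlated 2-D points onto their principal components and reports how much variance each component explains.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic 2-D dataset with two strongly correlated features.
rng = np.random.default_rng(42)
x = rng.normal(size=500)
data = np.column_stack([x, 2 * x + rng.normal(scale=0.5, size=500)])

# Fit PCA and project the data onto its principal components.
pca = PCA(n_components=2)
projected = pca.fit_transform(data)

# Most of the variance sits in the first component, which is why PCA is
# useful for dimensionality reduction and de-noising.
print(projected.shape, pca.explained_variance_ratio_)
```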

7 Best Change Data Capture (CDC) Tools of 2022

As your data volumes grow, your operations slow down. Data ingestion - extracting all the underlying datasets, transforming them, and loading them into a storage destination (such as a PostgreSQL or MySQL database) - becomes sluggish, impacting processes down the line and slowing your data analytics and time to insights. Change Data Capture (CDC) makes data available faster and more efficiently, without sacrificing data accuracy. In this blog, we review the 7 best change data capture tools of 2022.
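
The tools below typically capture changes from the database’s transaction log; as a rough, simplified illustration of the underlying idea (not how any particular tool works), here is a timestamp-based incremental extraction sketch in Python with hypothetical table and column names - it pulls only the rows changed since the last sync instead of re-reading everything:

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical source table; real CDC tools usually read the database's
# write-ahead log rather than querying a timestamp column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, updated_at TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 9.99, '2022-05-01T10:00:00')")
conn.execute("INSERT INTO orders VALUES (2, 25.00, '2022-06-01T12:30:00')")

# Watermark from the previous sync: only rows updated after it are ingested.
last_sync = "2022-05-15T00:00:00"
changed = conn.execute(
    "SELECT id, amount, updated_at FROM orders WHERE updated_at > ?",
    (last_sync,),
).fetchall()
print(changed)  # only order 2, the row changed since the last sync

# After loading the changed rows, advance the watermark for the next run.
last_sync = datetime.now(timezone.utc).isoformat()
```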

Keboola + ThoughtSpot = Automated insights in minutes

Keboola and ThoughtSpot have partnered up to offer click-and-launch insights machines. With the original integration, you could already cut the time to insight: Keboola helps you get clean data, and ThoughtSpot helps you turn it into insights. What’s new? The new solution builds out-of-the-box, ready-to-use data pipelines (Keboola Templates) and live self-serve analytics dashboards (ThoughtSpot SpotApps) from the ground up. You just need to click and launch your analytics use case.

Complete ETL Process Overview (design, challenges and automation)

The Extract, Transform, and Load process (ETL for short) is a set of procedures in the data pipeline. It collects raw data from its sources (extracts), cleans and aggregates the data (transforms), and saves it to a database or data warehouse (loads), where it is ready to be analyzed. A well-engineered ETL process provides real business value and benefits such as novel business insights: the entire ETL process brings structure to your company’s information.
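
Schematically (hypothetical file, column, and table names, assuming pandas), an ETL job often boils down to three small functions chained together:

```python
import sqlite3
import pandas as pd

def extract() -> pd.DataFrame:
    # Collect raw data from its source; here, a hypothetical CSV export.
    return pd.read_csv("raw_sales.csv")

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    # Clean and aggregate: drop incomplete rows, then sum revenue per day.
    clean = raw.dropna(subset=["order_date", "revenue"])
    return clean.groupby("order_date", as_index=False)["revenue"].sum()

def load(daily_revenue: pd.DataFrame) -> None:
    # Save to a database where the data is ready to be analyzed.
    with sqlite3.connect("warehouse.db") as conn:
        daily_revenue.to_sql("daily_revenue", conn, if_exists="replace", index=False)

if __name__ == "__main__":
    load(transform(extract()))
```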

Star Schema vs Snowflake Schema and the 7 Critical Differences

Star schemas and snowflake schemas are the two predominant types of data warehouse schemas. A data warehouse schema refers to the shape your data takes - how you structure your tables and their mutual relationships within a database or data warehouse. Since the primary purpose of a data warehouse (and other Online Analytical Processing (OLAP) databases) is to provide a centralized view of all the enterprise data for analytics, the schema you choose shapes how easily and efficiently that data can be queried.
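
As a tiny illustration of the difference (hypothetical tables, using pandas only for brevity): in a star schema each dimension is a single denormalized table joined directly to the fact table, while a snowflake schema normalizes dimensions into sub-tables, so the same query needs extra joins.

```python
import pandas as pd

# Fact table: one row per sale, with foreign keys to dimensions.
fact_sales = pd.DataFrame({
    "sale_id": [1, 2],
    "product_id": [10, 11],
    "amount": [9.99, 25.00],
})

# Star schema: the product dimension is one denormalized table.
dim_product_star = pd.DataFrame({
    "product_id": [10, 11],
    "product_name": ["Mug", "T-shirt"],
    "category_name": ["Kitchen", "Apparel"],
})

# Snowflake schema: the same dimension is normalized, so category details
# live in their own table referenced by a key.
dim_product_snowflake = pd.DataFrame({
    "product_id": [10, 11],
    "product_name": ["Mug", "T-shirt"],
    "category_id": [1, 2],
})
dim_category = pd.DataFrame({
    "category_id": [1, 2],
    "category_name": ["Kitchen", "Apparel"],
})

# A star-schema query needs one join; the snowflake version needs two.
star = fact_sales.merge(dim_product_star, on="product_id")
snowflake = fact_sales.merge(dim_product_snowflake, on="product_id").merge(
    dim_category, on="category_id"
)
print(star[["sale_id", "category_name"]])
print(snowflake[["sale_id", "category_name"]])
```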