Latest Blogs

Make Your AWS Data Lake Deliver with ChaosSearch (Webinar Highlights)

When then-Pentaho CTO James Dixon coined the term “data lake” in 2011, he imagined a single storage repository where organizations could store both structured and unstructured data in their raw format until it was needed for analytics. But without the right storage technology, data governance, or analytical tools, the first data lakes quickly became “data swamps”: morasses of data with no organizational structure and no efficient way to access the data or extract meaningful insights.

Loading Data to Redshift: Five Options and One Solution

Around 95 percent of organizations say their inability to manage and comprehend data holds them back. It's no wonder, then, that so many of these companies are loading their data into a single location like Amazon Redshift. Redshift uses SQL to analyze data sets so users can solve organizational problems and make more profitable business decisions.

7 HCM Integration Best Practices

Human capital management (HCM) is a set of practices used by the human resources department for onboarding new talent. These practices focus on the needs of the organization to fulfill specific vacancies and are implemented across three main categories: hiring, managing and optimizing the workforce. HR departments rely heavily on their apps.

Upgrade Hortonworks Data Platform (HDP) to Cloudera Data Platform (CDP) Private Cloud Base

CDP Private Cloud Base is an on-premises version of Cloudera Data Platform (CDP). This new product combines the best of Cloudera Enterprise Data Hub and Hortonworks Data Platform Enterprise along with new features and enhancements across the stack. This unified distribution is a scalable and customizable platform where you can securely run many types of workloads. CDP is an easy, fast, and secure enterprise analytics and management platform.

Connected Apps or Managed Apps: Which Model to Implement?

We recently wrote about the interest we’re seeing in connected applications that are built on Snowflake. Connected applications separate code and data such that the app provider creates and maintains the application code, while their customers manage their own data and provide their data platform for processing the application’s data. Some of our partners choose the connected application model because it has benefits for both customers and application providers.

Operationalizing Data Pipelines With Snowpark Stored Procedures, Now in Preview

Following the recent GA of Snowpark for our customers on AWS, we’re happy to announce that Snowpark Scala stored procedures are now available in preview to all customers on all clouds. Snowpark provides a language-integrated way to build and run data pipelines using powerful abstractions like DataFrames. With Snowpark, you write a client-side program to describe the pipeline you want to run, and all of the heavy lifting is pushed right into Snowflake’s elastic compute engine.

How to get data from Keboola to Google Data Studio?

Google Data Studio is a beautiful visualization tool that turns your data into compelling storytelling reports. But before you can visualize your data, you have to collect it, clean it, and validate it. This is where Keboola comes in. Keboola is the Data Stack as a Service (DaaS) platform that helps you with all your data operations, from building and automating ETL pipelines to data governance.

What Is Gherkin & How Do You Write Gherkin Tests?

When it comes to writing and testing software, teams have a lot of options. How do you know what syntax to use and which testing solution is best for you? In this post, we'll look at how to use Gherkin to write tests. We'll go through the syntax, how to construct a test, and the benefits and drawbacks.