Latest Posts

Amazon RDS: The Best Relational Database Service?

Companies these days are handling more data than ever: an average of 163 terabytes (163,000 gigabytes), according to a survey by IDG. Storing, processing and analyzing this data efficiently is essential to gleaning valuable insights and making informed business decisions. Yet the question remains: What is the best way to store enterprise data? For many use cases, the most appealing choice is a relational database.

Multi-Cloud Data Analytics: What, Why, and How

What is multi-cloud data analytics and why are so many companies getting on board? Cloud computing itself is now a well-established best practice, but a multi-cloud strategy is nearly as common these days. While 94 percent of organizations are now using cloud computing, 84 percent are using a multi-cloud data strategy. Multi-cloud is an especially fruitful data strategy for companies pursuing data analytics.

How to Offload ETL from Redshift to Xplenty

Amazon Redshift is great for real-time querying, but it's not so great for handling your ETL pipeline. Fortunately, Xplenty has a highly workable solution. Xplenty can be used to offload ETL from Redshift, saving resources and allowing each platform to do what it does best: Xplenty for batch processing and Redshift for real-time querying. Redshift is Amazon’s data warehouse-as-a-service, a scalable columnar DB based on PostgreSQL.

5 Customer Data Integration Best Practices

For the last few years, you have heard the terms "data integration" and "data management" dozens of times. Your business may already invest in these practices, but are you benefitting from this data gathering? Too often, companies hire specialists, collect data from many sources and analyze it for no clear purpose. And without a clear purpose, all your efforts are in vain. You can take in more customer information than all your competitors and still fail to make practical use of it.

Protecting Personal Data: GDPR, CCPA, and the Role of ETL

The growth of data has been exponential. By 2023, it's anticipated that approximately 463 exabytes (EB) will be created every day. To put this into perspective, one exabyte equals 1 billion gigabytes. By 2021, 320 billion emails will be sent daily, many of which contain personal information. Data collected around the globe contains the type of information that businesses leverage to make more informed decisions.

Using Xplenty with Parquet for Superior Data Lake Performance

Building a data lake in Amazon S3 and using Amazon Redshift Spectrum to query the data from a Redshift cluster is a common practice. However, when it comes to boosting performance, there are some tricks that are worth learning. One of those is storing data in Parquet format, which is considered a best practice for Redshift. Here's how to use Parquet format with Xplenty for the best data lake performance.

What Is a Data Stack?

These days, there are two kinds of businesses: data-driven organizations and companies that are about to go bust. And often, the only difference is the data stack. Data quality is an existential issue: to survive, you need a fast, reliable flow of information. The data stack is the entire collection of technologies that make this possible. Let's take a look at how any company can assemble a data stack that's ready for the future.

Introducing Component Previewer

The component previewer is a feature that allows you to preview your data at each component step without having to validate packages and run full-scale production jobs. It gives you the ability to extract, transform and preview your data on any transformation component, so you can debug your pipeline or confirm and validate your data flow logic. Component previews are similar to the data previews available on source components, which you might already be familiar with.

Scheduling With Cron Expressions in Xplenty

One of the most requested features in a data integration tool is greater flexibility around the scheduling of packages and workflows. With Xplenty, this can be achieved through the use of our Cron Expression scheduling feature. Cron is a job-scheduling utility on Unix-like operating systems, such as Linux. You can create cron jobs, which execute a script or command at a time of your choosing. Cron has broad applications for tasks that need time-based automation.
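
To make the idea of a cron expression concrete, here is a minimal sketch that splits an expression into its named fields. It assumes the classic five-field Unix crontab layout (minute, hour, day of month, month, day of week); the exact syntax Xplenty accepts may differ, and `describe_cron` is a hypothetical helper written for this illustration, not part of any Xplenty API.

```python
# Field names for a classic five-field crontab expression.
# Assumption: standard Unix layout; a given scheduler may use a variant.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Split a five-field cron expression into a dict of named fields.

    Hypothetical helper for illustration only; it labels the fields
    and does not validate their values.
    """
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}"
        )
    return dict(zip(CRON_FIELDS, parts))


if __name__ == "__main__":
    # "30 2 * * 1": minute 30, hour 2, any day of month, any month, Monday.
    for name, value in describe_cron("30 2 * * 1").items():
        print(f"{name}: {value}")
```

In standard cron, `30 2 * * 1` schedules a job for 02:30 every Monday; an asterisk means "every value" for that field.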