
Latest News

Building an automated data pipeline from BigQuery to Earth Engine with Cloud Functions

Over the years, vast amounts of satellite data have been collected, and ever more granular data are being collected every day. Until recently, those data have been an untapped asset in the commercial space. This is largely because neither the tools required for large-scale analysis of this type of data nor the satellite imagery itself was readily available. Thanks to Earth Engine, a planetary-scale platform for Earth science data & analysis, that is no longer the case.

Securing Your APIs With the Right API Management Solution

In today’s digital business landscape, APIs have become the most important digital asset. One study found that 83% of all internet traffic belongs to API-based services. APIs act as reusable building blocks that employees, partners, and customers can easily access from anywhere. This has also made APIs the top security challenge for most companies. In a recent survey, about 94% of respondents noted that they had faced an API security issue over the past year.

Figure 1: API security overview

Talend's contributions to Apache Beam

Apache Beam is an open-source, unified programming model for batch and streaming data processing pipelines that simplifies large-scale data processing. The Apache Beam model offers powerful abstractions that insulate you from the low-level details of distributed data processing, such as coordinating individual workers, reading from sources, and writing to sinks.

Faster XML Parsing with Elixir

The XML data format has been around since 1996. It was first envisioned as a lingua franca (bridging language) for data to be serialized and read into completely disparate systems (with different programming languages, operating systems, and even hardware). It has been wildly successful in that goal. In software, though, 26 years is like a lifetime, and in hardware, it's an eternity.

React Testing: Best Frameworks, Libraries and Tools

When using TDD with React, the most important step is picking the right testing toolset and framework, whether you're doing unit, integration, or end-to-end testing. That choice matters at every level, from testing individual components to testing the whole application, and it spans a wide range of JavaScript testing frameworks and handy assertion libraries.

Analyzing satellite images in Google Earth Engine with BigQuery SQL

Google Earth Engine (GEE) is a groundbreaking product that has been available for research and government use for more than a decade. Google Cloud recently launched GEE to General Availability for commercial use. This blog post describes a method for using GEE from within BigQuery’s SQL, giving SQL speakers access to, and value from, the vast troves of data available within Earth Engine.

Scaling Kafka Brokers in Cloudera Data Hub

This blog post provides guidance to administrators who currently use, or are interested in using, Kafka nodes and need to manage cluster changes as they scale up or down to balance performance and cloud costs in production deployments. Kafka brokers contained within host groups enable administrators to add and remove nodes more easily. This creates the flexibility to handle real-time data feed volumes as they fluctuate.

7 Best Data Pipeline Tools 2022

The data pipeline is at the heart of your company’s operations. It allows you to take control of your raw data and use it to generate revenue-driving insights. However, managing all the different types of data pipeline operations (data extractions, transformations, loading into databases, orchestration, monitoring, and more) can be a little daunting. Here, we present the 7 best data pipeline tools of 2022, with pros, cons, and who they are most suitable for. 1. Keboola 2. Stitch 3. Segment 4.