Latest News

3-Minute Recap: Unlocking the Value of Cloud Data and Analytics

DBTA recently hosted a roundtable webinar with four industry experts on “Unlocking the Value of Cloud Data and Analytics.” Moderated by Stephen Faig, Research Director, Unisphere Research and DBTA, the webinar featured presentations from Progress, Ahana, Reltio, and Unravel. You can see the full 1-hour webinar “Unlocking the Value of Cloud Data and Analytics” below. Here’s a quick recap of what each presentation covered.

Get Ready for the Next Generation of DataOps Observability

I was chatting with Sanjeev Mohan, Principal and Founder of SanjMo Consulting and former Research Vice President at Gartner, about how the emergence of DataOps is changing people’s idea of what “data observability” means. Not in any semantic sense or a definitional war of words, but in terms of what data teams need to stay on top of an increasingly complex modern data stack.

Yellowfin Named Embedded Business Intelligence Software Leader in G2 Fall Reports 2022

Yellowfin has again been recognized in the Leader quadrant of the 2022 G2 Fall Grid Reports for Embedded Business Intelligence (Enterprise and Small Business), marking the 13th consecutive quarter in which Yellowfin has been named a Leader in a G2 Grid Report. The Yellowfin team is grateful to our customers for the reviews of our embedded analytics capability and product suite they have provided on G2, a leading business software and services comparison source for trusted user ratings and peer-to-peer reviews.

Talend's contributions to Apache Beam

Apache Beam is an open-source, unified programming model for batch and streaming data processing pipelines that simplifies large-scale data processing. The Apache Beam model offers powerful abstractions that insulate you from the low-level details of distributed data processing, such as coordinating individual workers, reading from sources, and writing to sinks.
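
To make that abstraction concrete, here is a minimal word-count sketch using Beam's Python SDK. The file paths are placeholders, and the default DirectRunner stands in for a distributed runner such as Dataflow, Flink, or Spark; the pipeline code itself stays the same either way.

```python
# A minimal word-count pipeline: the same code runs on any Beam runner.
import apache_beam as beam

with beam.Pipeline() as pipeline:  # DirectRunner here; Dataflow/Flink/Spark in production
    (
        pipeline
        | "Read" >> beam.io.ReadFromText("input.txt")          # read from a source
        | "Split" >> beam.FlatMap(lambda line: line.split())   # element-wise transform
        | "Pair" >> beam.Map(lambda word: (word, 1))
        | "Count" >> beam.CombinePerKey(sum)                    # distributed aggregation
        | "Format" >> beam.MapTuple(lambda word, n: f"{word}\t{n}")
        | "Write" >> beam.io.WriteToText("counts")              # write to a sink
    )
```

Note that nothing in the pipeline coordinates individual workers; the runner decides how each step is parallelized and distributed.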

Building an automated data pipeline from BigQuery to Earth Engine with Cloud Functions

Over the years, vast amounts of satellite data have been collected, and ever more granular data are being collected every day. Until recently, those data were an untapped asset in the commercial space, largely because neither the tools required for large-scale analysis of this type of data nor the satellite imagery itself were readily available. Thanks to Earth Engine, a planetary-scale platform for Earth science data & analysis, that is no longer the case.
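
As a rough sketch of how the pieces of such a pipeline can fit together, the Cloud Function below runs a BigQuery query, extracts the result to Cloud Storage, and kicks off an Earth Engine table ingestion. This is an illustration under stated assumptions, not the post's exact method: the project, bucket, table, and asset names are hypothetical, the function's service account is assumed to be registered with Earth Engine, and the ingestion manifest fields should be checked against the current earthengine-api.

```python
import ee
import functions_framework
from google.cloud import bigquery

PROJECT = "my-project"                              # hypothetical project
EXPORT_URI = "gs://my-bucket/exports/features.csv"  # hypothetical bucket path

@functions_framework.http
def bigquery_to_earth_engine(request):
    client = bigquery.Client(project=PROJECT)

    # 1. Run the query; results land in a temporary destination table.
    job = client.query(
        "SELECT lon, lat, ndvi FROM `my-project.my_dataset.features`"  # hypothetical table
    )
    job.result()  # block until the query finishes

    # 2. Extract that table to Cloud Storage as CSV.
    client.extract_table(job.destination, EXPORT_URI).result()

    # 3. Start an Earth Engine table ingestion from the exported CSV.
    #    (Manifest fields may vary by earthengine-api version.)
    ee.Initialize()
    task_id = ee.data.newTaskId()[0]
    ee.data.startTableIngestion(task_id, {
        "name": f"projects/{PROJECT}/assets/features",  # hypothetical asset ID
        "sources": [{"uris": [EXPORT_URI]}],
    })
    return "ingestion started", 200
```

Wiring the function to a Cloud Scheduler or Pub/Sub trigger would make the hand-off fully automated rather than on-demand.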

Analyzing satellite images in Google Earth Engine with BigQuery SQL

Google Earth Engine (GEE) is a groundbreaking product that has been available for research and government use for more than a decade. Google Cloud recently launched GEE to General Availability for commercial use. This blog post describes a method for using GEE from within BigQuery SQL, allowing SQL speakers to access, and extract value from, the vast troves of data available within Earth Engine.
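
One plausible wiring for this, sketched here under assumptions rather than taken from the post itself, is a BigQuery remote function backed by a Cloud Function that evaluates each row against Earth Engine. The asset ID, band name, and function name below are all hypothetical.

```python
import ee
import functions_framework

ee.Initialize()  # assumes the function's service account is registered with Earth Engine

@functions_framework.http
def ndvi_at_point(request):
    # BigQuery remote functions POST a JSON body like {"calls": [[lon, lat], ...]}
    calls = request.get_json()["calls"]
    image = ee.Image("projects/my-project/assets/ndvi")  # hypothetical EE asset
    replies = []
    for lon, lat in calls:
        stats = image.reduceRegion(
            reducer=ee.Reducer.mean(),
            geometry=ee.Geometry.Point([lon, lat]),
            scale=30,
        )
        replies.append(stats.get("ndvi").getInfo())  # hypothetical band name
    # BigQuery expects one reply per call, in order.
    return {"replies": replies}
```

On the SQL side, a CREATE FUNCTION ... REMOTE WITH CONNECTION statement pointing at the deployed endpoint would expose ndvi_at_point to ordinary queries, so the Earth Engine round-trip looks like any other scalar function call.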

How to simplify and fast-track your data warehouse migrations using BigQuery Migration Service

Migrating data to the cloud can be a daunting task. Moving data out of warehouses and legacy environments, in particular, demands a systematic approach: these migrations typically require manual effort, can be error-prone, and are complex, involving several steps such as planning, system setup, query translation, schema analysis, data movement, validation, and performance optimization.
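
For the query-translation step specifically, the service exposes an API. The sketch below assumes the google-cloud-bigquery-migration Python client and a hypothetical Teradata-to-BigQuery batch translation; the project, region, and Cloud Storage paths are placeholders.

```python
from google.cloud import bigquery_migration_v2

client = bigquery_migration_v2.MigrationServiceClient()
parent = "projects/my-project/locations/us"  # hypothetical project and region

# Translate Teradata SQL files from one bucket into BigQuery SQL in another.
translation_config = bigquery_migration_v2.TranslationConfigDetails(
    gcs_source_path="gs://my-bucket/teradata-sql",   # hypothetical input path
    gcs_target_path="gs://my-bucket/bigquery-sql",   # hypothetical output path
    source_dialect=bigquery_migration_v2.Dialect(
        teradata_dialect=bigquery_migration_v2.TeradataDialect(
            mode=bigquery_migration_v2.TeradataDialect.Mode.SQL
        )
    ),
    target_dialect=bigquery_migration_v2.Dialect(
        bigquery_dialect=bigquery_migration_v2.BigQueryDialect()
    ),
)

workflow = bigquery_migration_v2.MigrationWorkflow(
    display_name="teradata-to-bq-translation",
    tasks={
        "translate": bigquery_migration_v2.MigrationTask(
            type_="Translation_Teradata2BQ",
            translation_config_details=translation_config,
        )
    },
)

response = client.create_migration_workflow(
    parent=parent, migration_workflow=workflow
)
print(response.name)  # poll this workflow to track task state
```

The translated files can then be reviewed and validated before any data movement begins, which is where much of the error-proneness of manual rewrites is avoided.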

Scaling Kafka Brokers in Cloudera Data Hub

This blog post provides guidance for administrators who use, or are interested in using, Kafka in Cloudera Data Hub and who need to scale broker nodes up or down to balance performance and cloud costs in production deployments. Because Kafka brokers are contained within host groups, administrators can add and remove nodes more easily, giving them the flexibility to handle fluctuating real-time data feed volumes.
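
As an illustration of how such a resize might be scripted, the sketch below shells out to the CDP CLI. Treat it as an assumption-laden sketch, not the post's procedure: the subcommand and flag names should be verified against your CDP CLI version, and "broker" is a hypothetical host group name.

```python
# A sketch only: verify the subcommand and flags against `cdp datahub help`.
import subprocess

def scale_broker_host_group(cluster_name: str, desired_count: int) -> None:
    """Resize the Kafka broker host group of a Data Hub cluster via the CDP CLI."""
    subprocess.run(
        [
            "cdp", "datahub", "scale-cluster",       # assumed subcommand
            "--cluster-name", cluster_name,
            "--instance-group-name", "broker",       # hypothetical host group
            "--instance-group-desired-count", str(desired_count),  # assumed flag
        ],
        check=True,  # raise if the CLI reports an error
    )

scale_broker_host_group("my-kafka-cluster", 5)  # hypothetical cluster name
```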

How to Distribute Machine Learning Workloads with Dask

Tell us if this sounds familiar. You’ve found an awesome data set that you think will let you train a machine learning (ML) model that accomplishes the project goals; the only problem is that the data is too big to fit in the compute environment you’re using. In the day and age of “big data,” most might think this issue is trivial, but, like anything in the world of data science, things are hardly ever as straightforward as they seem.
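
As a taste of the approach, here is a minimal sketch assuming a collection of CSV files too large for memory and a local machine. Dask partitions the data lazily and only materializes results on compute(); pointing Client at a remote scheduler runs the same code on a multi-node cluster. The file pattern and column names are hypothetical.

```python
import dask.dataframe as dd
from dask.distributed import Client

client = Client()  # local cluster of worker processes; also accepts a remote scheduler address

# Read lazily: each CSV becomes one or more partitions, never all in RAM at once.
df = dd.read_csv("training-data-*.csv")  # hypothetical file pattern

# Operations build a task graph instead of executing immediately.
cleaned = df.dropna(subset=["label"])
class_means = cleaned.groupby("label")["feature_0"].mean()  # hypothetical columns

print(class_means.compute())  # the graph executes across the workers here
```

Libraries such as dask-ml extend the same pattern to model training, letting an estimator fit over partitions rather than requiring the full data set in memory at once.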