Analytics

CSV Formatting: Tips and Tricks for Data Accuracy

Comma-Separated Values (CSV) files are a cornerstone of data management. They offer a simple yet versatile format for organizing and exchanging data. CSV files are predominantly used in data analysis, machine learning, and database migrations. Their ability to hold large datasets in a plain-text format makes them well suited to these use cases.

How to Use Confluent for Kubernetes to Manage Resources Outside of Kubernetes

Apache Kafka® cluster administrators often need to solve problems such as onboarding new teams, managing resources like topics or connectors, and maintaining permission control over those resources. In this post, we will demonstrate how to use Confluent for Kubernetes (CfK) to enable GitOps with a CI/CD pipeline and delegate resource creation to groups of people without sharing admin credentials across the organization.

Accelerating Queries on Iceberg Tables with Materialized Views

This blog post describes support for materialized views for the Iceberg table format in Cloudera Data Warehouse. Apache Iceberg is a high-performance open table format for petabyte-scale analytic datasets. It has been designed and developed as an open community standard to ensure compatibility across languages and implementations.

Top 3 Data + AI Predictions for Manufacturing in 2024

Investment in AI for manufacturing is expected to grow by 57% by 2026. That’s hardly surprising — with AI’s ability to augment worker productivity, improve efficiency and drive innovation, its potential in manufacturing is vast. AI’s predictive capabilities can help manufacturing leaders anticipate market trends and make data-driven decisions, creating financial opportunities for suppliers as well as customers.

Tabular Reporting - Do More with Qlik Webinar Replay

This session will demonstrate how Tabular Reporting used within Qlik Sense Applications enables users to efficiently address and manage common operational report creation and distribution requirements. Attendees will discover how report developers can create formatted Excel Templates directly from Qlik data and visualizations. The webinar will also highlight the power of governed Report Tasks, showcasing the seamless distribution and “bursting” of reports to stakeholders. By leveraging Tabular Reporting, the Qlik platform becomes the central source for crucial operational decisions, customer communications, and more.

What is the Transactional Outbox Pattern? | Designing Event-Driven Microservices

The transactional outbox pattern uses a database transaction to update a microservice's state and an outbox table together. Events in the outbox are then sent to an external messaging platform such as Apache Kafka. This technique overcomes the dual-write problem, which occurs when you have to write data to two separate systems, such as a database and Apache Kafka. The database transaction ensures that the writes to the two tables are atomic. From there, a separate process can consume the outbox and update the external system as required.
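
The flow described above can be sketched in a few lines. This is a minimal illustration, not the implementation from the linked post: it assumes SQLite as the database, and the table names (`orders`, `outbox`), the `sent` flag, the function names, and the `publish` callback standing in for a Kafka producer are all hypothetical.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one state table and one outbox table.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, order_id, item):
    # A single transaction covers both inserts: the connection context
    # manager commits on success and rolls back if either statement fails.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        event = json.dumps({"type": "order_placed", "order_id": order_id, "item": item})
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (event,))


def relay_outbox(conn, publish):
    # Separate process in a real system: read unsent events in order,
    # hand each to the messaging platform, then mark it as sent.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(payload)  # stand-in for a Kafka producer call
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because the relay publishes an event before marking it as sent, a crash between those two steps causes the event to be published again on the next run. In this sketch delivery is therefore at-least-once, so consumers should tolerate duplicates.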

The Best Data Lake Tools: A Buyer's Guide

A data lake is a central storage repository that can hold vast amounts of raw, unstructured data. A data lake is not the same as a data warehouse, which maintains data in structured files. Five key takeaways about data lake tools: A data warehouse uses a hierarchical structure, whereas the architecture of a data lake is flat.