
Analytics

Snowflake Workloads Explained: Data Lakes

Snowflake’s cross-cloud platform breaks down silos by supporting a variety of data types and storage patterns. Data engineers, data scientists, analysts, and developers across organizations can access governed structured, semi-structured, and unstructured data for many different workloads, without resource contention or concurrency issues.

DataFinOps: More on the menu than data cost governance

IT and data executives face a quandary: how to wrangle an exponentially increasing volume of data to support their business requirements without breaking an increasingly constrained IT budget. Like an overeager diner at a buffet who has already loaded their plate with the cheap carbs of potatoes and noodles before reaching the protein-packed entrees, they need to survey all of the data options on the menu before formulating their plans for this trip.

Isn't the Data Warehouse the Same Thing as the Data Lakehouse?

A data lakehouse is a data storage repository designed to hold both structured and unstructured data. It lets users access data stored in different formats, such as plain-text, CSV, or JSON files, and use that data for analysis and reporting.

Business Metric Strategies: How To Choose the Right Framework To Measure Success

Business metrics provide a quantifiable way to measure the success of a business. They help organizations to track their progress internally while also serving as a way to communicate the performance of a business to stakeholders and external parties. There are hundreds of metrics that could be factored into these calculations, but they need to be specific to an organization to be effective.

Transaction Support Using Apache Phoenix

This video provides a short demo of Apache Phoenix transaction support in Cloudera Operational Database (COD). COD supports the Apache OMID (Optimistically Transaction Management In Datastores) transactional framework, which lets you perform complex distributed transactions and run atomic cross-row and cross-table database operations. Atomicity ensures that each such operation either completes in full or is terminated without leaving partial changes behind.
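
To make "atomic cross-row and cross-table" concrete, here is a minimal sketch of the idea. It uses Python's built-in SQLite driver as a stand-in, not Phoenix or OMID, and the `accounts` and `audit_log` tables and the `transfer` helper are hypothetical names chosen for illustration.

```python
import sqlite3

# Stand-in database: SQLite in memory, NOT Phoenix/OMID.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("CREATE TABLE audit_log (note TEXT)")
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Touch two rows and a second table; all writes commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "INSERT INTO audit_log VALUES (?)",
                (f"{src}->{dst}: {amount}",),
            )
            # Violates the CHECK constraint when src has too little money.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "a", "b", 30)       # succeeds
failed = transfer(conn, "a", "b", 500)  # fails and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer fails on its last statement, yet the credit to `b` and the audit row written earlier in the same transaction are undone as well. That all-or-nothing behavior across rows and tables is what the teaser describes; the way a distributed store achieves it differs from this single-file example.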

The 7 Best Airflow Alternatives in 2023

Who doesn’t love Apache Airflow? The Python-based open-source tool allows us to schedule and automate workflows with DAGs (Directed Acyclic Graphs). Data teams use Airflow for a myriad of use cases: from building ETL data pipelines to launching machine learning apps. The open-source tool makes workflow management easy: it is extensible, easy to monitor from the intuitive user interface in real time, and it allows you to build dependencies between jobs.
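
For readers new to the term, a DAG is simply a set of jobs plus "runs after" dependencies, with no cycles. The sketch below illustrates that idea using only Python's standard library (`graphlib`, Python 3.9+). It is not Airflow code, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(deps):
    """Return one valid execution order for an acyclic dependency graph."""
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A scheduler such as Airflow layers scheduling, retries, and monitoring on top of this kind of ordering; the sketch covers only the dependency-resolution part.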