Data Lakes

Transforming Manufacturing Data: The Power of Qlik and Databricks Together

Manufacturing is undergoing a massive transformation, driven by technological advancements that generate vast amounts of data. The industry is moving towards becoming smarter, more sustainable, and services-driven. The fragmented nature of manufacturing’s data architecture, however, has created barriers to realizing the full value of data, with many projects stalling at the proof-of-concept stage.

Data lake vs. data mesh: Which one is right for you?

What’s the right way to manage growing volumes of enterprise data, while providing the consistency, data quality and governance required for analytics at scale? Is centralizing data management in a data lake the right approach? Or is a distributed data mesh architecture right for your organization? When it comes down to it, most organizations seeking these solutions are looking for a way to analyze data without having to move or transform it via complex extract, transform and load (ETL) pipelines.

Open Data Lakehouse powered by Apache Iceberg on Apache Ozone

Getting started with Iceberg on Ozone in CDP Private Cloud takes minimal setup. The combination lets you reap the benefits of both a powerful exabyte-scale storage system and an optimized table format for petabyte-scale analytics. In this video I'm going to demonstrate how to create, upgrade and use Iceberg tables on Ozone in CDP Private Cloud. Iceberg is engine-agnostic and works with most analytic query engines, such as Hive, Impala and Spark.

Educating ChatGPT on Data Lakehouse

As the use of ChatGPT becomes more prevalent, I frequently encounter customers and data users citing ChatGPT’s responses in their discussions. I love the enthusiasm surrounding ChatGPT and the eagerness to learn about modern data architectures such as data lakehouses, data meshes, and data fabrics. ChatGPT is an excellent resource for gaining high-level insights and building awareness of any technology. However, caution is necessary when delving deeper into a particular technology.

Snowflake Workloads Explained: Data Lakes

Snowflake’s cross-cloud platform breaks down silos by supporting a variety of data types and storage patterns. Data engineers, data scientists, analysts, and developers across organizations can access governed structured, semi-structured, and unstructured data for a variety of workloads, without resource contention or concurrency issues.

Isn't the Data Warehouse the Same Thing as the Data Lakehouse?

A data lakehouse is a data storage repository designed to store both structured data and data from unstructured sources. It allows users to access data stored in different forms, such as text files, CSV or JSON files. Data stored in a data lakehouse can be used for analysis and reporting purposes.

From Data Warehouse to Lakehouse

This is a guest post for Integrate.io written by Bill Inmon, an American computer scientist recognized as the "father of the data warehouse." Inmon wrote the first book and first magazine column about data warehousing, held the first conference about this topic, and was the first person to teach data warehousing classes.

How to Integrate BI and Data Visualization Tools with a Data Lake

For the past 30 years, the primary data source for business intelligence (BI) and data visualization tools has generally been either a data warehouse or a data mart. But as enterprises today struggle to cope with the growing complexity, scale, and speed of data, it’s becoming clear that the data tools of 30 years ago weren’t designed to handle the enterprise data management challenges of today, especially given the growing variety and volume of data that enterprises are generating.

From Data Lake to Data Mesh: How Data Mesh Benefits Businesses

Data architecture is going through a revolution. Enterprises are starting to shift away from the monolithic data lake towards something less centralized: data mesh. It’s a relatively new concept, first coined in 2019, that addresses issues with data warehouses and data lakes that can leave businesses slow, unresponsive, or even burdened with data silos. What is a data mesh, and how could it benefit your business?

All the Features A Robust Data Lake Should Have

From databases to data warehouses and, finally, to data lakes, the data landscape is changing rapidly as volumes and sources of data increase. With a projected compound annual growth rate of almost 30%, the data lake market is expected to grow from USD 3.74 billion in 2020 to USD 17.6 billion by 2026. The 2022 Data and AI Summit also made it clear that data lake architecture is the future of data management and governance.