Latest News

Let Real-Time Data Visualization Drive Your Storytelling

Stories are the crux of effective communication. According to a Stanford University study, nearly two-thirds of people remember a story that’s part of a presentation. The study also found that speakers who merely present facts and figures only achieve a 5% recall rate among their audience. When your customers deliver analytics and reporting, the data visualization experience should be a memorable one.

Comparing Data Visualizations: Bar vs. Stacked, Icons vs. Shapes, and Line vs. Area

Great data visualizations have the power to persuade decision makers to take immediate, appropriate action. When done well, data visualizations help users intuitively grasp data at a glance and provide more meaningful views of information in context. Good data visuals give busy workers a high-level summary of important data. They also offer a big-picture perspective and highlight trends, anomalies, and outliers while giving users the option to drill down into details and ask new questions when needed.

From Hive Tables to Iceberg Tables: Hassle-Free

For more than a decade, the Hive table format has been a ubiquitous presence in the big data ecosystem, managing petabytes of data with remarkable efficiency and scale. But as data volumes, data variety, and data usage grow, users face many challenges with Hive tables because of their antiquated directory-based table format. Common issues include constrained schema evolution, static partitioning of data, and long planning times caused by S3 directory listings.

12 Times Faster Query Planning With Iceberg Manifest Caching in Impala

Iceberg is an emerging open table format designed for large analytic workloads. The Apache Iceberg project continues to develop an implementation of the Iceberg specification in the form of a Java library. Several compute engines, such as Impala, Hive, Spark, and Trino, support querying data in the Iceberg table format by adopting this Java library.

Integrating Cloudera Data Warehouse with Kudu Clusters

Apache Impala and Apache Kudu make a great combination for real-time analytics on streaming data for time series and real-time data warehousing use cases. Over the last decade, more than 200 Cloudera customers have successfully implemented Apache Kudu with Apache Spark for ingestion and Apache Impala for real-time BI use cases, with thousands of nodes running Apache Kudu.

How to Evolve Your Power BI Solution With Yellowfin

Microsoft Power BI is a ubiquitous business intelligence (BI) tool that is inexpensive to start with and can create a good foundation for analytics capabilities at any company. As with Tableau, the objective is to drive broad adoption within an organization and replace Excel with a more powerful and structured tool.