Latest News

Spark APM - What is Spark Application Performance Management

Apache Spark is a fast and general-purpose engine for large-scale data processing. It’s most widely used to replace MapReduce for fast processing of data stored in Hadoop. Designed specifically for data science, Spark has evolved to support more use cases, including real-time stream event processing. Spark is also widely used in AI and machine learning applications.

Cloudera Replication Plugin enables x-platform replication for Apache HBase

The Cloudera Data Platform (CDP) is the latest Big Data offering from Cloudera. It includes Apache HBase and Phoenix as part of the platform. These two components are provided in three form factors. Cloudera’s Apache HBase customers typically run mission-critical applications that cannot afford any downtime. They need a way to migrate to a new deployment without a production outage or, failing that, with only a very brief one.

The role of data in COVID-19 vaccination record keeping

Now that the Pfizer vaccine has been approved by the FDA for use in the US, and the Moderna vaccine likely isn’t far behind, we are on the verge of being able to emerge from the social distancing world that began earlier in 2020. Recent news reports have discussed distributing a vaccination record card to everyone who gets a COVID-19 vaccine.

How businesses use automated monitoring

One of the big trends we’ve seen this year is organizations going direct to consumer. Manufacturers who sold through retail outlets are moving online, and as a result a huge amount of digital transformation is occurring. A customer of ours has done exactly that. Kyowa is a Japanese cosmetics and health food company and they’ve moved from retail to online and digital, and Yellowfin has been a significant part of that journey. In particular, they’ve used Signals.

405% 3-year ROI Procuring Snowflake Through AWS Marketplace: New Forrester TEI Study

Snowflake is delighted to share the findings of a new Forrester Consulting Total Economic Impact™ (TEI) study that examines the potential return on investment for organizations that procure Snowflake through Amazon Web Services (AWS) Marketplace and then use Snowflake as a core part of their application architecture. We commissioned the study in partnership with AWS.

Managing Snowflake's Compute Resources

This is the third post in our series on Snowflake Resource Optimization. In parts 1 and 2 of this series, we showed you how Snowflake’s unique architecture allows for a virtually unlimited number of compute resources to be accessed near-instantaneously. We also provided best practices for administering these compute resources to optimize performance and reduce credit consumption.

Bringing transaction support to Cloudera Operational Database

We’re excited to share that after adding ANSI SQL, secondary indices, star schema, and view capabilities to Cloudera’s Operational Database, we will be introducing distributed transaction support in the coming months. The ACID model of database design is one of the most important concepts in databases. ACID stands for atomicity, consistency, isolation, and durability. For a very long time, strict adherence to these four properties was required for a commercially successful database.
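To make the ACID properties concrete, here is a minimal sketch of atomicity. It uses Python's built-in `sqlite3` module purely as a stand-in, not Cloudera's Operational Database, and the `accounts` table, the `transfer` helper, and the account names are hypothetical. The point it illustrates is that a transaction that fails partway leaves no partial changes behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single atomic unit."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when funds are short,
            # which must also undo the credit applied just above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Hypothetical in-memory data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # enough funds: both updates commit
failed = transfer(conn, "alice", "bob", 500)  # too little: both updates roll back
print(ok, failed, balances(conn))
```

The second transfer credits the destination account before the debit fails, so the unchanged balances afterwards show the rollback covering the whole transaction rather than only the failing statement.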

How does Apache Spark 3.0 increase the performance of your SQL workloads

Across nearly every sector working with complex data, Spark has quickly become the de facto distributed computing framework for teams across the data and analytics lifecycle. One of the most awaited features of Spark 3.0 is the new Adaptive Query Execution (AQE) framework, which addresses issues that have plagued many Spark SQL workloads. Those issues were documented in early 2018 in this blog post from a mixed Intel and Baidu team.
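As a rough illustration of how AQE is switched on, the sketch below collects the relevant Spark SQL properties and renders them as `--conf` arguments for `spark-submit`. The property names are Spark's own; the `AQE_SETTINGS` dictionary and the `spark_submit_conf_args` helper are hypothetical names introduced for this example. In Spark 3.0, AQE is off by default, so it has to be enabled explicitly.

```python
# Spark SQL properties related to Adaptive Query Execution (Spark 3.0+).
AQE_SETTINGS = {
    # Master switch for AQE.
    "spark.sql.adaptive.enabled": "true",
    # Merge small shuffle partitions after a stage finishes.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Split skewed partitions in sort-merge joins.
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def spark_submit_conf_args(settings):
    """Render a settings dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(["spark-submit"] + spark_submit_conf_args(AQE_SETTINGS)))
```

The same keys can also be set on a running session or in a cluster's default configuration; the command-line form is shown here only because it needs no Spark installation to inspect.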