BI

Best practices of migrating Hive ACID Tables to BigQuery

Are you looking to migrate a large number of Hive ACID tables to BigQuery? ACID-enabled Hive tables support transactions that accept update and delete DML operations. In this blog, we will explore migrating Hive ACID tables to BigQuery. The approach explored in this blog works for both compacted (major/minor) and non-compacted Hive tables. Let’s first understand the term ACID and how it works in Hive. ACID stands for four traits of database transactions: atomicity, consistency, isolation, and durability.

Struggling to Scale: How Finance Can Do More with Less

As the strategic role of finance teams continues to evolve, the Office of the CFO faces many new responsibilities. Resource allocation, however, does not always grow in tandem with those responsibilities, leading to scalability challenges for finance teams tasked with doing more with fewer resources.

Fraud Detection with Cloudera Stream Processing

This video shows how Cloudera DataFlow, powered by Apache NiFi, solves the first-mile problem by making it easy and efficient to acquire, transform, and move data, enabling streaming analytics use cases with very little effort. It also briefly discusses the advantages of running this flow in a cloud-native Kubernetes deployment of Cloudera DataFlow. We then explore how to run real-time streaming analytics using Apache Flink, and use the Cloudera SQL Stream Builder GUI to create streaming jobs using only SQL (no Java/Scala coding required).

How Universal Data Distribution Accelerates Complex DoD Missions

We’ve come a long way since 1778, when George Washington’s spies gathered and shared military intelligence on the British Army’s tactical operations in occupied New York. But information broadly, and the management of data specifically, is still “the” critical factor for situational awareness, streamlined operations, and a host of other use cases across today’s tech-driven battlefields.

Getting Started with Cloudera Stream Processing Community Edition

Cloudera has a strong track record of providing a comprehensive solution for stream processing. Cloudera Stream Processing (CSP), powered by Apache Flink and Apache Kafka, provides a complete stream management and stateful processing solution. In CSP, Kafka serves as the storage streaming substrate, and Flink as the core in-stream processing engine that supports SQL and REST interfaces.