Analytics

Enabling high-speed Spark direct reader for Apache Hive ACID tables

Apache Hive supports transactional tables that provide ACID guarantees. A significant amount of work has gone into Hive to make these transactional tables highly performant. Apache Spark provides some capabilities for accessing Hive external tables, but it cannot access Hive managed tables. To access Hive managed tables from Spark, the Hive Warehouse Connector must be used.

Use AI To Quickly Handle Sensitive Data Management

The growing waves of data that you’re pulling in include sensitive, personal or confidential data. This can become a compliance nightmare, especially with rules around PII, GDPR and CCPA, and it takes too much time to manually decide what should be protected. In this session, we will show how AI-driven data catalogs can identify sensitive data and share that identification with your data security platforms to automate its discovery, identification and security. You’ll see how this dramatically reduces your time to onboard data and makes it safely available to your business communities.

What is data modeling and how can you model data for higher analytical outputs?

Being data-driven helps businesses to cut costs and produce higher returns on investments, increasing their financial viability in the fight for a piece of the market pie. But *becoming* data-driven is a more labor-intensive process. In the same way that companies must align themselves around business objectives, data professionals must align their data around data models. In other words: if you want to run a successful data-driven operation, you need to model your data first.

Amazon EMR Insider Series: Optimizing big data costs with Amazon EMR & Unravel

Data is a core part of every business. As data volumes increase, so do the costs of processing them. Whether you are running your Apache Spark, Hive, or Presto workloads on-premises or on AWS, Amazon EMR is a sure way to save money. In this session, we’ll discuss several best practices and new features that enable you to cut your operating costs when processing vast amounts of data using Amazon EMR.

5 Pointers For Great Analytics Storytelling

Most of us know the story of “The Tortoise and the Hare.” It is one of Aesop’s classic fables in which a speedy, overconfident hare becomes complacent and realizes, all too late, that the tortoise, although outmatched, has managed to beat him in a race. It teaches us lessons about overconfidence and perseverance and has caused phrases like “slow and steady wins the race” to creep into our everyday language.

Adoption of a Cloud Data Platform, Intelligent Data Analytics While Maintaining Security, Governance and Privacy

“You cannot be the same, think the same and act the same if you hope to be successful in a world that does not remain the same.” This quote from John C. Maxwell is especially relevant to rapidly changing cloud hosting technology. Businesses understand the added value and are looking at cloud technologies to handle both operational and analytical workloads.

Digital Transformation is Way More than Just Digital

Over the last 25 years, I have had an unparalleled front-row seat to the digital transformation that is now accelerating in the connected manufacturing and automotive industry. Not many people have had the opportunity to witness this transformation and be as active in this area as I have; I consider myself lucky.