Analytics

Benchmarking Ozone: Cloudera's next-generation Storage for CDP

Apache Hadoop Ozone was designed to address the scale limitations of HDFS with respect to small files and the total number of file system objects. On current data center hardware, HDFS has a limit of about 350 million files and 700 million file system objects. Ozone’s architecture addresses these limitations[4]. This article compares the performance of Ozone with HDFS, the de facto big data file system.

How to get value out of your embedded analytics

Over the years, we’ve worked with a lot of software vendors who have embedded analytics into their products, and there’s a range of reasons why they’ve chosen to do that. Some want to modernize existing analytics with a better solution, while others want to engage with more users or extend the use of their application to the C-Suite by delivering something of value to management, like reporting.

Searcher Seismic is utilizing seismic data for the oil and gas industry, providing a map to de-risk exploration

In today’s age of technology, the processing of seismic data requires powerful computers, talented researchers, software, and skills. For the oil and gas industry, it’s paramount to making strategic business decisions. Seismic data helps to plan wells accurately, reduce the need for further exploration, and minimize the impact on the environment.

Fresh Features: first-rate filters

Filtering is an underappreciated feature of business intelligence and analytics. Yet filters are critical to data analysis. Of all the possible interaction types, filters will probably be the one end users rely on most. Welcome to part 3 of Yellowfin 9 Fresh Features. If you missed part 2, check out the enhancements to Yellowfin's automated data discovery, Signals. Yellowfin has rich filter functionality that isn't available in some of the other leading analytics platforms.

Disk and Datanode Size in HDFS

This blog answers questions such as what the right disk size is for a datanode and what the right capacity is for a datanode. A few of our customers have asked us about using dense storage nodes. It is certainly possible to use dense nodes for archival storage because IO bandwidth requirements are usually lower for cold data. However, the decision to use denser nodes for hot data must be evaluated carefully, as it can have an impact on the performance of the cluster.

The Value of Options in the Data Integration and Analytics Supply Chain

Over the course of my career in Financial Services, I have struggled with how few options I really had when it came to delivering the right information to the right people at the right time. It sounds sort of ridiculous considering how much time, money and effort the firms I worked for spent on data warehouses, reporting systems, business intelligence tools and advanced analytics.

Why data catalogs are on the rise

A really interesting development I’ve seen in the data and analytics space lately is the rise of the data catalog. You may know these by another name, such as a semantic or metadata layer, but they’re all fundamentally the same thing. Data catalogs aren’t new; they’ve been around for a long time. While some vendors like Yellowfin and Cognos have always had them, others like Tableau and Qlik are only now getting to them.
