Latest News

Migrating Data Pipelines from Enterprise Schedulers to Airflow

At Airflow Summit 2021, Unravel’s co-founder and CTO, Shivnath Babu, and Senior Software Engineer Hari Nyer delivered a talk titled "Lessons Learned while Migrating Data Pipelines from Enterprise Schedulers to Airflow." This story, along with the slides and videos included in it, comes from that presentation.

How to load Salesforce data into BigQuery using a code-free approach powered by Cloud Data Fusion

Organizations are increasingly investing in modern cloud warehouses and data lake solutions to augment analytics environments and improve business decisions. The business value of such repositories increases as customer relationship data is loaded and additional insights are generated.

BigQuery Admin reference guide: Recap

Over the past few weeks, we have been publishing videos and blogs that walk through the fundamentals of architecting and administering your BigQuery data warehouse. Throughout this series, we have focused on teaching foundational concepts and applying best practices observed directly from customers. Below, you can find links to each week’s content. Query Processing: Ever wonder what happens when you click “run” on a new BigQuery query?

Dimagi implements Passerelle Data Rocket to accelerate state and local COVID-19 response

Frontline healthcare providers don’t always have access to the latest and greatest technology. But when they are trying to fight a global pandemic with pen-and-paper tracking systems, something has to change. Dimagi is a tech company on a mission: to deliver scalable digital solutions for organizations to amplify their frontline impact.

Buying and selling your home with data: A Q&A with Opendoor CTO Ian Wong

While many businesses struggled to keep pace with the changing economics of a global pandemic, the real estate industry was booming. The housing market reached record-breaking heights last month, with median existing-home prices rising 17.2% over the prior year. This price increase was compounded by accelerated closing times, as the average house sold in 18 days, a record low.

Spark vs. Tez: What's the Difference?

Let's get started with this great debate. First, a step back: we’ve pointed out that Apache Spark and Hadoop MapReduce are two different Big Data beasts. The former is a high-performance in-memory data-processing framework, and the latter is a mature batch-processing platform for the petabyte scale. We also know that Apache Hive and HBase are two very different tools with similar functions. Hive is a SQL-like engine that runs MapReduce jobs, while HBase is a NoSQL key/value database on Hadoop.

Enter the World of Automated Data Management and Governance with Hitachi's Lumada Data Catalog

The era of manual data management and governance is rapidly coming to a close. The size of the trove of data at nearly every company has become so enormous that it cannot be maintained using manual cleaning, cataloging, governance and search methods. The release of Lumada Data Catalog 6.1 breaks new ground in automating data management, cleaning and governance processes, making it easier to find data and grant access to those who need it.