Spark Troubleshooting Solutions - DataOps, Spark UI or logs, Platform or APM Tools

Spark is known for being extremely difficult to debug. But this is not all Spark’s fault. Problems in running a Spark job can stem from the infrastructure Spark is running on, inappropriate Spark configuration, issues in Spark itself, the currently running Spark job, other Spark jobs running at the same time – or interactions among these layers.

Migrating Data Pipelines from Enterprise Schedulers to Airflow

At Airflow Summit 2021, Shivnath Babu, Unravel’s co-founder and CTO, and Hari Nyer, Senior Software Engineer, delivered a talk titled Lessons Learned while Migrating Data Pipelines from Enterprise Schedulers to Airflow. This story, along with the slides and videos included in it, comes from that presentation.

Driving Data Governance and Data Products at ING Bank France

In this episode of Data+AI Battlescars, Sandeep Uttamchandani, Unravel Data’s CDO, speaks with Samir Boualla, CDO at ING Bank France, one of the largest banks in the world. They cover his battlescars in driving data governance across business teams and building data products. As Chief Data Officer, Samir is responsible for several teams that govern, develop, and manage data infrastructure and data assets to deliver value to the business.

Spark Troubleshooting, Part 1 - Ten Challenges

“The most difficult thing is finding out why your job is failing, which parameters to change. Most of the time, it’s OOM errors…” (Jagat Singh, Quora)

Spark has become one of the most important tools for processing data – especially non-relational data – and deriving value from it. Spark also serves as a platform for the creation and delivery of analytics, AI, and machine learning applications, among others.

Simplifying Data Management at LinkedIn Part 2

In the second of this two-part episode of Data+AI Battlescars, Sandeep Uttamchandani, Unravel Data’s CDO, speaks with Kapil Surlaker, VP of Engineering and Head of Data at LinkedIn. In part one, they covered LinkedIn’s challenges related to metadata management and data access APIs. This second part dives deep into data quality.

Simplifying Data Management at LinkedIn Part 1

In the first of this two-part episode of Data+AI Battlescars, Sandeep Uttamchandani, Unravel Data’s CDO, speaks with Kapil Surlaker, VP of Engineering and Head of Data at LinkedIn. They cover LinkedIn’s challenges related to metadata management and data access APIs. Part 2 will dive deep into data quality.