Systems | Development | Analytics | API | Testing

Analytics

5 Tips to Use Heroku and ETL to Automate Reporting

Heroku is a cloud platform as a service (PaaS) for efficiently building, deploying, monitoring, and scaling applications. Originally created to work with the Ruby programming language, Heroku is now part of the Salesforce platform and supports languages such as Java, Node.js, PHP, Python, and Scala. While Heroku makes it easy to develop production-ready applications fast, one question remains: how can you integrate your Heroku app data with the rest of your data infrastructure and workflows?

Managing Python dependencies for Spark workloads in Cloudera Data Engineering

Apache Spark is now widely used in many enterprises for building high-performance ETL and Machine Learning pipelines. If the users are already familiar with Python then PySpark provides a python API for using Apache Spark. When users work with PySpark they often use existing python and/or custom Python packages in their program to extend and complement Apache Spark’s functionality. Apache Spark provides several options to manage these dependencies.

Reasons Why Cloud Migrations Fail & Ways to Succeed

Organizations are moving big data from on-premises to the cloud, using best-of-breed technologies like Databricks, Amazon EMR, Azure HDI, and Cloudera, to name a few. However, many cloud migrations fail. Why? And, how can you overcome the barriers and succeed? Join Chris Santiago, Director of Solution Engineering, as he describes the biggest pain points and how you can avoid them, and make your move to the cloud a success.

Future of Data Meetup: Exploring Data and Creating Interactive Dashboards in the Cloud

In this meetup, we’re going to once again put ourselves in the shoes of an electric car manufacturer that is deploying a recently developed electric motor out into their new cars. We’re going to show how to explore some data that has been previously collected through various different sources and stored into Apache Hive within a data warehouse, with the goal of tracking down a specific set of potentially defective parts. We’ll then take the results of this data exploration and create an interactive dashboard that presents our results in a visually appealing way using a BI tool that’s integrated right into the same data warehouse.

Fast Forward Live: Few-Shot Text Classification

Join us for this month's Machine Learning research discussion with Cloudera Fast Forward Labs. We will discuss few-shot text classification - including a live demo and Q&A. This is an applied research report by Cloudera Fast Forward. We write reports about emerging technologies. Accompanying each report are working prototypes or code that exhibits the capabilities of the algorithm and offer detailed technical advice on its practical application.