Making data-intensive processing efficient and portable with Apache Beam

The appearance of Hadoop and its related ecosystem was like a Cambrian explosion of open source tools and frameworks for processing large amounts of data. But companies that invested early in big data ran into some challenges. For example, they needed engineers with expert knowledge not only of distributed systems and data processing but also of Java and the related JVM-based languages and tools.

How to Develop a Data Processing Job Using Apache Beam

Are you familiar with Apache Beam? If not, don’t be ashamed. As one of the latest projects developed by the Apache Software Foundation, first released in June 2016, Apache Beam is still relatively new in the data processing world. As a matter of fact, it wasn’t until recently, when I started to work closely with Apache Beam, that I loved to learn and learned to love everything about it.

Why Paddy Power Betfair Bet on a Cloud Architecture for Big Data

Formed in 2016, Paddy Power Betfair (PPB) is the world’s largest publicly quoted sports betting and gaming company, bringing “excitement to life” for five million customers worldwide. The merger of Paddy Power and Betfair created an additional data challenge for an already highly data-driven organization: the merged company had to bring together 70 TB of data, from dozens of sources, into an integrated platform.

How to Go Serverless with Talend & AWS Lambda

Recently I found myself in a predicament that many of you can relate to: trying to update an aging application that had become too difficult to manage and too costly to keep operating. As we started to talk about what to do, we concluded it was time to start decomposing that application into smaller, more manageable pieces.