Does Financial Crime Increase During a Recession?

The dynamic and interconnected world of global ecommerce, cryptocurrencies, and alternative payments places increased pressure on anti-financial crime measures to keep pace and transform alongside these initiatives. Consumers worldwide are projected to use mobile devices to make more than 30.7 billion ecommerce transactions by 2026, a five-fold increase over the 6.1 billion predicted for 2022.

Scalable Python on BigQuery using Dask and NVIDIA GPUs

BigQuery is Google Cloud’s fully managed serverless data platform that supports querying using ANSI SQL. BigQuery also has a data lake storage engine that unifies SQL queries with other open source processing frameworks such as Apache Spark, TensorFlow, and Dask. BigQuery storage provides an API layer for OSS engines to process data. This API enables mixing and matching programming in languages like Python with structured SQL in the same data platform.

[MLOPS] From experiment management to model serving and back. A complete use case, step-by-step!

The recording of our talk at the MLOps World summit. This talk covers a complete example, starting from experiment management and data versioning, building up into a pipeline and finally deploying using ClearML serving with drift monitoring. We then induce artificial drift to trigger the monitoring alerts and go back down the chain to quickly retrain a model and deploy it using canary deployment.

Fraud Detection With Cloudera Stream Processing Part 2: Real-Time Streaming Analytics

In part 1 of this blog we discussed how Cloudera DataFlow for the Public Cloud (CDF-PC), the universal data distribution service powered by Apache NiFi, can make it easy to acquire data from wherever it originates and move it efficiently to make it available to other applications in a streaming fashion.

What Is a Data Pipeline and Why Your Ecommerce Business Needs One

Our six key points on data pipelines include: Whether you’re a one-person show reselling items on an online marketplace or a large ecommerce enterprise with hundreds of employees, these businesses share a common factor: both generate data. The size of your business can influence the amount of data you generate, sure. But any amount of data, if it’s not adequately accessible, is worthless. Every business, especially an ecommerce business, needs a data pipeline.

Kafka best practices: Monitoring and optimizing the performance of Kafka applications

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Administrators, developers, and data engineers who use Kafka clusters struggle to understand what is happening in their Kafka implementations.