Analytics

Streaming Pipelines With Snowflake Explained In 2 Minutes

Streaming data has historically been complex and costly to work with. That's no longer the case with Snowflake's streaming capabilities. Together, Snowpipe Streaming and Dynamic Tables (in public preview) break the barrier between batch and streaming systems. Now you can build low-latency data pipelines with serverless row-set ingestion and declarative SQL pipelines. And as business requirements change, you can adjust a pipeline's latency with a single parameter.
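
To give a feel for the "latency as a single parameter" idea, here is a minimal sketch of a Dynamic Table whose freshness is governed entirely by TARGET_LAG. The source table, target table, warehouse, and connection details (raw_events, enriched_events, TRANSFORM_WH) are hypothetical placeholders, and submitting the DDL through the Python connector is just one way to run it.

```python
# Minimal sketch: a Dynamic Table whose freshness is controlled by a single
# TARGET_LAG parameter. Table/warehouse names and connection details are
# hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # placeholder
    user="my_user",            # placeholder
    password="...",            # placeholder
    warehouse="TRANSFORM_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)

ddl = """
CREATE OR REPLACE DYNAMIC TABLE enriched_events
  TARGET_LAG = '1 minute'   -- change this one parameter to trade latency for cost
  WAREHOUSE  = transform_wh
AS
  SELECT event_id, user_id, event_ts,
         PARSE_JSON(payload):type::string AS event_type
  FROM raw_events
"""
conn.cursor().execute(ddl)
conn.close()
```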

Securely Connect to LLMs and Other External Services from Snowpark

Snowpark is the set of libraries and runtimes that enables data engineers, data scientists, and developers to build data engineering pipelines, ML workflows, and data applications in Python, Java, and Scala. User-written functions and procedures in these languages execute inside Snowpark's secure sandbox environment, which runs on the warehouse.
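
As a rough sketch of what calling an external service from that sandbox can look like, the snippet below registers a Snowpark Python UDF that posts a prompt to a hypothetical LLM endpoint. The integration name, endpoint URL, and connection parameters are placeholders, and the external access integration itself (along with the `external_access_integrations` registration option, assumed available in recent snowflake-snowpark-python releases) has to be set up separately by an administrator.

```python
# Sketch: a Snowpark Python UDF that calls an external HTTP endpoint (e.g., an
# LLM API). LLM_ACCESS_INTEGRATION, the endpoint URL, and the connection
# parameters are hypothetical placeholders.
from snowflake.snowpark import Session
from snowflake.snowpark.types import StringType

session = Session.builder.configs({
    "account": "my_account", "user": "my_user", "password": "...",
    "warehouse": "DEV_WH", "database": "ANALYTICS", "schema": "PUBLIC",
}).create()

def ask_llm(prompt: str) -> str:
    # Runs inside the sandboxed Python runtime on the warehouse; outbound
    # traffic is only allowed through the configured integration.
    import requests
    resp = requests.post(
        "https://api.example-llm.com/v1/complete",   # placeholder endpoint
        json={"prompt": prompt},
        timeout=30,
    )
    return resp.json().get("completion", "")

session.udf.register(
    ask_llm,
    name="ask_llm",
    return_type=StringType(),
    input_types=[StringType()],
    packages=["requests"],
    external_access_integrations=["LLM_ACCESS_INTEGRATION"],  # assumed pre-created by an admin
    replace=True,
)

# Then call it from SQL within the session:
#   SELECT ask_llm('Summarize last week''s signups');
```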

How to Run Apache Kafka on Windows

Is Windows your favorite development environment? Do you want to run Apache Kafka® on Windows? Thanks to the Windows Subsystem for Linux 2 (WSL 2), now you can, and with fewer tears than in the past. Windows still isn’t the recommended platform for running Kafka with production workloads, but for trying out Kafka, it works just fine. Let’s take a look at how it’s done.
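
Once a broker is up and running inside WSL 2 on the default localhost:9092, a quick smoke test from the Windows side might look like the sketch below. The topic name is a placeholder, and confluent-kafka is just one of several Python client libraries you could use; the article itself covers the shell-level setup.

```python
# Quick smoke test against a broker running inside WSL 2 on the default
# localhost:9092. Assumes the topic exists or the broker auto-creates topics.
from confluent_kafka import Producer, Consumer

BOOTSTRAP = "localhost:9092"

producer = Producer({"bootstrap.servers": BOOTSTRAP})
producer.produce("wsl-smoke-test", key="hello", value="kafka on windows via wsl2")
producer.flush()  # block until the message is delivered

consumer = Consumer({
    "bootstrap.servers": BOOTSTRAP,
    "group.id": "smoke-test-group",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["wsl-smoke-test"])

msg = consumer.poll(10.0)  # wait up to 10s for the record we just produced
if msg is not None and msg.error() is None:
    print(msg.key(), msg.value())  # b'hello' b'kafka on windows via wsl2'
consumer.close()
```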

Leveraging Machine Learning in Product Analytics for Enhanced Insights and Actionability

Product analytics has traditionally hinged on examining user interactions to extract actionable insights. The integration of machine learning (ML) has elevated this process, enriching our understanding and our ability to predict future trends. Let's explore how ML integrates into product analytics and the transformative advantages it introduces.
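
As a purely illustrative sketch (not taken from the article), the example below trains a simple model on synthetic per-user interaction features to predict a future outcome such as churn; every feature name and data point here is invented for the example.

```python
# Illustrative sketch: predicting a future outcome (churn) from product
# interaction features. Features and labels are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1_000
# Hypothetical per-user features: sessions in last 30 days,
# average session minutes, distinct features used
X = np.column_stack([
    rng.poisson(12, n),
    rng.gamma(2.0, 5.0, n),
    rng.integers(0, 15, n),
])
# Synthetic label: users with less recent activity are more likely to churn
churn_prob = 1 / (1 + np.exp(0.25 * X[:, 0] + 0.05 * X[:, 1] + 0.2 * X[:, 2] - 5))
y = rng.binomial(1, churn_prob)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```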

Choosing the Right ETL Tool for Google BigQuery Storage

Google BigQuery is a robust, scalable cloud-based data warehouse that lets you store and analyze vast amounts of data. BigQuery is a natural choice if your data already lives on the Google Cloud Platform (GCP). But before you can leverage the platform, you need to extract the source data, carry out transformations, and load the data into your data lake or warehouse. This is where the ETL process and ETL tools play a significant role.
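
For context, the "load" step of that process can be as small as the sketch below, which uses the official google-cloud-bigquery client. The project, dataset, table, and GCS path are placeholders, and in practice an ETL tool would handle the extraction and transformation upstream.

```python
# Minimal sketch of loading a file from Cloud Storage into BigQuery.
# Project, dataset, table, and GCS URI are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # placeholder project

table_id = "my-gcp-project.analytics.orders"
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,          # infer the schema from the file
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/exports/orders_2024-01-01.csv",  # placeholder GCS path
    table_id,
    job_config=job_config,
)
load_job.result()  # wait for the job to finish
print(f"Loaded {client.get_table(table_id).num_rows} rows into {table_id}")
```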

New Fivetran connector streamlines data workflows for real-time insights

In a survey by the Harvard Business Review, 87% of respondents stated their organizations would be more successful if frontline workers were empowered to make important decisions in the moment. And 86% of respondents stated that they needed better technology to enable those in-the-moment decisions. Those coveted insights live at the end of a process lovingly known as the data pipeline.

Design and Deployment Considerations for Deploying Apache Kafka on AWS

Various factors can impede an organization's ability to leverage Confluent Cloud, ranging from data locality considerations to stringent internal requirements. For instance, specific mandates might dictate that data stay confined within a customer's Virtual Private Cloud (VPC), or necessitate operation within an air-gapped VPC. Even in such circumstances there is a silver lining: viable alternatives remain available to address these scenarios.