July 2022

No pipelines needed. Stream data with Pub/Sub direct to BigQuery

Jul 28, 2022 By Qiqi Wu In Google BigQuery

Ingesting Pub/Sub data into BigQuery can be critical to making your latest business data immediately available for analysis. Until today, you had to create intermediate Dataflow jobs before your data could be ingested into BigQuery with the proper schema. While Dataflow pipelines (including ones built with Dataflow Templates) get the job done well, they can be more than is needed for use cases that simply require raw, untransformed data to be exported to BigQuery.
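
As a rough illustration of the direct path, the sketch below creates a Pub/Sub subscription that writes to a BigQuery table with the google-cloud-pubsub Python client. The project, topic, subscription, and table names are placeholders, and subscription_names is a hypothetical helper introduced only for this example. It assumes the topic and the destination table already exist; treat it as one possible setup, not the post's own code.

```python
def subscription_names(project_id, topic_id, subscription_id):
    """Return the fully qualified (topic, subscription) resource names."""
    topic_path = f"projects/{project_id}/topics/{topic_id}"
    subscription_path = f"projects/{project_id}/subscriptions/{subscription_id}"
    return topic_path, subscription_path


def create_bigquery_subscription(project_id, topic_id, subscription_id, table_id):
    """Create a subscription that delivers messages straight to BigQuery.

    table_id is assumed to be in "project.dataset.table" form and to refer
    to an existing table whose schema is compatible with the topic's data.
    """
    # Imported lazily so the naming helper works without the client library.
    from google.cloud import pubsub_v1

    topic_path, subscription_path = subscription_names(
        project_id, topic_id, subscription_id
    )
    # write_metadata=True also stores message metadata in extra columns,
    # which the destination table must define.
    bigquery_config = pubsub_v1.types.BigQueryConfig(
        table=table_id, write_metadata=True
    )

    subscriber = pubsub_v1.SubscriberClient()
    with subscriber:
        return subscriber.create_subscription(
            request={
                "name": subscription_path,
                "topic": topic_path,
                "bigquery_config": bigquery_config,
            }
        )
```

Calling create_bigquery_subscription("my-project", "events", "events-to-bq", "my-project.analytics.events") would need valid Google Cloud credentials; all four arguments here are made-up names.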

How to get started with BigQuery

Jul 28, 2022 By Google BigQuery In Google BigQuery

Valeriya Shin and Mathilde Bachy are here to bring you the latest news from the Google Cloud startup program. Welcome to the second season of the Google Cloud Technical Guides for Startups, the Build Series. This is Episode 4: How to get started with BigQuery.

Scalable Python on BigQuery using Dask and NVIDIA GPUs

Jul 19, 2022 By Dong Meng In Google BigQuery

BigQuery is Google Cloud’s fully managed, serverless data platform that supports querying with ANSI SQL. BigQuery also has a data lake storage engine that unifies SQL queries with open-source processing frameworks such as Apache Spark, TensorFlow, and Dask. BigQuery storage provides an API layer that lets OSS engines process the data, which makes it possible to mix programming in languages like Python with structured SQL on the same data platform.

Performance considerations for loading data into BigQuery

Jul 15, 2022 By Jit Biswas In Google BigQuery

It is not unusual for customers to load very large data sets into their enterprise data warehouse. Whether you are doing an initial ingestion of hundreds of terabytes or incrementally loading from your systems of record, bulk insert performance is key to getting insights from the data sooner. The most common architecture for batch data loads uses Google Cloud Storage (object storage) as the staging area for all bulk loads.
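
The staging pattern described here can be sketched with the google-cloud-bigquery Python client: files are first written to a Cloud Storage prefix, then a single load job picks them up through a wildcard URI. The bucket, prefix, and table names are placeholders, staged_uri is a hypothetical helper, and Parquet is an assumed file format chosen only for the example.

```python
def staged_uri(bucket, prefix, pattern="*.parquet"):
    """Build a wildcard URI covering every staged file under a prefix."""
    return f"gs://{bucket}/{prefix.strip('/')}/{pattern}"


def load_from_gcs(table_id, uri):
    """Run one batch load job from Cloud Storage and return the table's row count."""
    # Imported lazily so the URI helper works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        # Assumption: the staged files are Parquet.
        source_format=bigquery.SourceFormat.PARQUET,
        # Append to the table so incremental loads can reuse this function.
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()  # Blocks until the job completes; raises on failure.
    return client.get_table(table_id).num_rows
```

For instance, load_from_gcs("my-project.warehouse.orders", staged_uri("my-bucket", "exports/2022-07")) would load every Parquet file under that prefix in one job, given valid credentials; the names are illustrative only.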
