Tuesday, December 2, 2025

thumbnail

Performing OLAP Queries with BigQuery on Cloud SQL Federated Tables

 Performing OLAP Queries with BigQuery on Cloud SQL Federated Tables


Google BigQuery allows you to run analytical (OLAP) queries directly on operational data stored in Cloud SQL — without copying or moving the data — using federated queries.


This helps you analyze transactional data in real time while avoiding ETL overhead.


๐Ÿš€ What Is a Federated Query?


A federated query lets BigQuery access external data sources like:


Cloud SQL (MySQL, PostgreSQL, SQL Server)


Cloud Storage


Bigtable


Spanner


When you create a federated table, BigQuery queries the data in Cloud SQL on the fly.


๐Ÿ—️ Architecture Overview

BigQuery  <---- federated table ---->  Cloud SQL

   |

 OLAP SQL (JOIN, GROUP BY, CUBE, ROLLUP, analytics functions)



BigQuery becomes the OLAP engine; Cloud SQL remains the OLTP system.


๐Ÿ› ️ Steps to Use BigQuery for OLAP on Cloud SQL Tables

1️⃣ Create a Cloud SQL Connection in BigQuery


In BigQuery:


CREATE CONNECTION my_connection

OPTIONS (

  connection_type = "CLOUD_SQL",

  instance_id = "my-sql-instance",

  database = "mydb",

  username = "bq_user",

  password = "password"

);



This creates a secure connection to Cloud SQL.


2️⃣ Create a Federated Table in BigQuery


Example:


CREATE EXTERNAL TABLE `my_dataset.orders_ext`

WITH CONNECTION `my-connection`

OPTIONS (

  table_name = "orders"

);



This logical table represents the table in Cloud SQL.


3️⃣ Run OLAP Queries in BigQuery


You can now run full BigQuery SQL (including OLAP functions) on Cloud SQL data.


๐Ÿ“ˆ OLAP Query Examples

1️⃣ Aggregations (GROUP BY)

SELECT

  customer_id,

  SUM(amount) AS total_spent,

  COUNT(*) AS num_orders

FROM `my_dataset.orders_ext`

GROUP BY customer_id

ORDER BY total_spent DESC;


2️⃣ Time-Series Analysis

SELECT

  DATE(order_timestamp) AS order_date,

  SUM(amount) AS daily_sales

FROM `my_dataset.orders_ext`

GROUP BY order_date

ORDER BY order_date;


3️⃣ Window Functions

SELECT

  customer_id,

  amount,

  SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_timestamp)

     AS running_total

FROM `my_dataset.orders_ext`;


4️⃣ CUBE / ROLLUP for Multi-Dimensional OLAP

ROLLUP Example

SELECT

  region,

  product,

  SUM(sales) AS total_sales

FROM `my_dataset.sales_ext`

GROUP BY ROLLUP(region, product);


CUBE Example

SELECT

  region,

  product,

  SUM(sales) AS total_sales

FROM `my_dataset.sales_ext`

GROUP BY CUBE(region, product);



These queries generate subtotals and grand totals automatically.


5️⃣ Joining Cloud SQL With BigQuery Native Tables

SELECT

  o.order_id,

  o.amount,

  c.customer_name

FROM `my_dataset.orders_ext` AS o

JOIN `my_dataset.customers` AS c

  ON o.customer_id = c.customer_id;



This allows blending OLTP and analytical data.


⚙️ Performance Considerations


Federated queries are powerful, but keep in mind:


✔ OLAP operations run in BigQuery


All computation (aggregation, joins, windowing) is offloaded to BigQuery.


❗ But data is streamed from Cloud SQL


This may affect:


Latency


Throughput


Query cost


For large datasets, it's recommended to periodically replicate data into BigQuery (e.g., using Datastream or Dataflow).


Best use cases for federation:


Fresh, real-time reports


Medium-size OLTP tables


Ad-hoc analysis


Reducing ETL pipelines


๐Ÿ” Security and Access Control


IAM-managed BigQuery Connections


Cloud SQL private services access


No public IP exposure required


Optional VPC Service Controls


๐Ÿ’ก Best Practices


Use federated tables for real-time or small-to-medium datasets


For heavy OLAP workloads, create a materialized table in BigQuery


Filter early (WHERE clause) to reduce data transfer


Avoid large CROSS JOIN or unbounded window functions


Pre-aggregate frequently used metrics


๐Ÿ Conclusion


Using BigQuery to run OLAP queries on Cloud SQL federated tables allows you to:


Analyze live transactional data


Eliminate ETL complexity


Use BigQuery’s powerful SQL engine


Build hybrid OLAP/OLTP architectures


It’s ideal for organizations that want real-time analytics without maintaining complex data pipelines.

Learn GCP Training in Hyderabad

Read More

Using Bigtable with Grafana for Real-Time Monitoring Dashboards

Migrating PostgreSQL Databases to Cloud SQL Seamlessly

Using Firestore for Real-Time Collaborative Features

Building a Scalable Chat App with Firestore and Firebase Authentication

Visit Our Quality Thought Training Institute in Hyderabad

Get Directions 

Subscribe by Email

Follow Updates Articles from This Blog via Email

No Comments

About

Search This Blog

Powered by Blogger.

Blog Archive