BigQuery Pipe Syntax Flow

A visual guide to the sequential flow-based syntax for cleaner, more readable SQL

Made by Seongyun Byeon
(Google Cloud Champion Innovator)
Core Data Flow
🔄
Starting with FROM
Begin your pipe flow with the data source
FROM table_name
Data Filtering with WHERE
Filter rows based on conditions
FROM table_name |> WHERE column1 > 100 AND column2 = 'value'
Column Selection with SELECT
Select specific columns from the input
FROM table_name |> WHERE condition |> SELECT column1, column2
Adding Columns with EXTEND
Add new computed columns while keeping existing ones
FROM table_name |> EXTEND column1 * 2 AS doubled_value, CONCAT(col1, col2) AS combined
Tip: Unlike standard SQL, pipe syntax flows in a logical order, making complex transformations easier to read and maintain.
Aggregation & Grouping
📊
Full Table Aggregation
Aggregate across all rows
FROM sales_data |> AGGREGATE COUNT(*) AS total_records, SUM(amount) AS total_amount
Grouped Aggregation
Aggregate with grouping
FROM sales_data |> AGGREGATE SUM(amount) AS total_sales, AVG(amount) AS avg_sale GROUP BY region, product_category
Group and Order By
Group and order in one step
FROM sales_data |> AGGREGATE SUM(amount) AS total_sales GROUP AND ORDER BY region, category
Ordering with ASC/DESC
Apply order directions to columns
FROM sales_data |> AGGREGATE SUM(amount) AS total_sales GROUP BY region ASC, category DESC
Tip: The GROUP AND ORDER BY clause is a shorthand for ordering by all grouping columns.
Table Operations
🔗
Table Aliases
Assign an alias to a table for joins
FROM orders |> AS o
Simple Join
Join two tables with conditions
FROM orders |> AS o |> JOIN customers AS c ON o.customer_id = c.id
Different Join Types
LEFT, RIGHT, FULL OUTER, CROSS joins
FROM orders AS o |> LEFT JOIN customers AS c ON o.customer_id = c.id -- Other join types |> RIGHT JOIN products AS p ON o.product_id = p.id
Set Operations
UNION, INTERSECT, EXCEPT
FROM active_customers |> UNION ALL (SELECT * FROM new_customers) |> EXCEPT DISTINCT (SELECT * FROM churned_customers)
Tip: When working with joins, use the AS operator to create aliases for clearer queries.
Advanced Operations
⚙️
Pivoting Data
Transform rows into columns
FROM monthly_sales |> PIVOT(SUM(amount) FOR month IN ('Jan', 'Feb', 'Mar', 'Apr'))
Unpivoting Data
Transform columns into rows
FROM quarterly_sales |> UNPIVOT(sales FOR quarter IN (Q1, Q2, Q3, Q4))
Window Functions
Add analytics without changing row count
FROM sales_data |> WINDOW ROW_NUMBER() OVER( PARTITION BY region ORDER BY amount DESC ) AS rank |> WHERE rank <= 3 -- Top 3 per region
Tip: Pipe syntax allows you to express complex transformations as a series of simpler steps, improving readability.
Common Pattern: ETL Pipeline
Extract
Transform
Load
FROM source_data |> WHERE is_valid = TRUE -- Filter invalid data |> EXTEND PARSE_TIMESTAMP('%Y-%m-%d', date_string) AS timestamp, UPPER(category) AS category_normalized |> DROP date_string -- Remove raw date string |> AGGREGATE COUNT(*) AS count, SUM(amount) AS total_amount GROUP BY DATE(timestamp), category_normalized |> ORDER BY DATE(timestamp), total_amount DESC
Common Pattern: Analytics Dashboard
Analysis
Visualization
KPIs
FROM sales |> EXTEND EXTRACT(YEAR FROM date) AS year, EXTRACT(MONTH FROM date) AS month |> AGGREGATE SUM(amount) AS monthly_sales GROUP BY year, month, region |> WINDOW LAG(monthly_sales) OVER( PARTITION BY region ORDER BY year, month ) AS prev_month_sales |> EXTEND (monthly_sales - prev_month_sales) / prev_month_sales * 100 AS growth_pct |> PIVOT(AVG(growth_pct) FOR region IN ('North', 'South', 'East', 'West'))