BigQuery · Cloud Cost Optimisation · Google Cloud · Data Engineering

BigQuery Cost Optimisation: A Practical Guide for Growing Teams

A hands-on guide to controlling BigQuery costs as your data volumes grow — covering pricing models, partitioning strategies, materialised views, and monitoring approaches that actually work.

Meisam Ebrahimi · 20 March 2026 · 6 min read

If you've been running analytics on BigQuery for a while, you've probably noticed something familiar: costs that seemed negligible at first have started climbing steadily. The queries that once scanned a few gigabytes now chew through terabytes, and your monthly bill has become a recurring source of concern in finance reviews.

This isn't unusual. BigQuery's pay-per-query model is brilliantly simple to get started with, but that same simplicity can mask inefficiencies until they become expensive habits. The good news is that most teams can reduce their BigQuery spend by 40–70% with a handful of well-targeted changes.

Why Costs Spiral in the First Place

BigQuery charges primarily based on the volume of data scanned by each query. When a team is small and datasets are modest, this barely registers. But several common patterns cause costs to accelerate faster than data growth alone would suggest.

SELECT * queries are the most frequent culprit. BigQuery is columnar — it only reads the columns you actually reference. A SELECT * forces it to scan every column in the table, which can be orders of magnitude more expensive than selecting only what you need.

Unbounded date ranges are another common issue. Without explicit date filters on partitioned tables, BigQuery scans the entire table history. A dashboard that refreshes every hour and looks at "all time" data can rack up enormous costs without anyone noticing.
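To make both fixes concrete, here is a minimal before-and-after sketch, assuming an events table partitioned on its timestamp column (the table and column names match the example DDL later in this article):

```sql
-- Before: scans every column across the full table history.
SELECT *
FROM my_dataset.events;

-- After: reads only the columns needed, from the last 30 days of partitions.
SELECT user_id, event_type
FROM my_dataset.events
WHERE DATE(event_timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY);
```

The second query can bill a tiny fraction of the first, because BigQuery skips both the unreferenced columns and the pruned partitions.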

Duplicated logic across multiple queries and dashboards also compounds the problem. If five different Looker Studio dashboards each compute the same aggregation from raw event data, you're paying for that computation five times over.

Slot-Based vs On-Demand Pricing

BigQuery offers two fundamental pricing models, and choosing the right one — or combining them — is the single most impactful cost decision you'll make.

On-demand pricing charges per terabyte scanned (currently around $6.25 per TB in most regions). This model is excellent for unpredictable, bursty workloads and small-to-medium teams. You pay nothing when nobody is querying.

Capacity-based pricing (formerly flat-rate, now called BigQuery Editions) lets you purchase dedicated compute slots. With the Standard, Enterprise, and Enterprise Plus editions, you commit to a certain number of slots and pay a predictable monthly fee regardless of how much data your queries scan.

The crossover point varies, but as a rough guide: if your on-demand spend exceeds $10,000 per month consistently, it's worth modelling what the equivalent slot commitment would cost. Many teams find that a baseline of committed slots for scheduled workloads, combined with on-demand for ad-hoc queries, delivers the best balance of cost and flexibility.
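A hedged starting point for that modelling is to total your recent billed bytes from job metadata and convert them to an on-demand figure you can compare against Editions list prices. The region qualifier and the $6.25/TiB rate below are illustrative — substitute your own region and current pricing:

```sql
-- Approximate on-demand spend for the last 30 days across the project.
SELECT
  ROUND(SUM(total_bytes_billed) / POW(1024, 4), 2) AS tib_billed_30d,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4) * 6.25, 2) AS on_demand_cost_usd
FROM `region-eu`.INFORMATION_SCHEMA.JOBS
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY';
```

If that figure is consistently well above the monthly price of a slot commitment sized for your scheduled workloads, Editions pricing deserves a closer look.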

Partition and Cluster Your Tables

If you take away one thing from this article, let it be this: every large table should be partitioned, and most should be clustered too.

Partitioning divides a table into segments based on a column value — typically a date or timestamp. When a query includes a filter on the partition column, BigQuery only scans the relevant partitions rather than the entire table. For time-series data, this alone can reduce costs by 90% or more.

CREATE TABLE my_dataset.events
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
AS SELECT * FROM my_dataset.raw_events

Clustering sorts data within each partition by the columns you specify (up to four). When queries filter or aggregate on clustered columns, BigQuery can skip irrelevant data blocks. Clustering is free to apply and has no downside — it's essentially a performance and cost optimisation you get for nothing.

A practical approach is to partition by the date column that most queries filter on, and cluster by the columns that appear most frequently in WHERE clauses and GROUP BY expressions.
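Against the events table defined above (partitioned on DATE(event_timestamp), clustered by user_id and event_type), a well-shaped query looks something like this sketch:

```sql
-- The date filter prunes partitions; the user_id filter lets BigQuery
-- skip data blocks within each partition thanks to clustering.
SELECT event_type, COUNT(*) AS events
FROM my_dataset.events
WHERE DATE(event_timestamp) BETWEEN '2026-03-01' AND '2026-03-07'
  AND user_id = 'user_123'
GROUP BY event_type;
```

The key habit is that every query against a partitioned table should carry an explicit filter on the partition column.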

Materialised Views for Repeated Aggregations

If multiple dashboards or queries compute the same aggregation from the same base table, materialised views are your best friend. A materialised view pre-computes and stores the result of a query, then automatically keeps it up to date as the underlying data changes.

CREATE MATERIALIZED VIEW my_dataset.daily_user_summary
PARTITION BY event_date
CLUSTER BY country
AS
SELECT
  DATE(event_timestamp) AS event_date,
  country,
  COUNT(DISTINCT user_id) AS unique_users,
  COUNT(*) AS total_events
FROM my_dataset.events
GROUP BY 1, 2

BigQuery is clever enough to recognise when a query against the base table could be answered by the materialised view instead, and will route it automatically. This means existing dashboards benefit without any changes to their queries.

The cost savings can be dramatic. Instead of five dashboards each scanning a 10 TB events table to compute daily summaries, they all read from a materialised view that might be just a few megabytes.

Monitoring and Governance

Cost optimisation isn't a one-off exercise — it requires ongoing visibility. Here are the monitoring patterns we set up for every client.

INFORMATION_SCHEMA queries are your first line of defence. BigQuery exposes detailed job metadata that lets you identify the most expensive queries, the heaviest users, and the tables that get scanned most often.

SELECT
  user_email,
  COUNT(*) AS query_count,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4), 2) AS total_tb_billed,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4) * 6.25, 2) AS estimated_cost_usd
FROM `region-eu`.INFORMATION_SCHEMA.JOBS
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
GROUP BY user_email
ORDER BY total_tb_billed DESC
LIMIT 20

Custom cost controls should be configured at both the project and user level. BigQuery lets you set maximum bytes billed per query, which prevents any single runaway query from blowing your budget. We typically set a project-level default of 1 TB per query and require explicit overrides for larger scans.
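The same guard can also be applied per job. With the bq CLI, the --maximum_bytes_billed flag makes a query fail fast instead of running past the limit; the 1 TiB value below mirrors the default discussed above, and the query itself is illustrative:

```shell
# Reject the job up front if it would bill more than 1 TiB (2^40 bytes).
bq query --use_legacy_sql=false \
  --maximum_bytes_billed=1099511627776 \
  'SELECT user_id FROM my_dataset.events
   WHERE DATE(event_timestamp) = CURRENT_DATE()'
```

Queries that exceed the cap fail before any bytes are billed, which turns a potential budget surprise into a visible error.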

Scheduled cost reports using Cloud Monitoring or a simple scheduled query that writes to a summary table give leadership teams the visibility they need without requiring them to understand BigQuery internals.
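As one possible shape for that scheduled query, a daily roll-up appended to a summary table might look like this sketch — the destination table, region qualifier, and $6.25/TiB rate are all placeholders to adapt:

```sql
-- Run once per day via BigQuery scheduled queries.
INSERT INTO my_dataset.cost_summary (report_date, user_email, tib_billed, estimated_cost_usd)
SELECT
  DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) AS report_date,
  user_email,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4), 4) AS tib_billed,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4) * 6.25, 2) AS estimated_cost_usd
FROM `region-eu`.INFORMATION_SCHEMA.JOBS
WHERE DATE(creation_time) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
  AND job_type = 'QUERY'
GROUP BY user_email;
```

Pointing a simple Looker Studio chart at the summary table then gives leadership a running view of spend per user without touching raw job metadata.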

Quick Wins to Implement This Week

If you're looking for immediate impact, here's a prioritised checklist:

  • Audit your top 20 most expensive queries from the last 30 days using INFORMATION_SCHEMA
  • Replace SELECT * with explicit column lists in all scheduled queries and views
  • Add partition filters to every query that touches a partitioned table
  • Create materialised views for any aggregation that runs more than once per day
  • Set maximum bytes billed limits at the project level
  • Review your pricing model — if on-demand spend is consistently high, model the Editions pricing

When to Consider Professional Help

Most teams can handle the optimisations above independently. But if your BigQuery environment has grown organically over several years, involves dozens of datasets with unclear ownership, or sits at the centre of a complex dbt or Airflow pipeline, it's often more cost-effective to bring in someone who's done this many times before.

At Alpha Array, we typically start with a cost audit that maps your current spend to specific queries, users, and tables. From there, we implement the technical changes — partitioning, clustering, materialised views, slot modelling — and set up the monitoring that keeps costs under control long-term. Most clients see a 40–60% reduction in their BigQuery bill within the first month.

If your cloud data costs are growing faster than your business, we'd be happy to take a look. Book a free discovery call and we'll walk through your situation together.

Need Help With Your Data Platform?

Book a free discovery call to discuss your data engineering challenges and explore how we can help.

No obligation • 30-minute consultation • Response within 24 hours