Right-Sizing and Auto-Suspend
Right-sizing is the headline FinOps win. It's also where most teams over-correct and break a workload, often in the same week they ship the change.
The QueryWise approach is simple: never resize without seeing the workload, and never auto-suspend without understanding what's running between bursts.
The signals we look at
Three numbers drive every right-sizing decision:
- Utilization — how busy is the compute when it's running?
- Concurrency — how many concurrent queries does it actually handle?
- Idle time — how much of the billed wall-clock is doing nothing?
If you only look at utilization, you'll shrink a warehouse that's actually queueing, where concurrency, not size, is the constraint. If you only look at idle time, you'll set aggressive auto-suspend on a warehouse whose hot cache is the reason queries are fast.
QueryWise pulls these signals separately. On the Usage page you see them per resource over time; the recommendation engine combines them into right-sizing suggestions.
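The three signals can be reduced from raw metering data along these lines. This is a minimal sketch, not QueryWise's actual engine: the `HourSample` shape, the `max_slots` parameter, and the field names are all assumptions standing in for whatever the metering source provides.

```python
from dataclasses import dataclass

@dataclass
class HourSample:
    """One billed hour of a compute resource (hypothetical shape)."""
    busy_seconds: float    # time spent executing queries
    billed_seconds: float  # wall-clock time billed
    peak_concurrency: int  # max concurrent queries observed

def summarize(samples: list[HourSample], max_slots: int) -> dict:
    """Reduce raw hourly samples to the three right-sizing signals."""
    billed = sum(s.billed_seconds for s in samples)
    busy = sum(s.busy_seconds for s in samples)
    peak = max(s.peak_concurrency for s in samples)
    return {
        "utilization": busy / billed,                  # how busy while running
        "concurrency_headroom": 1 - peak / max_slots,  # room before queueing
        "idle_fraction": 1 - busy / billed,            # billed but doing nothing
    }
```

The point of keeping the signals separate is visible in the return value: a warehouse can show low utilization and zero concurrency headroom at the same time, which is exactly the case where downsizing is the wrong move.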
Compute right-sizing rules
The detector library has dedicated right-sizing rules for each compute primitive:
- Snowflake warehouses — uses `WAREHOUSE_LOAD_HISTORY` and `WAREHOUSE_METERING_HISTORY` to detect over-sized and under-loaded warehouses, queueing-but-not-busy warehouses (where concurrency, not size, is the bottleneck), and serverless candidates.
- Databricks clusters — flags clusters where Photon could replace a larger node count, autoscale ranges that never hit the upper bound, and serverless SQL warehouse candidates.
- Redshift clusters — detects RA3 over-provisioning by hour, concurrency scaling underuse, and node-count miscalibration.
- Synapse pools — DWU overprovisioning across the daily curve, resource class mismatch.
- Cosmos DB — RU overprovisioning, autoscale ranges that don't fit usage, partition hotspots that look like global RU pressure.
- RDS / Cloud SQL / Azure SQL — instance-class right-sizing using CloudWatch / Cloud Monitoring / Azure Monitor.
Each recommendation includes a savings estimate priced against the vendor's tier catalog, plus the risk of the change ("low — no concurrency pressure observed in the last 30 days" vs "medium — concurrency peaked at 80% of current cluster on 2 days").
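A savings estimate and risk label of this shape can be sketched as follows. The tier table uses Snowflake-style credit doubling per size; the thresholds in `risk_label` are illustrative, not the detector library's actual cutoffs.

```python
# Snowflake-style credit rates per warehouse size (doubles per tier).
TIER_CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}

def savings_estimate(current: str, proposed: str,
                     hours_per_month: float, price_per_credit: float) -> float:
    """Monthly savings from a size change, priced against the tier catalog."""
    delta = TIER_CREDITS_PER_HOUR[current] - TIER_CREDITS_PER_HOUR[proposed]
    return delta * hours_per_month * price_per_credit

def risk_label(peak_concurrency_pct: float, days_near_peak: int) -> str:
    """Attach a hedged risk label, mirroring the examples in the text."""
    if peak_concurrency_pct < 0.5:
        return "low — no concurrency pressure observed in the last 30 days"
    if peak_concurrency_pct < 0.9:
        return (f"medium — concurrency peaked at {peak_concurrency_pct:.0%} "
                f"of current cluster on {days_near_peak} days")
    return "high — sustained concurrency near current capacity"
```

Pairing the dollar figure with the risk string in one recommendation is the design choice that matters: a large estimated saving with a medium risk label reads very differently from the same number alone.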
Auto-suspend without blowback
For Snowflake specifically, auto-suspend is the change everyone wants to make and the one that breaks the most workloads. A few rules from the field:
- Don't set auto-suspend uniformly. A warehouse that serves a Looker dashboard at 9am every weekday is not the same workload as a dbt model that runs every 15 minutes.
- Watch the warm-cache effect. If a warehouse's first query after resume is consistently 4× slower than subsequent queries, aggressive auto-suspend will tax every burst.
- Use multi-cluster instead of low auto-suspend for spiky concurrency. You pay for the burst, not for keeping a too-large warehouse warm.
QueryWise's auto-suspend detector reads the load history first and only flags candidates where the math actually works out — not just "warehouse spent 42% of the day idle."
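"The math actually works out" can be made concrete: a shorter suspend threshold only saves money if the billed idle it trims exceeds the cost of the extra resumes it triggers. The sketch below assumes observed idle gaps and a per-resume penalty (cold cache plus provisioning) expressed in billed seconds; both the function name and the penalty model are illustrative.

```python
def auto_suspend_net_savings(idle_gaps_s: list[float],
                             suspend_threshold_s: float,
                             resume_penalty_s: float,
                             cost_per_second: float) -> float:
    """Net saving from a suspend threshold, over observed idle gaps.

    Each gap longer than the threshold stops billing after suspend_threshold_s
    seconds, but the next burst pays a resume penalty (cold cache, provisioning).
    """
    saved_s = sum(g - suspend_threshold_s
                  for g in idle_gaps_s if g > suspend_threshold_s)
    resumes = sum(1 for g in idle_gaps_s if g > suspend_threshold_s)
    return (saved_s - resumes * resume_penalty_s) * cost_per_second
```

Run this over a day of gaps and a warehouse that is "42% idle" in many short gaps can come out negative: every gap is shorter than the resume penalty it would incur, which is exactly the candidate the detector declines to flag.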
Rolling out changes
The temptation is to apply right-sizing in one sweep. Don't.
Our recommended sequence:
- Showback first. Add the right-sizing recommendation to a JIRA/ServiceNow ticket via the integration. Let the workload owner see it.
- Test in non-prod. For warehouse/cluster changes, run a workload replay against the smaller size for one full business cycle.
- Auto-apply only the safe set. In Settings → Auto-Apply, you can scope auto-apply to specific severities and categories. We default to "auto-suspend tweaks with a threshold change below 60s, low-risk only."
- Confirm with before/after. Every applied recommendation gets a before/after impact panel. If the actual savings come in below 50% of estimated, the recommendation is marked overstated and the engine recalibrates.
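The before/after step can be sketched as a small tracker: flag an estimate as overstated when realized savings fall below 50% of it, and fold the realized-to-estimated ratio back in as a calibration multiplier for future estimates. The class name and the mean-ratio calibration are assumptions, not QueryWise's actual recalibration scheme.

```python
class ImpactTracker:
    """Before/after check with a simple recalibration loop (sketch)."""

    def __init__(self) -> None:
        self.ratios: list[float] = []

    def record(self, estimated: float, actual: float) -> str:
        """Grade one applied recommendation and remember its accuracy."""
        self.ratios.append(actual / estimated)
        return "overstated" if actual < 0.5 * estimated else "confirmed"

    def calibration(self) -> float:
        """Multiplier applied to future estimates (1.0 until data accrues)."""
        return sum(self.ratios) / len(self.ratios) if self.ratios else 1.0
```

The useful property is the feedback loop: a team whose estimates keep coming in overstated sees future estimates scaled down, rather than the same optimistic numbers re-appearing in tickets.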
What workload-aware means
The phrase "workload-aware right-sizing" is doing real work in QueryWise. It means the recommendation knows:
- What query patterns ran on this compute
- Which patterns drove the cost
- Whether sizing or concurrency is the constraint
- How the warehouse behaves through the daily curve, not just on average
That's the difference between "this warehouse is over-sized" (often wrong) and "this warehouse hits 30% utilization for 6 hours and 90% utilization for the 30-minute window before standup — split it" (usually right).
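The "split it" pattern in the example is detectable from the daily curve alone: long stretches of low utilization plus a short window of high utilization, even when the average looks unremarkable. A minimal sketch, with thresholds and window counts chosen for illustration:

```python
def split_candidate(hourly_util: list[float],
                    low: float = 0.35, high: float = 0.85) -> bool:
    """True when a 24-hour utilization curve mixes long low-utilization
    stretches with a short high-utilization window, the pattern where
    one warehouse should become two rather than be resized."""
    low_hours = sum(1 for u in hourly_util if u < low)
    high_hours = sum(1 for u in hourly_util if u > high)
    # Mostly quiet, with a brief hot window: averaging would hide both.
    return low_hours >= 4 and 1 <= high_hours <= 2
```

On the example curve from the text (30% for most of the day, 90% for one pre-standup window) this returns True, while a flat 50%-all-day warehouse returns False, even though both average out to a similar number.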
What we don't auto-apply
Right-sizing is high-leverage and high-risk. We deliberately leave these manual:
- Going up in size on the basis of queueing alone
- Cluster topology changes (Redshift node count, Databricks node count)
- Anything that requires a redeploy or restart
These get the ticket-sync workflow instead, with the right-size suggestion attached.
Where to next
- Query Optimization Playbook — sometimes the right answer is to fix the query, not the warehouse.
- Pipeline Cost Control — the compute powering your pipelines deserves the same treatment.