Recommendations Engine
The recommendations engine is the part of QueryWise that turns observation into action. It runs three parallel paths — pattern detection, AI deep analysis, and right-sizing — and combines them into a single ranked recommendation feed.
What a recommendation is
Every recommendation has the same shape:
```json
{
  "id": "...",
  "category": "query_tuning",
  "severity": "high",
  "title": "Convert wide SELECT * to projected columns",
  "description": "...",
  "evidence": { ... },
  "remediation": "ALTER VIEW ...",
  "estimated_monthly_savings": 1240.00,
  "confidence": 0.78,
  "source": "detector",
  "vendor": "snowflake",
  "affected_resource": "ACCOUNT.SCHEMA.TABLE"
}
```
source is one of detector, ai, right_sizing. confidence for detector-generated recs is a fixed score per pattern; for AI-generated recs it's the agent's self-reported confidence (gated to ≥0.4).
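The confidence gate described above can be sketched as a small helper; the 0.4 floor comes from this page, while the function name and signature are illustrative:

```python
AI_CONFIDENCE_FLOOR = 0.4  # AI-generated recs below this are dropped

def passes_confidence_gate(source, confidence):
    """Detector and right-sizing confidence is a fixed per-pattern score,
    so only AI-generated recs are gated against the floor."""
    if source == "ai":
        return confidence >= AI_CONFIDENCE_FLOOR
    return True
```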
Path 1: Anti-pattern detector library
The detector library lives in qw_shared.anti_patterns and currently has 130+ detectors. Categories:
- Generic SQL — SELECT *, missing predicates, Cartesian joins, N+1, unused indexes, repeated duplicate queries
- Snowflake — warehouse undersized, poor clustering, missing partitioning, serverless candidates, auto-suspend tuning
- BigQuery — slot waste, partition pruning misses, materialized view candidates, hidden cost traps
- Redshift — WLM queueing, dist key skew, vacuum needs, compile overhead, concurrency scaling
- Databricks / Spark — shuffle-heavy, broadcast candidates, oversized warehouses, missing liquid clustering
- Synapse — data movement, columnstore inefficiency, DWU overprovisioning, resource class mismatch
- MongoDB — pipeline ordering, shard imbalance, cross-partition queries, scan limits
- Cosmos DB — RU hotspots, low index hit rate, large document scans, autoscale misfits
- Cortex / AI — Cortex user sprawl, Cortex Search inefficiency, Cortex Analyst overuse, document parsing redundancy
- AI / LLM — output verbosity, repeated processing, model tiering candidates, batch API opportunities, idle endpoints, prompt caching, embedding cache misses
- Pipelines — idle Snowpipes, over-provisioned Glue DPUs, DLT cold-start tax, Dataflow streaming with low input rate, Data Factory copy activity not pushing down (20 detectors total)
- Vector / RAG — embedding-side cost patterns (27 detectors)
Each detector returns an AntiPatternMatch with severity, affected queries/resources, evidence, and a remediation snippet. The recommendation generator filters out duplicates, attaches savings estimates, and writes the recs to the recommendations table.
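A minimal sketch of the detector contract, assuming a simplified AntiPatternMatch shape (the real class in qw_shared.anti_patterns may carry more fields) and a hypothetical SELECT * detector:

```python
from dataclasses import dataclass

@dataclass
class AntiPatternMatch:
    pattern: str
    severity: str
    affected_resources: list
    evidence: dict
    remediation: str

def detect_select_star(queries):
    """Hypothetical generic-SQL detector: flag SELECT * queries."""
    matches = []
    for q in queries:
        if "select *" in q["sql"].lower():
            matches.append(AntiPatternMatch(
                pattern="select_star",
                severity="medium",
                affected_resources=[q["id"]],
                evidence={"sql": q["sql"]},
                remediation="Project only the columns the consumer actually reads.",
            ))
    return matches

def dedupe(matches):
    """Keep one match per (pattern, resources) pair, mirroring the
    generator's duplicate filter."""
    seen, out = set(), []
    for m in matches:
        key = (m.pattern, tuple(m.affected_resources))
        if key not in seen:
            seen.add(key)
            out.append(m)
    return out
```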
Path 2: AI deep analysis
For queries that pattern detection can't help with, the deep analysis flow hands the query off to the analysis agent. The agent has tools to:
- fetch_query_stats — execution stats for the query
- fetch_query_plan — the most recent explain plan
- fetch_schema_metadata — schema for tables touched by the query
- fetch_billing_context — cost context for the warehouse / cluster
- fetch_existing_recommendations — to dedupe against open recs
- save_ai_recommendation — the structured output sink
The agent runs an agentic loop, capped by:
- Per-tenant budget — daily token / call cap
- Per-user rate limit — 5 deep analyses per user per day
- Confidence threshold — recs below 0.4 are dropped
- Plausibility cap — savings estimates >$50,000/month are flagged for review
- Audit log — every run is recorded in analysis_audit_logs
The agent is deliberately scoped — it can't write to anything other than the recommendations table and audit log.
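The caps above can be sketched as two guard functions; the constants mirror the documented limits, but the function names, the rec dict shape, and the flagging mechanism are illustrative assumptions:

```python
DAILY_USER_LIMIT = 5          # deep analyses per user per day
CONFIDENCE_FLOOR = 0.4        # recs below this are dropped
PLAUSIBILITY_CAP = 50_000.0   # USD/month; larger estimates are flagged for review

def admit_run(runs_today, tenant_tokens_used, tenant_token_budget):
    """Gate a new deep-analysis run on the per-user and per-tenant caps."""
    return runs_today < DAILY_USER_LIMIT and tenant_tokens_used < tenant_token_budget

def postprocess(rec):
    """Apply the confidence floor and plausibility cap to an agent-produced rec dict."""
    if rec["confidence"] < CONFIDENCE_FLOOR:
        return None  # dropped entirely
    if rec["estimated_monthly_savings"] > PLAUSIBILITY_CAP:
        rec["flagged_for_review"] = True  # held for human review, not dropped
    return rec
```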
Path 3: Right-sizing
Right-sizing is its own path because it doesn't operate on queries — it operates on compute primitives. The detectors here read load history, concurrency, and idle time across the daily curve, then compare against the vendor's tier catalog (Snowflake warehouse sizes, Databricks node types, RDS classes, etc.).
A right-sizing rec has the same shape as the others but the evidence field includes the load curve and concurrency stats so the workload owner can sanity-check the recommendation before applying it.
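One way to sketch the tier comparison, using a made-up capacity/cost catalog and an assumed 20% headroom factor (real catalogs and headroom policy are vendor-specific):

```python
# Hypothetical tier catalog: (name, capacity units, cost per hour).
# The real detectors read the vendor's catalog (Snowflake warehouse sizes,
# Databricks node types, RDS classes, etc.).
TIERS = [("XS", 1, 2.0), ("S", 2, 4.0), ("M", 4, 8.0), ("L", 8, 16.0)]

def right_size(peak_load_units, headroom=1.2):
    """Return the smallest tier whose capacity covers peak load plus headroom."""
    needed = peak_load_units * headroom
    for name, capacity, _cost in TIERS:
        if capacity >= needed:
            return name
    return TIERS[-1][0]  # the workload already needs the largest tier
```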
Severity & ranking
Severities: critical | high | medium | low. Severity is determined by:
- Estimated monthly savings (heuristic ceilings: critical >= $5k, high >= $1k, medium >= $250)
- Risk of not applying (idle endpoint billed 24/7 → high)
- Confidence (low-confidence recs get severity downgraded)
The default sort in the dashboard is estimated savings descending, but you can re-sort by category, vendor, or age.
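The severity heuristic can be sketched as follows; the savings ceilings are the documented ones, while the 0.5 low-confidence cutoff and the one-step downgrade are illustrative assumptions:

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

def severity_for(savings_usd, confidence, low_confidence=0.5):
    """Severity from estimated monthly savings, downgraded one step
    when confidence is low."""
    if savings_usd >= 5_000:
        sev = "critical"
    elif savings_usd >= 1_000:
        sev = "high"
    elif savings_usd >= 250:
        sev = "medium"
    else:
        sev = "low"
    if confidence < low_confidence:
        sev = SEVERITY_ORDER[max(0, SEVERITY_ORDER.index(sev) - 1)]
    return sev
```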
Savings estimation
Savings come from savings_estimator.py, which knows each vendor's pricing model:
- Snowflake credits → USD with the customer's contract rate (configurable)
- Databricks DBU → USD by SKU
- BigQuery slot-seconds and bytes scanned
- AWS RDS instance-hour pricing
- Cloud-native rates for compute
If your contract rate isn't configured, we use list price as the default and clearly mark the estimate.
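A sketch of the Snowflake branch, assuming a hypothetical list-price constant (actual list price varies by edition and region) and an illustrative return shape:

```python
SNOWFLAKE_LIST_PRICE_PER_CREDIT = 3.0  # placeholder; not a real quoted rate

def credits_to_usd(credits, contract_rate=None):
    """Convert Snowflake credits to USD, falling back to list price when no
    contract rate is configured. Returns (amount, is_list_price_estimate)
    so the UI can clearly mark list-price estimates."""
    if contract_rate is not None:
        return credits * contract_rate, False
    return credits * SNOWFLAKE_LIST_PRICE_PER_CREDIT, True
```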
Confidence calibration
After a recommendation is applied, the before/after impact panel computes actual savings from billing data. If the actual is below 50% of estimated, the recommendation is marked overstated and the savings estimator's bias is updated for similar future recs. This is what we mean by calibration.
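The calibration check can be sketched like this; the 50% threshold is documented, but the multiplicative bias update is an illustrative assumption:

```python
OVERSTATED_RATIO = 0.5  # actual below 50% of estimated => overstated

def calibrate(estimated, actual, bias=1.0):
    """Mark a rec overstated and nudge the estimator's bias for similar
    future recs (multiplicative update is a sketch, not the real formula)."""
    overstated = actual < OVERSTATED_RATIO * estimated
    if overstated and estimated > 0:
        bias *= actual / estimated  # shrink similar future estimates
    return overstated, bias
```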
Apply workflows
Three apply paths:
- Manual — copy the remediation, deploy yourself
- Auto-apply policy — define a policy in Settings → Auto-Apply scoped by category, severity, and vendor
- Ticket sync — push as JIRA / ServiceNow ticket; bi-directional status sync
DDL changes default to ticket sync. Auto-apply is enabled by default for low-risk categories like auto-suspend tuning.
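A minimal sketch of auto-apply policy matching, assuming a simple dict-based policy shape (real policies are defined in Settings → Auto-Apply):

```python
def policy_allows(policy, rec):
    """True when a rec falls inside the policy's category, severity,
    and vendor scope. Dict shapes here are illustrative."""
    return (rec["category"] in policy["categories"]
            and rec["severity"] in policy["severities"]
            and rec["vendor"] in policy["vendors"])
```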
Where to next
- Workflows & Integrations — JIRA / ServiceNow / Slack setup.
- Agents & Automation — what the AI agents can and can't do.