Recommendations Engine
The recommendations engine is the part of QueryWise that turns observation into action. It runs three parallel paths — pattern detection, AI deep analysis, and right-sizing — and combines them into a single ranked recommendation feed.
What a recommendation is
Every recommendation has the same shape:
```json
{
  "id": "...",
  "category": "query_tuning",
  "severity": "high",
  "title": "Convert wide SELECT * to projected columns",
  "description": "...",
  "evidence": { ... },
  "remediation": "ALTER VIEW ...",
  "estimated_monthly_savings": 1240.00,
  "confidence": 0.78,
  "source": "detector",
  "vendor": "snowflake",
  "affected_resource": "ACCOUNT.SCHEMA.TABLE"
}
```
source is one of detector, ai, right_sizing. confidence for detector-generated recs is a fixed score per pattern; for AI-generated recs it's the agent's self-reported confidence (gated to ≥0.4).
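The confidence gate described above can be sketched as a small helper; the 0.4 floor comes from this page, while the function name and signature are illustrative:

```python
AI_CONFIDENCE_FLOOR = 0.4  # AI-generated recs below this are dropped

def passes_confidence_gate(source, confidence):
    """Detector and right-sizing confidence is a fixed per-pattern score,
    so only AI-generated recs are gated against the floor."""
    if source == "ai":
        return confidence >= AI_CONFIDENCE_FLOOR
    return True
```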
Path 1: Anti-pattern detector library
The detector library lives in qw_shared.anti_patterns and currently has 130+ detectors. Categories:
- Generic SQL — SELECT *, missing predicates, Cartesian joins, N+1, unused indexes, repeated duplicate queries
- Snowflake — warehouse undersized, poor clustering, missing partitioning, serverless candidates, auto-suspend tuning
- BigQuery — slot waste, partition pruning misses, materialized view candidates, hidden cost traps
- Redshift — WLM queueing, dist key skew, vacuum needs, compile overhead, concurrency scaling
- Databricks / Spark — shuffle-heavy, broadcast candidates, oversized warehouses, missing liquid clustering
- Synapse — data movement, columnstore inefficiency, DWU overprovisioning, resource class mismatch
- MongoDB — pipeline ordering, shard imbalance, cross-partition queries, scan limits
- Cosmos DB — RU hotspots, low index hit rate, large document scans, autoscale misfits
- Cortex / AI — Cortex user sprawl, Cortex Search inefficiency, Cortex Analyst overuse, document parsing redundancy
- AI / LLM — output verbosity, repeated processing, model tiering candidates, batch API opportunities, idle endpoints, prompt caching, embedding cache misses
- Pipelines — idle Snowpipes, over-provisioned Glue DPUs, DLT cold-start tax, Dataflow streaming with low input rate, Data Factory copy activity not pushing down (20 detectors total)
- Vector / RAG — embedding-side cost patterns (27 detectors)
Each detector returns an AntiPatternMatch with severity, affected queries/resources, evidence, and a remediation snippet. The recommendation generator filters out duplicates, attaches savings estimates, and writes the recs to the recommendations table.
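A minimal sketch of the detector contract, assuming a simplified AntiPatternMatch shape (the real class in qw_shared.anti_patterns may carry more fields) and a hypothetical SELECT * detector:

```python
from dataclasses import dataclass

@dataclass
class AntiPatternMatch:
    pattern: str
    severity: str
    affected_resources: list
    evidence: dict
    remediation: str

def detect_select_star(queries):
    """Hypothetical generic-SQL detector: flag SELECT * queries."""
    matches = []
    for q in queries:
        if "select *" in q["sql"].lower():
            matches.append(AntiPatternMatch(
                pattern="select_star",
                severity="medium",
                affected_resources=[q["id"]],
                evidence={"sql": q["sql"]},
                remediation="Project only the columns the consumer actually reads.",
            ))
    return matches

def dedupe(matches):
    """Keep one match per (pattern, resources) pair, mirroring the
    generator's duplicate filter."""
    seen, out = set(), []
    for m in matches:
        key = (m.pattern, tuple(m.affected_resources))
        if key not in seen:
            seen.add(key)
            out.append(m)
    return out
```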
Path 2: AI deep analysis
For queries that pattern detection can't help with, the deep analysis flow hands the query off to the analysis agent. The agent has tools to:
- fetch_query_stats — execution stats for the query
- fetch_query_plan — the most recent explain plan
- fetch_schema_metadata — schema for tables touched by the query
- fetch_billing_context — cost context for the warehouse / cluster
- fetch_existing_recommendations — to dedupe against open recs
- save_ai_recommendation — the structured output sink
The agent runs an agentic loop, capped by:
- Per-tenant budget — daily token / call cap
- Per-user rate limit — 5 deep analyses per user per day
- Confidence threshold — recs below 0.4 are dropped
- Plausibility cap — savings estimates >$50,000/month are flagged for review
- Audit log — every run is recorded in analysis_audit_logs
The agent is deliberately scoped — it can't write to anything other than the recommendations table and audit log.
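The caps above can be sketched as two guard functions; the constants mirror the documented limits, but the function names, the rec dict shape, and the flagging mechanism are illustrative assumptions:

```python
DAILY_USER_LIMIT = 5          # deep analyses per user per day
CONFIDENCE_FLOOR = 0.4        # recs below this are dropped
PLAUSIBILITY_CAP = 50_000.0   # USD/month; larger estimates are flagged for review

def admit_run(runs_today, tenant_tokens_used, tenant_token_budget):
    """Gate a new deep-analysis run on the per-user and per-tenant caps."""
    return runs_today < DAILY_USER_LIMIT and tenant_tokens_used < tenant_token_budget

def postprocess(rec):
    """Apply the confidence floor and plausibility cap to an agent-produced rec dict."""
    if rec["confidence"] < CONFIDENCE_FLOOR:
        return None  # dropped entirely
    if rec["estimated_monthly_savings"] > PLAUSIBILITY_CAP:
        rec["flagged_for_review"] = True  # held for human review, not dropped
    return rec
```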
Path 3: Right-sizing
Right-sizing is its own path because it doesn't operate on queries — it operates on compute primitives. The detectors here read load history, concurrency, and idle time across the daily curve, then compare against the vendor's tier catalog (Snowflake warehouse sizes, Databricks node types, RDS classes, etc.).
A right-sizing rec has the same shape as the others but the evidence field includes the load curve and concurrency stats so the workload owner can sanity-check the recommendation before applying it.
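One way to sketch the tier comparison, using a made-up capacity/cost catalog and an assumed 20% headroom factor (real catalogs and headroom policy are vendor-specific):

```python
# Hypothetical tier catalog: (name, capacity units, cost per hour).
# The real detectors read the vendor's catalog (Snowflake warehouse sizes,
# Databricks node types, RDS classes, etc.).
TIERS = [("XS", 1, 2.0), ("S", 2, 4.0), ("M", 4, 8.0), ("L", 8, 16.0)]

def right_size(peak_load_units, headroom=1.2):
    """Return the smallest tier whose capacity covers peak load plus headroom."""
    needed = peak_load_units * headroom
    for name, capacity, _cost in TIERS:
        if capacity >= needed:
            return name
    return TIERS[-1][0]  # the workload already needs the largest tier
```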
Severity & ranking
Severities: critical | high | medium | low. Severity is determined by:
- Estimated monthly savings (heuristic ceilings: critical >= $5k, high >= $1k, medium >= $250)
- Risk of not applying (idle endpoint billed 24/7 → high)
- Confidence (low-confidence recs get severity downgraded)
The default sort in the dashboard is estimated savings descending, but you can re-sort by category, vendor, or age.
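The severity heuristic can be sketched as follows; the savings ceilings are the documented ones, while the 0.5 low-confidence cutoff and the one-step downgrade are illustrative assumptions:

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

def severity_for(savings_usd, confidence, low_confidence=0.5):
    """Severity from estimated monthly savings, downgraded one step
    when confidence is low."""
    if savings_usd >= 5_000:
        sev = "critical"
    elif savings_usd >= 1_000:
        sev = "high"
    elif savings_usd >= 250:
        sev = "medium"
    else:
        sev = "low"
    if confidence < low_confidence:
        sev = SEVERITY_ORDER[max(0, SEVERITY_ORDER.index(sev) - 1)]
    return sev
```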
Savings estimation
Savings come from savings_estimator.py, which knows each vendor's pricing model:
- Snowflake credits → USD with the customer's contract rate (configurable)
- Databricks DBU → USD by SKU
- BigQuery slot-seconds and bytes scanned
- AWS RDS instance-hour pricing
- Cloud-native rates for compute
If your contract rate isn't configured, we use list price as the default and clearly mark the estimate.
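A sketch of the Snowflake branch, assuming a hypothetical list-price constant (actual list price varies by edition and region) and an illustrative return shape:

```python
SNOWFLAKE_LIST_PRICE_PER_CREDIT = 3.0  # placeholder; not a real quoted rate

def credits_to_usd(credits, contract_rate=None):
    """Convert Snowflake credits to USD, falling back to list price when no
    contract rate is configured. Returns (amount, is_list_price_estimate)
    so the UI can clearly mark list-price estimates."""
    if contract_rate is not None:
        return credits * contract_rate, False
    return credits * SNOWFLAKE_LIST_PRICE_PER_CREDIT, True
```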
Confidence calibration
After a recommendation is applied, the before/after impact panel computes actual savings from billing data. If the actual is below 50% of estimated, the recommendation is marked overstated and the savings estimator's bias is updated for similar future recs. This is what we mean by calibration.
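The calibration check can be sketched like this; the 50% threshold is documented, but the multiplicative bias update is an illustrative assumption:

```python
OVERSTATED_RATIO = 0.5  # actual below 50% of estimated => overstated

def calibrate(estimated, actual, bias=1.0):
    """Mark a rec overstated and nudge the estimator's bias for similar
    future recs (multiplicative update is a sketch, not the real formula)."""
    overstated = actual < OVERSTATED_RATIO * estimated
    if overstated and estimated > 0:
        bias *= actual / estimated  # shrink similar future estimates
    return overstated, bias
```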
Apply workflows
Three apply paths:
- Manual — copy the remediation, deploy yourself
- Auto-apply policy — define a policy in Settings → Auto-Apply scoped by category, severity, and vendor
- Ticket sync — push as JIRA / ServiceNow ticket; bi-directional status sync
DDL changes default to ticket sync. Auto-apply is enabled by default for low-risk categories like auto-suspend tuning.
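A minimal sketch of auto-apply policy matching, assuming a simple dict-based policy shape (real policies are defined in Settings → Auto-Apply):

```python
def policy_allows(policy, rec):
    """True when a rec falls inside the policy's category, severity,
    and vendor scope. Dict shapes here are illustrative."""
    return (rec["category"] in policy["categories"]
            and rec["severity"] in policy["severities"]
            and rec["vendor"] in policy["vendors"])
```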
Where to next
- Workflows & Integrations — JIRA / ServiceNow / Slack setup.
- Agents & Automation — what the AI agents can and can't do.