DATA 2027 · Week 13 · Part IV — Frontier & Futures

Self-Driving, Self-Assembling, Self-Designing

Twenty years of databases that promise to tune themselves — and the autonomous DBA that actually shipped is an agent with a master prompt and a connection string.

Lecture 1 — From Auto-Tuning to Self-Design · Lecture 2 — The Agent as DBA

Lecture 1 · Tuesday

From Auto-Tuning to Self-Design

OtterTune’s Gaussian processes, the Data Calculator’s design continuum, and why the company died while the capability didn’t.

L1 · The 2017 claim

Automated is not autonomous

L1 · Forecast first

Autonomy is a bet on the future workload

L1 · Forecasting

QueryBot 5000: template, cluster, predict

L1 · Forecasting

A few clusters carry the workload

95%

of query volume covered by the top 5 template clusters, across the workloads QueryBot 5000 studied (SIGMOD 2018).
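
The "template" step and the coverage figure can be illustrated with a small sketch. It assumes a regex-based normalizer, whereas a real forecaster would work on parsed SQL, and it stops before the clustering and prediction steps. The names `to_template` and `top_k_coverage` and the toy log are hypothetical, not taken from the QueryBot 5000 paper.

```python
import re
from collections import Counter

# Hypothetical normalizer: replace literals with "?" so that queries which
# differ only in their constants collapse into one template.
_STRING = re.compile(r"'(?:[^']|'')*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")
_SPACE = re.compile(r"\s+")


def to_template(sql: str) -> str:
    sql = _STRING.sub("?", sql)   # string literals first, so digits inside them vanish
    sql = _NUMBER.sub("?", sql)   # then bare numeric literals
    return _SPACE.sub(" ", sql).strip().lower()


def top_k_coverage(queries, k):
    """Fraction of query volume covered by the k most frequent templates."""
    counts = Counter(to_template(q) for q in queries)
    total = sum(counts.values())
    top = sum(n for _, n in counts.most_common(k))
    return top / total if total else 0.0


# Toy log, made up for illustration (not data from the paper).
log = (
    ["SELECT * FROM orders WHERE id = %d" % i for i in range(90)]
    + ["SELECT name FROM users WHERE email = 'u%d@example.com'" % i for i in range(8)]
    + ["DELETE FROM sessions WHERE expires < 1700000000",
       "UPDATE carts SET qty = 3 WHERE id = 17"]
)

print(top_k_coverage(log, 1))  # share of volume in the single largest template
print(top_k_coverage(log, 2))  # share of volume in the two largest templates
```

In this toy log, 100 statements collapse into 4 templates, and the two largest account for 98 of them. That skew is what makes forecasting per cluster tractable.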

L1 · Forecasting

Week 13’s client doesn’t sleep

L1 · Planning

A control problem in an ML costume

L1 · The loop

The loop is the product

[Figure: a four-stage loop. FORECAST (cluster + predict arrivals) → PLAN (benefit − apply cost) → ACT (index / knob / partition) → OBSERVE (did p99 actually move?). Guardrails, drawn as a dashed line: every action reversible; never explore on the critical path. Every box has been swapped out since 2017, except the dashed line.]
Fig. — Pavlo et al., CIDR 2017. OtterTune swapped PLAN for a Gaussian process; the Data Calculator widened ACT; the 2026 LLM-DBA replaces the planner. The guardrail line never changed.
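
One pass of the loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the CIDR 2017 architecture: the `Action` fields, the millisecond units, and the `plan` and `step` helpers are all hypothetical, and the forecast is assumed to have already produced each candidate's predicted benefit. The "never explore on the critical path" guardrail is not modelled here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    name: str
    predicted_benefit_ms: float  # forecast p99 saving (assumed to come from FORECAST)
    apply_cost_ms: float         # cost of applying the change
    reversible: bool


def net_benefit(a: Action) -> float:
    return a.predicted_benefit_ms - a.apply_cost_ms


def plan(candidates: List[Action]) -> Optional[Action]:
    """PLAN: best net benefit among reversible actions, or None if nothing pays off."""
    safe = [a for a in candidates if a.reversible]  # guardrail: reversible only
    best = max(safe, key=net_benefit, default=None)
    if best is None or net_benefit(best) <= 0:
        return None
    return best


def step(candidates: List[Action],
         measure_p99: Callable[[], float],
         apply: Callable[[Action], None],
         revert: Callable[[Action], None]) -> str:
    """One PLAN -> ACT -> OBSERVE pass; revert if p99 did not actually move down."""
    action = plan(candidates)
    if action is None:
        return "no-op"
    before = measure_p99()
    apply(action)                # ACT
    after = measure_p99()        # OBSERVE
    if after >= before:
        revert(action)
        return f"reverted {action.name}"
    return f"kept {action.name}"
```

The planner trusts the forecast only as far as choosing what to try; whether a change stays is decided by the measurement taken after it is applied.
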
L1 · OtterTune

The knob problem

~350

configuration knobs in PostgreSQL (MySQL: 500+). They interact nonlinearly, and the defaults are tuned for a machine from 2008.

L1 · OtterTune

Tuning as black-box optimization

L1 · OtterTune

Postmortem: structural, not scientific

  • Founded 2020, ~$14.5M raised; shut down 2024.
  • Tuning is episodic — what does month nine’s subscription buy?
  • The channel belonged to the clouds: knob access needs provider cooperation.
  • By 2024 the capability had dissolved into the platforms, where it lives on as better defaults.
L1 · History

The advisor-mode plateau

L1 · Data Calculator

Stop tuning structures, derive them

10³²

valid two-node-type designs from ~50 layout primitives (Idreos et al., SIGMOD 2018). B-trees, LSM-trees, tries, hash tables: just the famous coordinates.

L1 · Data Calculator

Cost synthesis: design as navigation

Lecture 2 · Thursday

The Agent as DBA

The first broadly deployed autonomous DBA is not inside the engine — it holds a connection string and a master prompt.

L2 · The twist

The 2017 roadmap didn’t predict this

L2 · What changes

1 — The tuner can read

L2 · What changes

2 — Experiments on branches

L2 · What changes

Fork reality, then ask permission

[Figure: PROD (live traffic) → BRANCH (CoW fork, secs) → REPLAY (captured workload) → GATE (falsifiable thresholds) → PR (human merges). Flow: hypothesis → experiment → result → pull request. Rules: DDL only on branches; production changes only via reviewed migration PRs.]
Fig. — The 2026 experiment loop: the agent never touches production directly; it forks, replays, gates, and opens a PR a human can argue with.
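
The routing rule in the figure can be sketched as a first-pass statement filter. It assumes that read-only observation of production is acceptable and that everything else must go through a migration PR. The `may_execute` and `route` helpers and the `"prod"` target label are hypothetical. A prefix check like this is not a security boundary: real enforcement belongs in database roles and permissions.

```python
import re

# Assumption: on prod, only plain read-only statements may run directly.
_READ_ONLY = re.compile(r"^\s*(select|show)\b", re.IGNORECASE)


def may_execute(sql: str, target: str) -> bool:
    """Branches accept anything; prod accepts only read-only statements."""
    if target != "prod":
        return True
    return bool(_READ_ONLY.match(sql))


def route(sql: str, target: str) -> str:
    """Return 'execute' or 'migration-pr' for a statement aimed at a target."""
    return "execute" if may_execute(sql, target) else "migration-pr"


print(route("ALTER SYSTEM SET max_wal_size = '8GB'", "tune/checkpoint-2026-06-09"))
print(route("ALTER SYSTEM SET max_wal_size = '8GB'", "prod"))
```

The same statement is executed on the branch and diverted to a reviewed PR when aimed at production, which is the point of the figure.
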
L2 · What changes

Tuning run #47, verbatim

-- branch: tune/checkpoint-2026-06-09 (fork of prod@LSN 0/8A3F1C40)
-- hypothesis: p99 spikes align with checkpoints (~every 140s)
ALTER SYSTEM SET max_wal_size = '8GB';                 -- was 1GB
ALTER SYSTEM SET checkpoint_completion_target = 0.9;   -- was 0.5
-- replay: 30 min, 14,212 statements, agent-traffic mix 71%
-- result: p99 412ms → 287ms · p50 38ms → 37ms
--         WAL volume +9% · recovery-time est. +6.2 min
-- gate:   p99 −20%, WAL ≤ +15%, recovery ≤ +10 min  → PASS
-- action: open PR with diff + transcript; do NOT touch prod
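
The gate line in run #47 can be checked mechanically. The sketch below assumes "p99 −20%" means a drop of at least 20%, and that the WAL and recovery limits are upper bounds on the change. The `Result` and `gate` names are hypothetical; the numbers are those in the transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    p99_before_ms: float
    p99_after_ms: float
    wal_change_pct: float       # relative change in WAL volume, percent
    recovery_change_min: float  # change in estimated recovery time, minutes


def gate(r: Result,
         min_p99_drop_pct: float = 20.0,
         max_wal_pct: float = 15.0,
         max_recovery_min: float = 10.0):
    """Return (passed, p99_drop_pct) against thresholds fixed before the replay."""
    p99_drop_pct = 100.0 * (r.p99_before_ms - r.p99_after_ms) / r.p99_before_ms
    passed = (
        p99_drop_pct >= min_p99_drop_pct
        and r.wal_change_pct <= max_wal_pct
        and r.recovery_change_min <= max_recovery_min
    )
    return passed, p99_drop_pct


run_47 = Result(p99_before_ms=412.0, p99_after_ms=287.0,
                wal_change_pct=9.0, recovery_change_min=6.2)
passed, drop = gate(run_47)
print(passed, round(drop, 1))
```

For run #47 the p99 drop is (412 − 287) / 412, about 30.3%, so all three conditions hold and the gate passes. Because the thresholds are written down before the replay, the result is falsifiable rather than a judgment made after the fact.
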
L2 · What changes

3 — It explains itself

L2 · What doesn’t change

Now the cold water

L2 · The harness

Guardrails, evals, rollback

L2 · Field note

The 40-minute lock

L2 · Honest ledger

Learned components, 2026 scorecard

Component | Status 2026 | Why
Automatic indexing | Production, fleet-scale | Azure SQL since 2019; verify + auto-revert
Knob tuning | Absorbed into platforms | OtterTune dead 2024; lives as defaults
Optimizer steering | Production, narrow | Bao-style hints; classical optimizer as floor
Learned cardinality | Advisor-mode / lab | Wins benchmarks; loses on drift, tail risk
Learned indexes | Niche | Absorbed into LSM parts; B-tree undefeated
LLM agent as DBA | Early production, gated | Branch-only DDL, evals, human-merged PRs
L2 · The pattern

The cost of being wrong

Autonomy never shipped as a product you buy. It shipped as a layer you stop noticing.
— Week 13 lecture notes
Checkpoint · Discussion

Before you leave

Readings · Week 13

Read before Thursday