DATA 2027: Data Systems in the Agentic Era

Prologue

Field Notes: What the Flagships Actually Teach

Before inventing a course, survey the territory. We read the current syllabi of every flagship database course at Stanford, MIT, Harvard, CMU, and Berkeley — the actual 2025–2026 lecture schedules, not the catalog blurbs. The result is a snapshot of a discipline mid-pivot: the AI era has breached the core curriculum at exactly two schools, is knocking at a third, and has left the other two split between classical rigor and a vacated chair.

MIT

6.5830 · SPRING 2026 · CAFARELLA & LI

The bellwether. The core grad DB course’s 24-lecture arc now includes Lecture 17: Vector Databases and Lecture 22: Semantic Operators. LLM-powered relational operators, taught from the group’s own Palimpzest (CIDR ’25), have entered the canon alongside ARIES and C-Store. Kraska is on leave running applied science at AWS, where SageDB’s learned layouts shipped inside Redshift.

Harvard

CS265 · SPRING 2026 · IDREOS

The boldest rename: CS265 is now literally “Big Data & AI Systems” — NoSQL, LLMs, RAG, image AI. The systems project: build a full NoSQL engine and a full LLM, in C/C++. DASlab’s “self-designing systems” program (Data Calculator, Monkey, Dostoevsky) now treats RAG pipelines and LLM serving as data systems with navigable design spaces.

CMU

15-445/645 · SPRING 2026 · 15-721 · FALL 2025

Vector indexes are now in the intro indexing lectures of 15-445 — commoditized, as Pavlo predicted. But 15-721 (Advanced, now under Jignesh Patel) stays deliberately classical: vectorized execution, OCC, cloud warehouses, lakehouses. Pavlo’s 2025 year-in-review: every DBMS shipped an MCP server, agents provision the databases, and nobody has solved the security story.

Stanford

CS245 DORMANT · CS336/224V/224G ACTIVE

The famous data-intensive systems course is effectively frozen at its 2021 edition — Zaharia left for Berkeley and the Databricks CTO chair. The energy moved sideways: CS336 makes students build Common Crawl dedup pipelines (data engineering as part of the model), CS224V teaches SUQL’s hybrid SQL+retrieval queries, and the LOTUS semantic-operators work (VLDB ’25) came from the Zaharia–Guestrin orbit.

Berkeley

CS294 RDI MOOCS · EPIC LAB

No flagship grad DB course since 2020 — but the world’s largest agent courses (LLM Agents, Agentic AI; five-thousand-student MOOCs) and the EPIC lab’s DocETL (VLDB ’25), the third leg of the semantic-operator stool. Hellerstein’s Aqueduct pivoted to RunLLM; his verdict: enterprise picks-and-shovels LLM tooling “is not ready yet.”

The Gap

NOBODY TEACHES THE WHOLE STACK

Vector internals live at CMU, semantic operators at MIT, self-designing systems at Harvard, data curation at Stanford, agents at Berkeley. No single course teaches the agentic data stack end-to-end — storage physics to semantic layer to agent governance. That course is below.

One quote from the field survey deserves framing, because it is the entire reason the course below exists. Michael Stonebraker — Turing laureate, creator of Ingres and Postgres — ran text-to-SQL against MIT’s own data warehouse in 2025 and reported:

“An accuracy of zero — not low, zero.” Private data, idiosyncratic terminology, semantic overlap, complex queries. His corollary: agentic AI is about to go read-write, and what it needs is ACID — “durable computing is basically the D in ACID.” — Michael Stonebraker, 2025 year-in-review interview; his DBOS project has pivoted to durable execution for “crashproof AI agents”

Zero on MIT’s warehouse; ninety-five percent at Anthropic with curated context. The entire pedagogy of the agentic era lives in the space between those two numbers — and no course on Earth currently teaches students how to cross it.

Section 1

The Course

MIT 6.5837 · Stanford CS 345A · Harvard CS 2650 — Spring 2027

DATA 2027: Data Systems in the Agentic Era

Units / Pattern

12 units (3-0-9) · Tu/Th 1:00–2:30 lecture · Fri 90-min systems lab

Prerequisites

A database internals course (e.g., 6.5830, CS 245, or CS 165); systems programming in Rust, C++, or Go; fluency with LLM APIs. Comfort reading two papers a week unassisted.

Instructor archetype

A database-internals veteran who shipped a commercial storage engine, took a two-year detour through an AI lab, and came back angry about both communities’ blind spots. Office hours run like CIDR hallway arguments.

Audience

PhD students in data systems and ML systems; MEng students who want to build the next Snowflake or kill the current one; staff engineers on sabbatical.

For fifty years, every database design decision — the buffer pool, the optimizer’s cost model, the isolation-level menu, SQL itself — quietly assumed a human at the other end of the connection: tens of queries per session, seconds of patience, intent behind every WHERE clause. That client is gone. The dominant client of the late 2020s issues thousands of speculative queries per task, branches and abandons entire database states, retrieves by similarity as often as by key, knows nothing about your schema except what fits in its context window, and lies confidently when the data model confuses it. Thesis: this is not a feature request; it is a workload regime change on the scale of the OLTP→OLAP split, and it invalidates assumptions at every layer of the stack. This course rebuilds database systems knowledge from storage engines upward, asking at each layer what survives, what bends, and what breaks when the client is a model.

The arc follows the stack: Part I re-reads the classical canon as a set of workload bets about to be violated. Part II covers the new access methods and engine architectures. Part III climbs to where the accuracy actually lives — semantics, agents, governance. Part IV asks the field’s oldest question of its newest material.

Section 2

Fourteen Weeks

Part I — Foundations Under New Workloads

Week 01

The Client Has Changed

The classical architecture as baseline; the agentic workload as the perturbation — bursty, speculative, semantically addressed, schema-ignorant. The semester’s question: which of Hellerstein’s five components survive contact?

Readings
Architecture of a Database System — Hellerstein, Stonebraker & Hamilton, FnT DB 2007 (§1–4)
What Goes Around Comes Around — Stonebraker & Hellerstein, Readings in Database Systems, 4th ed., 2005
How Anthropic Enables Self-Service Data Analytics with Claude — Anthropic, June 2026

Week 02

B-Trees, LSM-Trees & the RUM Triangle

B-trees and LSM-trees as the two stable attractors of fifty years of storage design. Do agent workloads — write-heavy memory accumulation plus read-heavy semantic recall — sit at a RUM corner nobody optimized for?

Readings
Database Internals — Petrov, 2019, ch. 2–4 & 7
The RUM Conjecture — Athanassoulis et al., EDBT 2016
The Log-Structured Merge-Tree — O’Neil et al., 1996
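
To make this week's trade-off concrete, here is a minimal sketch of an LSM write path and read path. The names TinyLSM, memtable_limit, and the runs-probed counter are illustrative assumptions, not taken from any of the readings, and the toy omits the WAL, bloom filters, and leveled layout that real engines rely on.

```python
class TinyLSM:
    """Toy LSM: an in-memory memtable plus a list of immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []                      # oldest run first, newest run last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes are cheap: they only touch the memtable until it fills up.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.runs.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        """Return (value, runs_probed); runs_probed approximates read cost."""
        if key in self.memtable:
            return self.memtable[key], 0
        for probed, run in enumerate(reversed(self.runs), start=1):
            if key in run:                  # newest run wins
                return run[key], probed
        return None, len(self.runs)

    def compact(self):
        # Merge oldest to newest so newer values overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [dict(sorted(merged.items()))]


db = TinyLSM(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("c", 3), ("a", 9)]:
    db.put(k, v)

before = db.get("b")    # found only in the oldest run
db.compact()
after = db.get("b")     # a single merged run remains
print(before, after)
```

Compaction spends write work to shrink the number of runs a read must probe, which is the read-versus-update tension the RUM paper formalizes.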

Week 03

One Size Fits None

Stonebraker’s 2005 polemic as method: identify the workload assumption baked into the engine, then violate it on purpose — exactly what agents do to row stores and vector stores alike.

Readings
“One Size Fits All”: An Idea Whose Time Has Come and Gone — Stonebraker & Çetintemel, ICDE 2005
C-Store — VLDB 2005
MonetDB/X100: Hyper-Pipelining Query Execution — Boncz et al., CIDR 2005

Week 04

Disaggregation & Elasticity

Separating compute from storage was the architectural move of the 2010s; it turns out to be the precondition for agentic workloads, whose demand curves look like seismographs.

Readings
Amazon Aurora — Verbitski et al., SIGMOD 2017
The Snowflake Elastic Data Warehouse — Dageville et al., SIGMOD 2016
Building an Elastic Query Engine on Disaggregated Storage — NSDI 2020

Part II — New Access Methods & Engines

Week 05

Learned Components

If indexes are models, what else is? The learned-index paper as provocation; Bao as the mature, deployable version — ML steering a classical optimizer rather than replacing it, a pattern that recurs all semester.

Readings
The Case for Learned Index Structures — Kraska et al., SIGMOD 2018
Bao: Making Learned Query Optimization Practical — Marcus et al., SIGMOD 2021
SageDB — CIDR 2019

Week 06

Vector Indexes Are Access Methods, Not Products

HNSW and DiskANN are this generation’s B-tree and LSM: one memory-resident graph, one SSD-resident graph, both approximate. Recall becomes a first-class resource alongside read, update, and memory cost — foreshadowing the exam.

Readings
HNSW — Malkov & Yashunin, IEEE TPAMI 2018
DiskANN — Subramanya et al., NeurIPS 2019
Product Quantization — Jégou et al., TPAMI 2011
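
Since recall is treated here as a resource to be measured, a small sketch of how recall@k is computed may help. The "approximate index" below is only a stand-in (brute force over a random 30% sample), not HNSW or DiskANN, and the function names and data sizes are assumptions chosen for illustration.

```python
import random


def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(query, vectors, k, candidates=None):
    """Brute-force top-k ids, optionally restricted to a list of candidate ids."""
    ids = range(len(vectors)) if candidates is None else candidates
    return sorted(ids, key=lambda i: l2(query, vectors[i]))[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate search also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(1000)]
query = [random.random() for _ in range(8)]

exact = knn(query, vectors, k=10)

# Stand-in for an approximate index: only inspect a random 30% of the data.
sample = random.sample(range(len(vectors)), 300)
approx = knn(query, vectors, k=10, candidates=sample)

print(f"recall@10 = {recall_at_k(approx, exact):.2f}")
```

Shrinking the sample makes the search cheaper and the recall lower, which is the knob that graph parameters and quantization levels expose in the real access methods.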

Week 07

The Lakehouse & Open Formats

When every agent framework wants direct Parquet access, the table format becomes the database. Iceberg’s metadata tree, Photon’s vectorized execution over open files — the engine dissolving into the lake.

Readings
Lakehouse — Armbrust et al., CIDR 2021
Apache Iceberg Table Spec v2 + Ryan Blue, Netflix 2018
Photon — Behm et al., SIGMOD 2022

Week 08

Transactions & Branching for Agent Swarms

A thousand agents exploring hypotheticals need cheap forks, not just isolation levels. Calvin and Spanner as the deterministic-vs-clock poles; Neon’s copy-on-write storage as the third thing agents actually want: the database as a versioned, branchable object.

Readings
Calvin — Thomson et al., SIGMOD 2012
Spanner — Corbett et al., OSDI 2012
Neon: Architecture Decisions — neon.tech engineering, 2022–24

Part III — Semantics, Agents, Governance

Week 09

Text-to-SQL Is Not Solved; It’s Specified

Spider 2.0 and BIRD show frontier models failing on real enterprise schemas — not from weak SQL skills but from missing semantics. The semantic layer re-emerges as the schema-for-models: metrics, joins, and meaning as a queryable contract.

Readings
Spider 2.0 — Lei et al., ICLR 2025 (oral)
BIRD — Li et al., NeurIPS 2023
Semantic Layer vs Text-to-SQL benchmarks — Cube / dbt Labs, 2026
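
One way to picture "metrics, joins, and meaning as a queryable contract" is a tiny compiler from named metrics to SQL. This is a hedged sketch, not the Cube or dbt API: the SEMANTIC_LAYER dictionary, the compile_query function, and the orders table are hypothetical names invented for the example.

```python
import sqlite3

# Hypothetical contract: the model may only name these metrics and dimensions.
SEMANTIC_LAYER = {
    "table": "orders",
    "metrics": {"revenue": "SUM(amount)", "order_count": "COUNT(*)"},
    "dimensions": {"region": "region"},
}


def compile_query(metric, dimension, layer=SEMANTIC_LAYER):
    """Turn a (metric, dimension) request into SQL, rejecting unknown names."""
    if metric not in layer["metrics"]:
        raise KeyError(f"unknown metric: {metric}")
    if dimension not in layer["dimensions"]:
        raise KeyError(f"unknown dimension: {dimension}")
    dim = layer["dimensions"][dimension]
    expr = layer["metrics"][metric]
    return (
        f"SELECT {dim} AS {dimension}, {expr} AS {metric} "
        f"FROM {layer['table']} GROUP BY {dim} ORDER BY {dim}"
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'EU', 10.0), (2, 'EU', 5.0), (3, 'US', 7.0);
    """
)

rows = conn.execute(compile_query("revenue", "region")).fetchall()
print(rows)
```

The point of the design is that the model chooses from a closed vocabulary, so a hallucinated column becomes a rejected request instead of a plausible-looking wrong answer.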

Week 10

Semantic Operators

What if sem_filter and sem_join were relational operators with cost models, not prompt spaghetti? LOTUS, Palimpzest, and DocETL converge from three directions: declarative LLM pipelines deserve an optimizer, and accuracy joins latency and cost in the objective.

Readings
Semantic Operators (LOTUS) — Patel et al., VLDB 2025
Palimpzest — Liu et al., CIDR 2025
DocETL — Shankar et al., VLDB 2025

Week 11

Memory Is a Database Problem

Agent memory systems are reinventing materialized views, temporal databases, and log compaction — with worse durability stories. We read Mem0 and Graphiti as database designs and grade them as such.

Readings
Mem0 — Chhikara et al., arXiv 2504.19413
Zep/Graphiti: Temporal KG for Agent Memory — Rasmussen et al., arXiv 2501.13956
MemGPT — Packer et al., arXiv 2310.08560

Week 12

Protocols, Permissions & the Lethal Trifecta

MCP is becoming ODBC for agents — shipping with none of ODBC’s hard-won governance. Private data + untrusted content + exfiltration channels is the new SQL injection; we design row-level security for clients that can be prompt-injected.

Readings
Model Context Protocol Specification — 2024–26
The Lethal Trifecta — Willison, 2025 (incl. the Supabase MCP exfiltration case)
OWASP Top 10 for LLM Applications — 2025 rev.

Part IV — Frontier & Futures

Week 13

Self-Driving, Self-Assembling, Self-Designing

Pavlo’s self-driving DBMS agenda predates agentic AI but predicts it; Idreos’s self-designing systems generalize it. Now the tuner is a general agent — does the DBA survive, and does the optimizer? OtterTune’s death as the cautionary case study.

Readings
Self-Driving Database Management Systems — Pavlo et al., CIDR 2017
The Data Calculator — Idreos et al., SIGMOD 2018
OtterTune postmortem — 2024

Week 14

What Goes Around

Every data-model rebellion gets reabsorbed by the relational core — the field’s elders have run this argument for forty years and never lost. Final debate: are vectors, semantic operators, and agent memory the next XML, or the first genuine exception in fifty years?

Readings
What Goes Around Comes Around… And Around — Stonebraker & Pavlo, SIGMOD Record 2024
The Seattle Report on Database Research — Abadi et al., CACM 2022

Section 3

The Labs

Four labs, each one a thesis in miniature. Each is graded on Pareto frontiers and ablation tables, not point estimates: the Anthropic discipline, enforced from the first lab.

Lab 1 · Weeks 2–4

VLSM: an LSM-tree with vector segments

Extend a skeleton LSM engine (Rust) so each SSTable carries a quantized vector segment with a per-level HNSW graph; implement compaction that merges graphs. Measure the read / update / memory / recall frontier under a synthetic agent workload. Deliverable: a Pareto plot, not a number.
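
As a hint at what "a Pareto plot, not a number" means in practice, the sketch below filters a set of measured configurations down to the non-dominated ones across the four axes. The configuration names and every number in it are made-up placeholders, not measurements, and the Config type is an assumption of this example.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str
    read_ms: float     # lower is better
    update_ms: float   # lower is better
    memory_gb: float   # lower is better
    recall: float      # higher is better


def as_costs(c):
    """Map every axis to 'lower is better' so the axes compare uniformly."""
    return (c.read_ms, c.update_ms, c.memory_gb, -c.recall)


def dominates(a, b):
    """True if a is no worse than b on every axis and differs on at least one."""
    ca, cb = as_costs(a), as_costs(b)
    return all(x <= y for x, y in zip(ca, cb)) and ca != cb


def pareto_front(configs):
    """Keep only configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Placeholder numbers for illustration only.
runs = [
    Config("hnsw-m16", 1.2, 4.0, 8.0, 0.97),
    Config("hnsw-m8", 1.0, 3.0, 5.0, 0.92),
    Config("pq-only", 0.8, 1.0, 1.0, 0.70),
    Config("bad", 1.5, 4.5, 8.5, 0.90),
]

front = pareto_front(runs)
print([c.name for c in front])
```

The "bad" configuration drops out because another run beats it on all four axes; the three survivors each win somewhere, which is why a single score would hide the trade-off.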

Lab 2 · Weeks 5–8

Mini-Neon: copy-on-write pages over S3

Build a page layer over object storage: WAL ingestion, page materialization at any LSN, and CREATE BRANCH in O(metadata). Demonstrate 50 concurrent agent branches forked from one parent, with garbage collection of the abandoned ones.
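
The core idea of a branch created in O(metadata) can be sketched in memory, assuming a branch is nothing more than a parent pointer, a fork LSN, and its own page versions. The Branch class and its methods are hypothetical names for this illustration; the sketch ignores S3, the WAL format, and garbage collection, all of which the lab itself requires.

```python
class Branch:
    """A branch stores only its own page versions plus a pointer to its parent."""

    def __init__(self, parent=None, fork_lsn=0):
        self.parent = parent
        self.fork_lsn = fork_lsn   # parent state visible to this branch
        self.lsn = fork_lsn        # this branch's own timeline continues from here
        self.versions = {}         # page_id -> [(lsn, data), ...] in LSN order

    def write(self, page_id, data):
        self.lsn += 1
        self.versions.setdefault(page_id, []).append((self.lsn, data))
        return self.lsn

    def read(self, page_id, at_lsn=None):
        """Materialize a page as of at_lsn (defaults to the branch head)."""
        lsn = self.lsn if at_lsn is None else at_lsn
        for v_lsn, data in reversed(self.versions.get(page_id, [])):
            if v_lsn <= lsn:
                return data
        if self.parent is not None:
            # Never look past the fork point in the parent's history.
            return self.parent.read(page_id, min(lsn, self.fork_lsn))
        return None

    def branch(self, at_lsn=None):
        # No pages are copied: creating a branch is pure metadata.
        return Branch(parent=self, fork_lsn=self.lsn if at_lsn is None else at_lsn)


main = Branch()
main.write("p1", b"v1")
forks = [main.branch() for _ in range(50)]   # 50 agent branches, zero page copies
main.write("p1", b"v2")                      # later write on main
forks[0].write("p2", b"scratch")             # private write on one fork

print(forks[0].read("p1"), main.read("p1"), main.read("p1", at_lsn=1))
```

Each fork keeps seeing the parent as of its fork point, and writes on one branch are invisible to the others, which is what lets abandoned branches be dropped without touching shared pages.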

Lab 3 · Weeks 9–10

Text-to-SQL agent + eval harness

Build an agent over a 126-table enterprise-style schema; construct a BIRD-style eval set; then add a Cube-style semantic layer and report execution accuracy with and without it, broken down by error class: wrong join, wrong metric, hallucinated column. Reproduce the 21%→95% curve yourself.
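
A minimal sketch of the harness's scoring core, assuming execution accuracy means "the predicted query returns the same multiset of rows as the gold query". The grade function, the toy orders table, and the class labels are assumptions of this example; only the hallucinated-column class is detected automatically here, and separating wrong joins from wrong metrics would need additional labeling.

```python
import sqlite3
from collections import Counter


def run(conn, sql):
    """Execute a query and return its rows as an order-insensitive multiset."""
    return Counter(conn.execute(sql).fetchall())


def grade(conn, gold_sql, pred_sql):
    """Classify one prediction by executing it against the database."""
    try:
        pred = run(conn, pred_sql)
    except sqlite3.OperationalError as exc:
        if "no such column" in str(exc):
            return "hallucinated column"
        return "invalid sql"
    return "correct" if pred == run(conn, gold_sql) else "wrong result"


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'EU', 10.0), (2, 'EU', 5.0), (3, 'US', 7.0);
    """
)

gold = "SELECT region, SUM(amount) FROM orders GROUP BY region"
predictions = [
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region",
    "SELECT region, COUNT(*) FROM orders GROUP BY region",       # wrong metric
    "SELECT region, SUM(revenue) FROM orders GROUP BY region",   # no such column
]

report = Counter(grade(conn, gold, p) for p in predictions)
accuracy = report["correct"] / len(predictions)
print(dict(report), f"execution accuracy = {accuracy:.2f}")
```

Running the same report with and without the semantic layer gives the two columns of the comparison the lab asks for.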

Lab 4 · Weeks 10–12

A semantic-operator optimizer

Implement sem_filter, sem_join, sem_topk over a 25,000-document corpus with a cost model spanning model tiers, cascades with cheap-proxy scoring, and an accuracy budget. Beat the naive plan by 10× on cost at ≤2% quality loss.
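
The cascade idea can be sketched as follows: a cheap proxy scores every document, and only the uncertain middle band is sent to the expensive model. Everything named here is an assumption for illustration: cascade_filter, the lo and hi thresholds, the keyword-overlap proxy, and the oracle function, which stands in for an LLM call and is not one.

```python
def cascade_filter(docs, proxy_score, oracle, lo=0.2, hi=0.8):
    """Return (kept_docs, oracle_calls) for a two-tier semantic filter."""
    kept, oracle_calls = [], 0
    for doc in docs:
        score = proxy_score(doc)
        if score >= hi:
            kept.append(doc)            # confident accept, no expensive call
        elif score > lo:
            oracle_calls += 1           # uncertain band: ask the expensive tier
            if oracle(doc):
                kept.append(doc)
        # score <= lo: confident reject, no expensive call
    return kept, oracle_calls


QUERY_TERMS = {"refund", "damaged"}


def proxy_score(doc):
    """Cheap proxy: fraction of query terms that appear in the document."""
    return len(QUERY_TERMS & set(doc.split())) / len(QUERY_TERMS)


def oracle(doc):
    """Stand-in for an LLM judgment; a real system would call a model here."""
    return "refund" in doc


docs = [
    "refund for damaged goods",
    "refund timeline",
    "damaged reputation of a brand",
    "holiday schedule",
]

kept, calls = cascade_filter(docs, proxy_score, oracle)
print(kept, f"oracle calls: {calls} of {len(docs)}")
```

Widening the band between lo and hi buys accuracy with more oracle calls, so the two thresholds are where a cost model and an accuracy budget would meet.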

Section 4

Final Projects & Grading

Teams of two; deliverable is a CIDR-format paper plus a working artifact. The menu — each item is a publishable question wearing a project’s clothing:

Recall-aware compaction. An LSM compaction policy that co-optimizes graph quality and write amplification.

Transactional agent memory. Mem0-class memory with snapshot isolation, time travel, and an audit log; benchmark against Graphiti.

An MCP gateway with provenance-based row security. Policy enforcement that survives prompt injection. Then red-team it.

Speculative query execution. Predict an agent’s next five queries from its trace and pre-execute them on branches.

A semantic-layer compiler. From dbt/Cube definitions to model-optimized context, with token-budget-aware schema elision.

Bao-for-LOTUS. Learned plan steering for semantic-operator pipelines using execution feedback.

Branch-native CI for data. Iceberg snapshots + agent-generated data tests as a “pull request for tables” system.

The agent-native TPC. Design and publish a benchmark — workload generator plus metrics — for agentic database clients. The field needs this more than it needs another engine.

40% labs

35% final project

15% paper responses + debate

10% exam

Database courses teach the system; ML courses teach the model. Nobody teaches the interface — which is where the next decade of both fields will be decided.

Why this course exists: the agent era is producing a generation of engineers who treat the database as a vibes-based retrieval API, and a generation of database researchers who treat LLMs as a noisy UDF. Both are wrong, and the cost of being wrong is measured in hallucinated joins shipped to production and in storage engines optimized for clients that no longer exist. This course exists to produce people who can hold Petrov’s page layouts and a transformer’s context window in their head at the same time — because the systems that win the agentic era will be built by exactly those people, and there are currently about two hundred of them on Earth.

Section 5

The Synthesis Exam

Take-home, open-everything — including frontier models, because pretending otherwise teaches the wrong lesson. Each question requires holding two layers of the stack in tension; none can be answered from one community’s literature alone.

RUM-R. The RUM conjecture says you optimize two of read, update, and memory at the expense of the third. Formalize a fourth axis — recall — for approximate access methods: define the trade-off space, prove or refute a four-way impossibility analogous to the original conjecture, and place HNSW, DiskANN, and your Lab 1 engine in it.

Aurora for agents. Aurora’s claim is “the log is the database.” Neon’s is “the log is the database, and the database is a tree of branches.” Derive the write-amplification and storage-cost consequences of 10,000 speculative agent branches per hour under each architecture, and identify the workload crossover point.

Stonebraker’s wager. Using the argumentative structure of What Goes Around Comes Around… And Around (2024), write the 2034 sequel’s section on semantic operators: argue either that LOTUS-style operators get absorbed into SQL (cite the precedent) or that the accuracy/cost dimension makes them the first true escape from the relational gravity well.

The optimizer’s new objective. Classical optimizers minimize cost for a fixed-correctness answer; semantic-operator optimizers trade cost against accuracy. Show formally why System R’s dynamic programming breaks under a (cost, accuracy) partial order, and propose a plan-enumeration strategy that doesn’t.

Isolation for liars. Two agents under read-committed each read a table, summarize it into their memory stores, and act on stale summaries — a semantic phantom. Define an isolation level for derived-belief consistency, specify its versioning protocol, and argue whether it belongs in the DBMS, the memory layer, or the protocol.

❦

Appendix

Sources & Provenance

[01] MIT 6.5830 Spring 2026 — course site, lecture schedule (Lec 17 Vector DB, Lec 22 Semantic Operators); instructors Cafarella & Li.
[02] Harvard CS265 “Big Data & AI Systems,” Spring 2026, Idreos — site, syllabus; CS165 Fall 2025 — site.
[03] CMU 15-445/645 Spring 2026 — syllabus (vector indexes in intro indexing); 15-721 Fall 2025 (Patel) — site.
[04] Pavlo, “Databases in 2025: A Year in Review” (MCP everywhere; agentic provisioning; security warning).
[05] Stanford: CS245 (frozen at Winter 2021); CS336 (Common Crawl data labs); CS224V (SUQL); CS224G; CS528 MLSys seminar.
[06] Berkeley: RDI agent courses — LLM Agents F24, Agentic AI F25; EPIC lab DocETL, VLDB 2025; Hellerstein on LLM tooling, Firebolt interview.
[07] Stonebraker “accuracy of zero” + ACID-for-agents: 2025 year-in-review interview; DBOS durable execution.
[08] Semantic operators: LOTUS — Patel et al., VLDB 2025; Palimpzest — Liu et al., CIDR 2025; MIT DSG project page.
[09] Benchmarks: Spider 2.0 — ICLR 2025 oral; BIRD — NeurIPS 2023, arXiv 2305.03111.
[10] Learned components: Kraska et al. SIGMOD 2018; Bao — Marcus et al., SIGMOD 2021; SageDB — CIDR 2019; MIT DSAIL; Redshift learned layouts — Amazon Science.
[11] Vector internals: HNSW — TPAMI 2018; DiskANN — NeurIPS 2019; PQ — Jégou et al., TPAMI 2011.
[12] Canon: Hellerstein/Stonebraker/Hamilton FnT 2007; C-Store VLDB 2005; MonetDB/X100 CIDR 2005; Aurora SIGMOD 2017; Snowflake SIGMOD 2016; Lakehouse CIDR 2021; Photon SIGMOD 2022; Calvin SIGMOD 2012; Spanner OSDI 2012; Stonebraker & Pavlo, SIGMOD Record 2024.
[13] Agent memory: Mem0 — arXiv 2504.19413; Zep/Graphiti — arXiv 2501.13956; MemGPT — arXiv 2310.08560.
[14] Security: Willison, the lethal trifecta / Supabase MCP case; Supabase, Defense in Depth for MCP Servers; Datadog Security Labs Postgres MCP injection.
[15] Context engineering: Anthropic, self-service analytics (June 2026) and effective context engineering; EvalGen — Shankar et al., UIST 2024.
[16] OtterTune (2020–2024) postmortem — ottertune.com; Idreos, The Data Calculator — SIGMOD 2018; Pavlo et al., Self-Driving DBMS — CIDR 2017.