Week 14: What Goes Around

Learning objectives — after this week you can…

Reconstruct the six major data-model rebellions (hierarchical/network, OODB, XML, MapReduce, NoSQL/document, graph) and the specific workload pain each one was right about.
State the reabsorption mechanism precisely: which lever the standard pulls (a type, an operator, an index, an optimizer rule) for each rebellion.
Steelman both sides of the exception debate: absorption evidence (pgvector, SQL/PGQ, vector-as-feature) versus exception evidence (accuracy as a result dimension, dollars per operator, non-deterministic operators).
Argue, with cost arithmetic, why semantic operators break classical optimizer soundness — and what a cost model with an accuracy term would have to look like.
Restate the course thesis in one paragraph and name three open problems worth a PhD: the agent-native TPC, derived-belief isolation, and recall-aware compaction.

Lecture 1 · Tuesday

Fifty Years of Data-Model Rebellions

Michael Stonebraker has written the same paper twice, nineteen years apart, and both times he was right. The 2005 What Goes Around Comes Around surveyed thirty-five years of data-model proposals and found a cycle; the 2024 sequel with Andy Pavlo (SIGMOD Record 53(2)) extends the survey through MapReduce, NoSQL, document stores, and graph databases, and finds the cycle still turning. The pattern is so regular you can set a tenure clock by it: a new workload exposes a real weakness in relational systems; a rebellion declares the relational model dead and ships a new data model with a new query language; the rebellion spends a decade rediscovering transactions, schemas, and cost-based optimization; meanwhile SQL absorbs the one genuinely new thing — usually as a type and an index — and the rebellion survives, if at all, as a niche engine or a column in Postgres. Today we walk the laps. Thursday we ask whether the lap we are standing in is different.

Lap one: hierarchy, navigation, and Codd’s rebellion-that-stuck

The first rebellion ran in the opposite direction — it was the relational model rebelling against the incumbents. IBM’s IMS (1968) organized records as trees; CODASYL (1969) generalized to networks of records linked by pointers. Both forced the programmer to navigate: to answer “which suppliers ship part 47?” you wrote a loop of GET NEXT WITHIN PARENT calls whose correctness depended on the physical pointer layout. Change the layout, rewrite the application. Codd’s 1970 paper made one move that mattered more than the math: data independence — programs name the data they want, the system decides how to get it. System R and Ingres (mid-1970s) proved an optimizer could pick the navigation for you, usually better than you. Every subsequent rebellion can be scored by a single question: did it give up data independence, and if so, what did it buy with the proceeds?
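Data independence is easy to demonstrate on any modern engine. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and rows are illustrative assumptions, not part of any system discussed above. The program names the data it wants, the physical layout then changes underneath it (an index is added), and the query text and its answer stay the same.

```python
import sqlite3

# Toy schema; table names, column names, and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shipment (sid INTEGER, part_no INTEGER);
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO shipment VALUES (1, 47), (2, 12), (3, 47), (1, 12);
""")

# "Which suppliers ship part 47?" stated declaratively: no pointers, no loops.
QUERY = """
    SELECT s.name
    FROM   supplier s JOIN shipment sh ON sh.sid = s.sid
    WHERE  sh.part_no = 47
    ORDER  BY s.name
"""

def run():
    return [row[0] for row in conn.execute(QUERY)]

def plan():
    # The last column of EXPLAIN QUERY PLAN output is the human-readable step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

before_rows, before_plan = run(), plan()

# Change the physical layout: add an access path the program never names.
conn.execute("CREATE INDEX shipment_by_part ON shipment (part_no)")

after_rows, after_plan = run(), plan()

print("rows before:", before_rows)
print("rows after: ", after_rows)
print("plan before:", before_plan)
print("plan after: ", after_plan)
```

The planner is free to pick a different access path once the index exists (it typically does), but the application code is untouched. Under IMS or CODASYL the equivalent layout change would have meant rewriting the navigation loop.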

Laps two through four: objects, angle brackets, and the brute-force decade

The OODB wave (late 1980s: GemStone, ObjectStore, O2; the ODMG standard in 1993) answered the “impedance mismatch” between C++ objects and tables by persisting the objects directly. It gave up declarative queries and a stable schema boundary, and it tied your data to one application’s class hierarchy — your database now broke when you refactored. Market verdict: a rounding error of the RDBMS market by 2000. SQL’s counter-move: SQL:1999 added structured types and Postgres added extensible types; the mismatch got patched at the language layer by ORMs instead.

The XML wave (roughly 1998–2008) is the purest specimen, which is why Thursday’s debate keeps invoking it. XML had everything the vector wave has now: enterprise mandates, a W3C standards stack (XML Schema, XPath, XQuery 1.0 in 2007), venture-funded native engines, and a claim that this time the data really is different — semi-structured, self-describing, document-shaped. The relational response was surgical: SQL/XML (2003) added an XML type and publishing functions; engines added XML indexes; the native XML database market collapsed into the RDBMS feature list. Note what got absorbed and what got discarded: the type survived, the data model as universal organizing principle did not. JSON then re-ran the same loop in fast-forward — SQL:2016 added JSON operators, SQL:2023 a native JSON type — because the second time around, the incumbents knew the choreography.

MapReduce (Google, 2004) rebelled against the engine rather than the model: no schema, no indexes, no optimizer, no declarative language — just scale. DeWitt and Stonebraker’s 2008 “major step backwards” polemic was widely mocked and almost entirely correct. Within five years the Hadoop ecosystem had rebuilt SQL on top of itself (Hive, then Impala, Presto, SparkSQL), and Google itself moved on to systems with schemas and SQL front-ends. The lasting residue was not the programming model but the architecture: shared-nothing scale-out and the separation of storage from compute, both absorbed into mainstream warehouses.

Lap five: NoSQL, and the decade of rediscovering ACID

NoSQL (circa 2007–2012: Dynamo, Bigtable, Cassandra, MongoDB) traded consistency and joins for availability and developer velocity, and it had a genuinely new constraint to point at — planet-scale services where a single-node RDBMS physically could not serve the write volume. But watch the trajectory of the flagship. MongoDB shipped in 2009 with no joins, no transactions, and no schema. It added an aggregation pipeline (a query language) in 2012, $lookup (a left outer join) in 2015, multi-document ACID transactions in 4.0 (2018), and schema validation along the way. Google’s own answer to Bigtable’s limitations was Spanner — globally distributed and serializable and, by 2017, speaking SQL. The rebels did not lose; they converged. Stonebraker and Pavlo’s scoring is blunt: the lasting contributions of the NoSQL era are better horizontal scalability and friendlier JSON ergonomics — both of which the relational incumbents then implemented.

The mechanics of reabsorption

Why does the cycle close the same way every time? Because the relational stack has three extension sockets that are cheaper to use than a migration: the type system (SQL grows a type: XML, JSON, vector), the access-method layer (the engine grows an index: GiST, GIN, R-trees, HNSW), and the optimizer rule set (new operators get costed and join the plan search). A rebellion, to win outright, must rebuild all the unglamorous machinery around its one good idea — recovery, concurrency control, statistics, drivers, backup, a planner — which takes roughly a decade. The incumbent only has to bolt the one good idea into a socket, which takes roughly two release cycles. The asymmetry is structural, not sentimental. The corollary the 2024 paper draws: never bet against SQL absorbing a feature; bet only on whether the feature was real.

Field note · In 2003 I watched a Fortune-500 insurer commit to a “native XML” policy store because “our documents will never fit in tables.” By 2009 the same team was migrating to DB2’s pureXML — the identical documents, now a column. The total code change in the actuarial apps: connection strings and namespaces. The rebellion’s entire value had been one type and one index all along; everything else was an expensive way to lose point-in-time recovery for six years.

Rebellion (peak)	The real pain	What it discarded	How SQL absorbed it	Residue today
Hierarchical / CODASYL (1968–80)	— (incumbent)	data independence	n/a — relational won outright	IMS still runs banks; lesson, not feature
OODB (1988–98)	object–table impedance mismatch	declarative queries, schema boundary	SQL:1999 structured types; ORMs above the line	persistence libraries, not databases
XML (1998–2008)	semi-structured, document-shaped data	simplicity; closed algebra over flat relations	SQL/XML:2003 type + XML indexes	an embarrassing column type
MapReduce (2004–14)	petabyte scale on commodity nodes	schemas, indexes, the optimizer	SQL-on-Hadoop; scale-out absorbed by warehouses	shared-nothing architecture, storage/compute split
NoSQL (2007–15)	planet-scale writes, dev velocity	ACID, joins, SQL	JSON in SQL:2016/2023; Spanner-class distributed SQL	horizontal scalability as table stakes
Graph (2010–)	multi-hop traversal ergonomics	(less — mostly a language claim)	SQL:2023 SQL/PGQ property-graph queries	being absorbed in real time

Every rebellion thought it was killing SQL. Every rebellion ended as a column type and an access method.

Lecture 2 · Thursday

Final Debate: the First Genuine Exception?

Resolved: vectors, semantic operators, and agent memory are the next XML — one type and one index away from being a feature. Two teams, twenty minutes a side, and the rule of this room is that you must argue the strongest version of the position you draw, not the strawman. The stakes are not academic. If absorption wins, the right career move is to go work on Postgres extensions and optimizer rules. If exception wins, somebody in this room should be writing the System R paper of the agentic era. I will give you both steelmen now, in full, and then referee. You have read fourteen weeks of evidence; today you have to weigh it.

The case for absorption: we have seen this exact movie

Team Absorption opens with the table from Tuesday and three exhibits. Exhibit one: pgvector. An embedding is a vector(1536) column; similarity search is an HNSW or IVFFlat index; pgvector 0.5 shipped HNSW in 2023 and within a year every major engine — Oracle, SQL Server, MySQL, DuckDB, the cloud warehouses — had a vector type and an ANN index. That is the XML choreography executed in eighteen months instead of ten years, because the incumbents have done the dance five times. The dedicated vector databases are already pivoting to “AI data platforms,” which is what a rebellion sounds like when it sees the column type coming. Exhibit two: SQL/PGQ. The graph rebellion — a real data-model claim with a real query-language gap — got absorbed into the 2023 standard while the debate about it was still running. Exhibit three: semantic operators are just expensive UDFs. A semantic filter is WHERE llm_judge(review, 'is this about battery life?') — a scalar function with a weird cost. The optimizer framework has handled expensive predicates since Hellerstein’s work in the early 1990s: cost it, reorder around it, cache it. Nothing in the architecture diagram changes:

-- 2026, one engine, one statement: two "dead" rebellions as features
SELECT p.sku, p.meta->>'title' AS title,
       p.embedding <=> :query_vec AS dist        -- <=> is pgvector's cosine-distance operator
FROM   products p
WHERE  p.meta @> '{"status":"active"}'          -- the JSON rebellion
ORDER  BY p.embedding <=> :query_vec             -- the vector rebellion
LIMIT  20;                                       -- HNSW index scan, costed like any other
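Exhibit three leans on the expensive-predicate literature, and its core heuristic fits in a few lines. The following is a minimal sketch, assuming independent predicates: order them by rank = (selectivity − 1) / cost-per-row, ascending, so that cheap, selective predicates run first. The predicate names, selectivities, and per-row costs are illustrative assumptions, and the brute-force check is included only to confirm the ordering on this small example.

```python
from dataclasses import dataclass
from itertools import permutations

@dataclass(frozen=True)
class Predicate:
    name: str
    selectivity: float   # fraction of rows that pass (assumed independent)
    cost: float          # cost per row evaluated, in any consistent unit

def rank(p):
    # Rank metric from the expensive-predicate literature.
    return (p.selectivity - 1.0) / p.cost

def expected_cost_per_row(order):
    """Expected cost per input row when predicates run in the given order."""
    total, surviving = 0.0, 1.0
    for p in order:
        total += surviving * p.cost      # only surviving rows pay for this predicate
        surviving *= p.selectivity
    return total

# Illustrative numbers: one model-backed predicate, two conventional ones.
preds = [
    Predicate("llm_judge(review, ...)", selectivity=0.10, cost=7.5e-5),
    Predicate("product_id = 47", selectivity=0.02, cost=1e-12),
    Predicate("stars <= 2", selectivity=0.30, cost=1e-12),
]

by_rank = sorted(preds, key=rank)
best = list(min(permutations(preds), key=expected_cost_per_row))

print("rank order:  ", [p.name for p in by_rank])
print("brute force: ", [p.name for p in best])
print("model first: ", expected_cost_per_row(preds))
print("model last:  ", expected_cost_per_row(by_rank))
```

On these numbers the rank order puts the model call last, which is Team Absorption's point: the existing framework already knows how to schedule a predicate with a weird cost.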

Absorption’s closing line writes itself: the client changed, the workload changed, and the system answered the way it always answers — a type, an index, an optimizer rule. Agents speaking SQL through MCP are the strongest possible evidence that SQL remains the interface; we taught the new client the old language rather than the reverse.

The case for exception: three things no previous rebellion had

Team Exception concedes every exhibit and then points at three properties that no prior rebellion possessed, because every prior rebellion changed the shape of the data while keeping the semantics of a query: an exact answer, computed at a cost measured in milliseconds, by a deterministic operator.

First: accuracy is now a dimension of the result, not the plan. A B-tree lookup and a hash join return the same tuples; plans differ only in cost. An HNSW scan returns recall@10 ≈ 0.95 at one setting of ef_search and ≈ 0.99 at another, at 3–5× the latency — the answer itself is now a knob. XML never did this. A query interface where the result carries an implicit confidence interval is not a new type; it is a new contract, and we have no standard way to declare, propagate, or bill for recall. SQL has no syntax for “95% of the right rows is fine.”
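The "answer is a knob" claim can be reproduced without any vector library. The sketch below is a pure-Python, IVF-style index (points filed under their nearest centroid), not HNSW; its nprobe parameter plays the role that ef_search plays for an HNSW scan. The dimensions, dataset size, and cell count are arbitrary assumptions chosen to keep the run short.

```python
import random

random.seed(14)
DIM, N, CELLS, K = 8, 2000, 16, 10   # arbitrary toy sizes

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

points = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
centroids = random.sample(points, CELLS)

# Inverted lists: each point id is filed under its nearest centroid.
cells = [[] for _ in range(CELLS)]
for i, p in enumerate(points):
    nearest = min(range(CELLS), key=lambda c: dist2(p, centroids[c]))
    cells[nearest].append(i)

def search(query, nprobe):
    """Approximate top-K: scan only the nprobe cells closest to the query."""
    probe = sorted(range(CELLS), key=lambda c: dist2(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in cells[c]]
    top = sorted(candidates, key=lambda i: dist2(query, points[i]))[:K]
    return top, len(candidates)

query = [random.gauss(0, 1) for _ in range(DIM)]
exact, _ = search(query, CELLS)      # probing every cell is an exact scan

def recall_at_k(nprobe):
    approx, scanned = search(query, nprobe)
    return len(set(approx) & set(exact)) / K, scanned

for nprobe in (1, 4, 16):
    recall, scanned = recall_at_k(nprobe)
    print(f"nprobe={nprobe:2d}  rows scanned={scanned:4d}  recall@{K}={recall:.2f}")
```

The same query text yields different result sets depending on a session-level setting, and the only way to know how wrong a cheap setting is would be to run the expensive one. That is the sense in which recall belongs to the contract rather than to the plan.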

Second: cost-per-operator is denominated in dollars, not milliseconds. Run the arithmetic from Week 9 again. A conventional predicate evaluation costs on the order of 10 ns of CPU — call it 10⁻¹² dollars. A semantic predicate at ~300 input tokens per row against a mid-tier model at $0.25 per million tokens costs ~$7.5×10⁻⁵ per row: seven to eight orders of magnitude more, before you price the latency. A semantic join over a 10⁴ × 10⁴ pair space is 10⁸ model calls ≈ $7,500 for one query — so plan choice is now a procurement decision. When the gap between a good plan and a bad plan is a factor of 10⁸ in money, the optimizer is not “handling an expensive predicate”; it is doing combinatorial cost avoidance under a budget constraint, which is a different optimization problem with different APIs (budgets, previews, partial results).
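The arithmetic above is short enough to keep as a script. This sketch simply recomputes the figures in the paragraph; the pair_selectivity parameter is an added assumption, standing in for any cheap prefilter that prunes the pair space before the model is called.

```python
TOKENS_PER_ROW = 300
DOLLARS_PER_MILLION_TOKENS = 0.25
CONVENTIONAL_DOLLARS_PER_ROW = 1e-12   # the "call it" figure for ~10 ns of CPU

semantic_dollars_per_row = TOKENS_PER_ROW * DOLLARS_PER_MILLION_TOKENS / 1_000_000
ratio = semantic_dollars_per_row / CONVENTIONAL_DOLLARS_PER_ROW

def semantic_join_dollars(left_rows, right_rows, pair_selectivity=1.0):
    """One model call per candidate pair that survives a cheap prefilter.

    pair_selectivity is a hypothetical knob: 1.0 means no prefilter at all.
    """
    return left_rows * right_rows * pair_selectivity * semantic_dollars_per_row

naive = semantic_join_dollars(10**4, 10**4)
pruned = semantic_join_dollars(10**4, 10**4, pair_selectivity=0.01)

print(f"semantic predicate: ${semantic_dollars_per_row:.2e} per row")
print(f"ratio to a conventional predicate: {ratio:.1e}")
print(f"naive 10^4 x 10^4 semantic join: ${naive:,.0f}")
print(f"same join after a 1% prefilter:  ${pruned:,.0f}")
```

The last two lines are the procurement decision in miniature: the plans differ by a factor of one hundred in dollars, and nothing in a millisecond-denominated cost model would surface that difference to whoever pays the bill.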

Third: non-determinism breaks the closed algebra. The relational algebra’s superpower is closure plus equivalence: every operator maps relations to relations, and the optimizer’s rewrite rules (push selections, reorder joins) are sound — they change cost, never answers. A semantic operator maps (relation, prompt, model-version, temperature) to a distribution over relations. Push a cheap correlated predicate below a semantic filter whose prompt sees more than one row at a time (a batched call, a prompt that references aggregate context, a semantic join) and you change what the model sees, hence what it returns: the rewrite changes the answer, not just the cost. For a strictly per-row judge with a pinned model version and zero temperature the pushdown is still sound, which is exactly why the precondition has to be stated rather than assumed. Every rule in the optimizer’s rulebook now needs an accuracy-preservation proof obligation, and most don’t have one. No previous rebellion touched the soundness of the rewrite rules. That is the strongest version of the exception claim: not “new data,” but “the algebra is no longer closed, and closure was the whole trick.”
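A toy illustration of the rewrite-soundness point, using deterministic stand-ins rather than a real model so the effect is reproducible. Both judges, the sample rows, and the function names are hypothetical. The per-row judge looks only at the row in front of it; the batch-context judge mimics a prompt that shows the model the whole input and asks which rows are below average.

```python
def per_row_judge(rows):
    """Verdict depends only on the row itself: keep reviews mentioning 'battery'."""
    return [r for r in rows if "battery" in r["text"]]

def batch_context_judge(rows):
    """Stand-in for a prompt with aggregate context: keep rows rated below
    the mean rating of whatever rows the judge is shown."""
    if not rows:
        return []
    mean = sum(r["stars"] for r in rows) / len(rows)
    return [r for r in rows if r["stars"] < mean]

def cheap_filter(rows):
    """Conventional predicate: product_id = 47."""
    return [r for r in rows if r["product_id"] == 47]

reviews = [
    {"id": 1, "product_id": 47, "stars": 3, "text": "battery fades fast"},
    {"id": 2, "product_id": 47, "stars": 4, "text": "battery is fine"},
    {"id": 3, "product_id": 12, "stars": 1, "text": "arrived broken"},
    {"id": 4, "product_id": 12, "stars": 1, "text": "battery died in a week"},
    {"id": 5, "product_id": 47, "stars": 5, "text": "love it"},
]

def ids(rows):
    return sorted(r["id"] for r in rows)

def semantic_then_cheap(judge):
    return ids(cheap_filter(judge(reviews)))

def cheap_then_semantic(judge):          # the "pushed-down" plan
    return ids(judge(cheap_filter(reviews)))

for judge in (per_row_judge, batch_context_judge):
    print(judge.__name__,
          "| semantic first:", semantic_then_cheap(judge),
          "| cheap first:", cheap_then_semantic(judge))
```

With the per-row judge both plans agree, so selection pushdown remains a pure cost optimization. With the batch-context judge the pushed-down plan changes the population the judge compares against, and the two plans return different rows. Neither plan is obviously the correct one, which is the uncomfortable part.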

Field note · A team I advised last year cached semantic-filter verdicts by (row-hash, prompt-hash) and saw the cache go stale in the worst possible way: a silent model-version bump changed 4% of verdicts, which flipped a downstream aggregate that a procurement agent was acting on. No relational system in fifty years had the problem “the index disagrees with itself across Tuesdays.” They now pin model versions like schema versions and treat a model upgrade as a migration, with a backfill. Remember Week 11: that is derived-belief isolation showing up in production before it has a name in any textbook.
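The failure in the field note comes down to what is in the cache key. A minimal sketch, assuming a cache keyed on hashes of the row and the prompt: the VerdictCache class, the toy_judge function, and the version strings "v1" and "v2" are all hypothetical stand-ins, not the team's actual code. With the model version left out of the key, a version bump is served from cache silently; with it included, the bump shows up as cache misses that can be backfilled.

```python
import hashlib

def digest(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]

class VerdictCache:
    def __init__(self, pin_model_version):
        self.pin_model_version = pin_model_version
        self.store = {}
        self.model_calls = 0

    def key(self, row, prompt, model_version):
        base = (digest(row), digest(prompt))
        # Pinned caches treat the model version as part of the verdict's identity.
        return base + (model_version,) if self.pin_model_version else base

    def verdict(self, row, prompt, model_version, judge):
        k = self.key(row, prompt, model_version)
        if k not in self.store:
            self.model_calls += 1
            self.store[k] = judge(row, prompt, model_version)
        return self.store[k]

def toy_judge(row, prompt, model_version):
    # Hypothetical stand-in for a model call: the "v2" model is stricter.
    threshold = 1 if model_version == "v1" else 2
    return row.lower().count("battery") >= threshold

row = "Battery drains overnight"
prompt = "is this about battery life?"

unpinned = VerdictCache(pin_model_version=False)
pinned = VerdictCache(pin_model_version=True)
for cache in (unpinned, pinned):
    cache.verdict(row, prompt, "v1", toy_judge)          # warm the cache under v1

stale = unpinned.verdict(row, prompt, "v2", toy_judge)   # served from the v1 entry
fresh = pinned.verdict(row, prompt, "v2", toy_judge)     # miss, recomputed under v2

print("unpinned after bump:", stale, "| model calls:", unpinned.model_calls)
print("pinned after bump:  ", fresh, "| model calls:", pinned.model_calls)
```

Pinning does not make the upgrade free. It converts a silent disagreement into a visible recomputation cost, which is what treating a model upgrade as a migration amounts to.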

Refereeing: what would settle it

Here is the falsifiable framing I want you to defend in the post-debate write-up. Absorption is winning wherever the new thing can be made deterministic and exact at the storage layer: vectors-as-data are absorbed, full stop. Exception is strongest at the operator and contract layer: nobody has shipped a credible cost-based optimizer whose cost function is (dollars, latency, expected accuracy) and whose rewrite rules carry accuracy proofs. Watch three tripwires over the next five years. If SQL grows syntax for accuracy contracts (something like WITH RECALL 0.95 CONFIDENCE 0.9) and engines honor it, that is absorption — the standard will have grown a clause, its biggest concession since windows in SQL:2003. If instead agent traffic migrates to a different contract entirely — conversational, budgeted, probabilistic, with the database returning beliefs plus provenance rather than tuples — that is the exception, and the relational era will have ended the way the navigational one did: still running underneath, no longer the interface. And if neither happens, the cycle just turns again and some of you will teach lap eight.
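To make the first tripwire concrete, here is one hedged sketch of what honoring an accuracy contract could look like inside a planner: candidate plans carry (dollars, latency, expected recall), the contract supplies a recall floor and a budget, and the planner returns the cheapest feasible plan or refuses. The Plan fields, the plan names, and every number are hypothetical, and estimating expected_recall reliably is the unsolved part that this sketch simply assumes away.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    name: str
    dollars: float          # estimated model spend
    latency_s: float        # estimated wall-clock latency
    expected_recall: float  # assumed to come from a calibrated estimator

def choose_plan(plans, recall_floor, budget_dollars):
    """Cheapest plan meeting the recall floor within budget, or None."""
    feasible = [p for p in plans
                if p.expected_recall >= recall_floor and p.dollars <= budget_dollars]
    if not feasible:
        return None          # the honest answer: the contract cannot be met
    return min(feasible, key=lambda p: (p.dollars, p.latency_s))

# Illustrative candidates for one semantic-filter query.
plans = [
    Plan("judge_everything", dollars=40.00, latency_s=900.0, expected_recall=0.90),
    Plan("filter_then_judge", dollars=0.80, latency_s=20.0, expected_recall=0.90),
    Plan("embedding_prefilter_then_judge", dollars=0.15, latency_s=5.0, expected_recall=0.78),
]

for floor, budget in [(0.85, 10.0), (0.75, 10.0), (0.95, 10.0), (0.85, 0.50)]:
    chosen = choose_plan(plans, floor, budget)
    print(f"recall >= {floor}, budget ${budget}: {chosen.name if chosen else 'no feasible plan'}")
```

Note the third outcome, which classical optimizers never produce: no plan at all. A planner that can decline a query because the stated recall floor is unreachable within budget is already speaking a different contract from one that always returns tuples.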

The course thesis, restated, and what to work on

Fourteen weeks ago the syllabus made one claim: agents are now the dominant database clients; this changes workloads, not physics. You can now unpack it. The physics held: B-trees, ARIES, MVCC, the buffer pool, and the cost-based optimizer survived contact with a client that issues 400 schema-introspection queries a minute and retries at machine speed. The workloads changed everywhere we measured: read/write ratios, statement diversity, retry amplification, context-window-shaped result limits, embeddings as the universal access path. And one thing turned out to be genuinely scarce — not throughput, not storage, but agreement on meaning: between the schema and the prompt, between the model’s belief and the committed state, between two agents reading the same table and acting on different summaries of it. Databases spent fifty years manufacturing agreement on values; the open frontier is manufacturing agreement on meaning at the same standard of rigor.

So, concretely, three problems worth your next five years. (1) The agent-native TPC. TPC-C measures transactions per minute; nothing measures correct-task-completions per dollar under retry storms, introspection floods, and mixed exact/semantic plans. Whoever writes the benchmark sets the field’s incentives for a decade — that has been true since Gray’s DebitCredit. (2) Derived-belief isolation. When an agent acts on a summary, an embedding, or a cached verdict derived from state that has since changed, what is the isolation level of that read? We have names for dirty reads; we have nothing for stale beliefs. Define the anomaly taxonomy, then the protocols. (3) Recall-aware compaction. Agent memory grows without bound; compaction (summarization, forgetting) is lossy by design. We need compaction policies with stated recall guarantees — “after compaction, queries about entities mentioned ≥ k times lose at most ε recall” — and storage engines that enforce them the way ARIES enforces durability. All three are systems problems with clean evaluation stories. None requires inventing a new model architecture. That is on purpose: the lesson of fifty years is that the durable contributions are contracts and machinery, not models.

Fig. 14.1 — The reabsorption cycle, per Stonebraker & Pavlo (2024). Five rebellions have completed the lap; the dashed exit has never been taken. Thursday’s debate is about whether semantic operators and agent memory leave through it — or just lap faster.

Readings

Read Before Thursday

What Goes Around Comes Around… And Around… — M. Stonebraker & A. Pavlo, SIGMOD Record 53(2), 2024. The lecture’s backbone. Focus on the scoring of NoSQL and graph, and on which “lessons” the authors say never change — you will cite both sides of it in the debate.

The Seattle Report on Database Research — D. Abadi et al., CACM 65(8), 2022. The field’s last pre-agentic self-portrait. Focus on what the community ranked urgent in 2022 versus what this course argued matters now — the gaps are your debate ammunition.

What Goes Around Comes Around — M. Stonebraker & J. Hellerstein, in Readings in Database Systems, 4th ed., 2005. The original. Skim for the XML chapter only, then compare its 2005 predictions against the 2024 scorecard: a rare controlled experiment in technological forecasting.

Exercises

This Week’s Problems

Exercise 14.1 · warm-up

Run MongoDB through the reabsorption template from Lecture 1. For each milestone between 2009 and 2018, list the relational feature it re-added (query language, join, schema validation, multi-document ACID transactions) with the year and release that shipped it, and the corresponding feature SQL absorbed in the opposite direction (JSON operators in SQL:2016, the JSON type in SQL:2023). One page. Conclude with a single sentence: did the document rebellion win, lose, or converge — and by Tuesday’s definitions, is there a difference?

Exercise 14.2 · core

A table reviews has N = 1,000,000 rows. Query: semantic filter llm_judge(text, 'mentions battery degradation') AND conventional predicate product_id = 47 (selectivity 2%). Assume 300 input tokens per row, $0.25 per million input tokens, and a judge with precision 0.93 / recall 0.90 measured on a 1k-row labeled sample.

(a) Compute the dollar cost of both predicate orderings. (b) Now suppose the judge’s recall on product 47’s reviews specifically is 0.78 (its battery complaints are phrased unusually). Show that the cheap ordering and the expensive ordering return the same rows here, but construct a variant query — a semantic join or a prompt that references aggregate context — where reordering changes the result set, not just the cost. (c) Propose an optimizer rule format that carries both a cost term and an accuracy-preservation precondition, and state the precondition for selection-pushdown across a semantic filter precisely.

Exercise 14.3 · stretch

Design the agent-native TPC. Specify: (i) the workload mix — schema-introspection storms, retry amplification with idempotency-key reuse, mixed exact/semantic plans, and at least one multi-agent scenario where two agents act on derived beliefs of the same base table; (ii) the primary metric — defend a choice such as correct-task-completions per dollar at a stated recall floor, and define “correct” operationally, including who labels ground truth and at what cost; (iii) the isolation clause — a checkable criterion for derived-belief staleness violations, analogous to TPC-C’s consistency conditions. Then attack your own design: give two ways a vendor could game the benchmark (history says they will — see clustered TPC-C controversies) and amend the spec to close one of them. 4–6 pages. This is an open research problem; a strong answer here is a workshop paper, and I mean that literally — the best submission from the last cohort is now under review.

❦