Data Modeling is Dead (Again), 2026 Edition. Part 1
It's AI's Fault!
It’s that time of the hype cycle again, where databases, data modeling, SQL, and other mainstays are written off for dead. I’ve seen this happen several times in my career, and we’re here once again. Since this Substack focuses on data modeling, that’s where I’ll focus, though the same arguments apply to whatever else this hype cycle’s grim reaper will supposedly kill off.
The death of data modeling isn’t a new phenomenon. We’ve heard “data modeling is dead” many times. In recent memory, we have the NoSQL era, the “Big Data” hype, and the early ML days. Today it’s AI, LLM, agents, etc. The argument is always that new technologies abstract away the details and “grunt work,” making data “too easy” to work with. Who cares about the database or the underlying model? They’re both obsolete, right?
Because there’s a lot of ground to cover, I’m breaking this into a series of articles. In the first part of this series, let’s look at some of the common arguments I’ve seen about why data modeling is dead. In the next part, I’ll give some counterarguments. Then we’ll see how this is received and if a third part is warranted.
Common Arguments for Why Data Modeling is Dead (2026 AI Edition)
The “Context Window is the New Schema” Argument
With context windows constantly expanding, the need to curate, model, and retrieve specific rows is vanishing (incidentally, the same argument suggests that RAG is also dead). You don’t need to model data into a Star Schema or normalize it to save space or optimize retrieval. You can simply dump massive, messy JSON blobs or raw documents into the context window and let the model figure out the relationships “in-memory” for that specific query. This is effectively “Schema-on-Read 2.0,” where the “schema” is generated ad hoc by the LLM’s attention mechanism, rendering static database modeling redundant.
The “One Big Table” (OBT) Argument
I recently joked that OBT is a psyop from Big Consulting to get more data cleanup work. There are plenty of arguments for and against OBT as a data model, which I’ll cover in a separate article. Here, we’ll focus on why OBT is the default. First, it’s easy. Literally dump your data into ONE BIG TABLE (I also call this WTF - Wide, Tall, and Full). Second, in the context (pun intended) of AI, LLMs are historically bad at writing complex SQL with multiple joins (prone to hallucinations and “fan traps”).
To make data “AI-ready,” we should abandon normalization, dimensional modeling, data vault, etc., in favor of One Big Table (OBT). A single, massive, denormalized table is easier for an LLM to query because it requires only SELECT ... WHERE ... logic rather than an understanding of foreign key relationships, because “joins are bad.” This pushes the industry back toward flat files and massive redundancy, framing normalization as an “anti-pattern” for LLMs. Never mind that Codd created normalization precisely to reduce or eliminate redundancy, but that’s old school and doesn’t matter anymore because we have amazing query engines and AI.
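The join-avoidance claim is easy to see concretely. Here’s a minimal sketch (table names and data are invented for illustration) contrasting a normalized schema with its OBT equivalent, using Python’s built-in sqlite3:

```python
import sqlite3

# Hypothetical example: a normalized customers/orders pair versus
# one denormalized "One Big Table" holding the same facts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(customer_id),
                         amount REAL);
    INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

    -- OBT version: region is copied onto every order row (redundancy).
    CREATE TABLE orders_obt (order_id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders_obt VALUES (10, 'EMEA', 100.0), (11, 'EMEA', 50.0),
                                  (12, 'APAC', 75.0);
""")

# Normalized: the query writer (human or LLM) must know the foreign key.
normalized = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders o JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.region ORDER BY c.region
""").fetchall()

# OBT: a flat SELECT ... GROUP BY, no join knowledge required.
flat = conn.execute("""
    SELECT region, SUM(amount) FROM orders_obt
    GROUP BY region ORDER BY region
""").fetchall()

print(normalized)
print(flat)
```

Both queries return the same totals, but note the cost Codd warned about: in the OBT version, the region is copied onto every order row, so a single customer changing regions means updating many rows (or silently serving stale answers).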
The “Just-in-Time” (JIT) Modeling Argument
Agentic AI can now inspect raw data sources and infer structure on the fly. So, we no longer need humans to pre-define the data model or semantic layer. An agent can look at a raw database, infer the relationships, generate a working semantic model, execute the query, and then discard the model. Here, data modeling is a runtime task for agents, not a design-time task for humans. It argues that maintaining a static semantic layer is technical debt. Everything is fluid, including the data model itself.
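To make the “runtime modeling” idea concrete, here is a minimal sketch (hypothetical tables, Python’s built-in sqlite3) of the agent-style step that introspects a database and builds a throwaway model of tables, columns, and foreign keys before writing a query:

```python
import sqlite3

# Hypothetical database the "agent" has never seen before.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(customer_id));
""")

def infer_model(conn):
    """Build a throwaway model: tables, columns, and FK relationships."""
    model = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    for t in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, default, pk)
        cols = [r[1] for r in conn.execute(f"PRAGMA table_info({t})")]
        # PRAGMA foreign_key_list rows: (id, seq, ref_table, from, to, ...)
        fks = [(r[3], r[2], r[4]) for r in conn.execute(
            f"PRAGMA foreign_key_list({t})")]
        model[t] = {"columns": cols, "foreign_keys": fks}
    return model

model = infer_model(conn)
print(model["orders"]["foreign_keys"])
```

Nothing here persists: the “semantic layer” lives only for the duration of one query, which is exactly what this argument claims makes a human-maintained one redundant.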
The “Synthetic Data” Loop
We’ve run out of publicly available data to train on, and future AI models will be mainly trained on synthetic data generated by other models. If the data producer and the data consumer are both AI, the intermediate structure (the data model) designed for human understanding is inefficient. AI systems might evolve their own specialized binary formats or high-dimensional vector representations to talk to each other, bypassing human-readable tables entirely. Here, “tables” are just a UI for humans, and as humans step out of the loop, tables (and their models) become unnecessary.
The “Tabular Data is Dead” Argument
Since 80-90% of enterprise data is unstructured (docs, PDFs, Slack messages, emails), and that is where the “real” untapped business value lies, tabular data is “dead.” Traditional data modeling is obsessed with the 10% of structured data (transactions and analytics). Since AI unlocks the other 90%, data modeling becomes a niche skill set for cavemen, and the bulk of the attention shifts entirely to vector-based retrieval, which requires chunking, not modeling. Incidentally, I remember this argument from the deep learning hype cycle of the last decade.
The “No Need to Learn Anything…Because AI” Argument
The argument goes that higher levels of abstraction make learning the lower levels a waste of time. This argument is as old as the tech industry itself, and it leans on the fact that computing history is a history of abstractions. We no longer hand-code in Assembly because compilers and higher-level languages (C, Python) handle memory management and machine code instructions. The AI-era version goes, “AI is amazing at writing code, so don’t learn it.”
The other day, I saw a LinkedIn post saying that because AI-generated text-to-SQL is so good, learning SQL is a waste of time (the argument was so idiotic that I didn’t want to give the author any attention). I’ve seen similar arguments about learning programming in general, because “The hottest programming language is English.” And obviously, AI will soon understand your entire business, so it will just model the data for you and implement it in SQL. So, data modeling or hand-writing code is like using a calculator to replace manual long division for speed. I’ve seen similar arguments about databases, claiming they’re obsolete because “everything will be a vector, or we’ll just throw MCP on top of all source systems.”
So is data modeling dead, once again?
If you take all these arguments at face value, the conclusion is obvious: stop data modeling immediately and stop learning anything new. Just let AI do everything. Dump all of your data into a context window or a vector store, and let the agents figure it out. This sounds pretty badass, like having a humanoid robot cook my meals. It’s also a trap.
Finally, these are just some of the arguments I remember seeing. There are many opinions about how AI affects the tech and data industries, so you might have seen some that I haven’t included. Drop them in the comments.
In the following article, I’m going to dismantle these arguments one by one and explain why AI actually increases the need for rigorous data modeling, rather than replacing it.


