Are the Levels of Data Modeling Outdated?
Levels of Data Modeling - Criticisms and Considerations Section
I’m posting some common criticisms and considerations of the Levels of Data Modeling that I often hear. You’ll see these pop up throughout the next week or two as I write some of the book's other sections. These criticisms and considerations will appear in the final section of the Levels chapter.
If you have any other criticisms or considerations of data modeling you’d like me to address, post in the comments or message me. Assuming it’s a good fit for the book or an article, I’m happy to address them.
Thanks,
Joe
The different levels of data modeling were designed as separate concerns. When they were created in the 1960s and '70s, the world was a much different place. Things moved more slowly, development followed waterfall-style delivery, and technology was rudimentary and table-centric. The world has changed considerably since then. Agile has all but replaced waterfall, and development is done iteratively. Innovation and building occur at warp speed, and will likely accelerate further with AI and agentic workflows. There seem to be as many data delivery and storage systems as stars in our galaxy. In today's fast-moving world of agile teams, streaming data, LLMs, and polyglot persistence, it's fair to ask: do we still need all three levels?
Some argue the levels are irrelevant because it's easy enough to generate SQL, events, JSON, ML models, and so on. With powerful LLMs and agentic tools, many ask: why bother with upfront modeling at all? Why not just describe what you want in a prompt and let the AI generate the schema or data pipeline directly? In this view, interviewing stakeholders and building "shared understanding" to create a conceptual model is nice and quaint, but unrealistic in a business environment where shipping is highly valued.
Many developers I meet don’t think in terms of “logical” models. The line between logical and physical is blurred with NoSQL, event streams, and lakehouses. You might whiteboard a JSON payload, but it’s just as easy to write a struct and get things going in your streaming pipeline. Need to change your payload schema? No big deal, since schema evolution is baked into your architecture.
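To make this concrete, here is a minimal sketch of the idea that writing a struct *is* logical modeling, whether or not anyone calls it that. The event names and fields below are hypothetical, and the "schema evolution" shown is the common backward-compatible pattern of adding an optional field with a default:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical event payload for a streaming pipeline. Defining this struct
# is implicit logical modeling: it names the entity, its attributes, and
# their types, even though no one drew a diagram.
@dataclass
class OrderPlaced:
    order_id: str
    customer_id: str
    total_cents: int

# Schema evolution in practice: v2 adds an optional field with a default,
# so code written against v1 payloads keeps working.
@dataclass
class OrderPlacedV2:
    order_id: str
    customer_id: str
    total_cents: int
    coupon_code: Optional[str] = None  # new, backward-compatible attribute

# A v1-shaped payload still constructs cleanly under the v2 schema.
event = OrderPlacedV2(order_id="o-1", customer_id="c-9", total_cents=4200)
```

The point isn't the syntax; it's that the decisions encoded here (what an order is, which attributes it carries, which are required) are exactly the decisions a logical model captures.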
Despite the shift toward speed and flexibility, the levels of data modeling haven't disappeared. Viewed in this light, they're almost impossible to escape; they're just practiced with varying degrees of intention. Conceptual thinking still happens, even if it lives in someone's head rather than a shared diagram. When the conceptual level is "skipped," it wasn't done deliberately, in collaboration with stakeholders, with a diagram or documentation as an artifact, but conceptual modeling still happened. Logical structure still matters, even when it's implicit in code. Physical modeling still deals with performance, storage, and platform-specific optimization. People practice the levels, just not necessarily in order or to the degree some might preach.
The levels of data modeling aren't outdated, but our approach to using them needs updating. Rather than treating the levels as a mandatory sequence, data modelers should treat them as tools in a toolkit. You might start with a physical model (e.g., reverse-engineering existing tables) and layer on conceptual understanding later. In greenfield projects, you might sketch a quick conceptual and logical model to align stakeholders and reduce rework. For AI and ML use cases, you might model features and signals at the conceptual and logical levels, not just entities and relationships. Again, the world has evolved past the dogmatic, slow-moving, table-centric practices of the past.
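As one way to picture the toolkit idea, here is a minimal sketch of the same concept expressed at each level. All names (`Customer`, `Order`, the columns) are hypothetical, and the dict-based notation is just a lightweight stand-in for whatever diagramming or documentation format you actually use:

```python
# Conceptual level: entities and relationships only -- no types, no platform.
conceptual = {
    "entities": ["Customer", "Order"],
    "relationships": [("Customer", "places", "Order")],
}

# Logical level: attributes, datatypes, and keys -- still platform-agnostic.
logical = {
    "Customer": {
        "attributes": {"customer_id": "identifier", "email": "string"},
        "primary_key": "customer_id",
    }
}

# Physical level: platform-specific DDL, where storage and performance
# decisions (identity generation, constraints, indexing) finally appear.
physical_ddl = """
CREATE TABLE customer (
    customer_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email       TEXT NOT NULL UNIQUE
);
"""
```

Nothing forces you to produce these top-down. Reverse-engineering works the other way: given `physical_ddl` from an existing database, you can recover the logical attributes and, from those, the conceptual entities.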
The key is applying the levels intentionally, fitted to your goals, constraints, and use case.