Might AI be a massive help to data modeling? I think so. In this section, I go over ways to help data modeling move faster, including the use of AI.
Next up, I’ll cover data modeling patterns; chapters on graphs and time should hopefully drop this week.
Thanks,
Joe
Ellie makes data modeling as easy as sketching on a whiteboard—so even business stakeholders can contribute effortlessly. By skipping redraws, rework, and forgotten context, and by keeping all dependencies in sync, teams report saving up to 78% of modeling time.
Bridge reality with Data! Read more here.
Thanks to Ellie.ai for sponsoring this post.
The software industry is similar to the diet industry. Both are notorious for people and vendors promising quick wins with little effort. As I write this, AI is the latest cure-all, with promises of rapid productivity gains, AI agents collaborating to run your business, AI software and data teams, and more. We’ll see how this plays out. Regardless, the current crop of AI tools are awesome, and I’m continually impressed with how much faster I can move with AI. When used right, AI doesn’t solve every problem, but it can help you move dramatically faster in your data modeling work.
For me, AI is helpful for two primary purposes: creating new content (synthesis) and condensing existing information (summarization). Synthesis involves generating various forms of media, such as text, video, images, or audio, while summarization refers to AI’s ability to process and condense input. I haven’t found AI to be good at understanding broad contexts or domains, though I imagine it will improve as models become larger and more sophisticated. For now, your knowledge of the business’s nuances remains the key complement that AI can’t provide.
Here are some ways I’ve found AI to help in my data modeling (and work in general).
Parsing through text and extracting entities, relationships, and attributes. It’s not perfect, but this saves a lot of time and provides a good starting point.
Recording and summarizing discussions. For video calls, I often record the call and use AI to summarize and highlight key points for follow-up or action items.
Text summarization. I read numerous papers and articles. This is often a mixed bag. Nowadays, I use AI to summarize a paper or article before I read it, allowing me to curate what seems useful versus what might be a waste of time. Also, when I read a paper or article, sometimes I’m so fixated on details that I forget the big picture. Again, AI summarization is extremely helpful here.
Pattern recognition across various documents and images. I have AI inspect folders, documents, photos, and other files. It does an amazing job at comparing and contrasting what’s contained in various types of data formats.
Coding co-pilots and agents. For boilerplate projects and unfamiliar codebases, AI does an amazing job of writing and reviewing code.
Database fields and metadata. AI can introspect a database’s schema, or generate comments and descriptions where they’re missing.
Diagram creation and review. Same as with code. The AI does a passable job of creating and reviewing data model diagrams.
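To make the first item on this list concrete, here is a minimal sketch of how you might prompt an LLM to extract entities, relationships, and attributes from interview notes and then parse its reply. The prompt wording, the JSON shape, and the `mock_reply` stand-in are all my own assumptions; in practice you would swap the stub for a real LLM client call and review the output by hand.

```python
import json

def build_prompt(notes: str) -> str:
    """Assemble an extraction prompt for an LLM (wording is illustrative)."""
    return (
        "You are a data modeling assistant. From the interview notes below, "
        "extract entities, their attributes, and relationships. Respond with "
        'JSON only, shaped like {"entities": {...}, "relationships": [...]}.\n\n'
        "Notes:\n" + notes
    )

def parse_extraction(raw: str) -> dict:
    """Parse the LLM's JSON reply, tolerating markdown code fences."""
    cleaned = raw.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.strip("`")
        # Drop an optional leading language tag like "json".
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else cleaned
    return json.loads(cleaned)

# Stand-in for a real LLM response (a live client call would go here).
mock_reply = """```json
{"entities": {"Customer": ["customer_id", "name"],
              "Order": ["order_id", "order_date"]},
 "relationships": [["Customer", "places", "Order", "1:N"]]}
```"""

model = parse_extraction(mock_reply)
print(sorted(model["entities"]))     # entity names from the conversation
print(model["relationships"][0])     # one relationship with its cardinality
```

The point isn’t the parsing code; it’s that a structured prompt plus a machine-readable reply gives you a draft model you can diagram, critique, and correct, rather than a blank page.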
I’m sure I could go on, but you get the idea. In all of these real-world scenarios, AI shaves hours, days, or even weeks off my projects. Is it perfect? No. But it’s pretty darn good. I’ll take fixable imperfections and an overall boost in productivity and accuracy over slogging it out the old-fashioned way.
I suggest applying AI at all levels of data modeling, especially conceptual data modeling (CDM). A common complaint about CDM is that it’s time-intensive and laborious: rounding up stakeholders can be a major effort, especially if you’re at a large company where stakeholders are pulled in a million different directions.
Record stakeholder interviews and use AI to identify the core components of the data model from the conversation. It does a surprisingly decent job, and I encourage you to try it out yourself. For example, write or record a mock conversation and put that conversation into an LLM. You’ll probably be surprised at how good it is at parsing out entities, relationships, and attributes. Have AI create a conceptual diagram. Then, take it a step further and have it generate the logical and physical models, as well as the SQL code, if you’re modeling for a database. Within minutes, you’ll have a working prototype of a data model for you to review and refine. Again, it’s not perfect, but you’ll most certainly save time that you can focus on thinking through the data model and collaborating with stakeholders to ensure the model represents the business you’re trying to model.
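As a toy illustration of the last step above, here’s what generating SQL from a conceptual model can look like. The model shape, naming conventions, and type defaults are my own assumptions for the sketch; in the workflow described, an LLM would draft something like this and you’d review and refine it.

```python
# Toy conceptual model: entities with attributes, plus parent->child relationships.
conceptual = {
    "entities": {
        "Customer": ["name", "email"],
        "Order": ["order_date", "total_amount"],
    },
    "relationships": [("Customer", "Order")],  # one Customer has many Orders
}

def to_ddl(model: dict) -> str:
    """Generate naive CREATE TABLE statements (surrogate keys, TEXT columns)."""
    fks = {child: parent for parent, child in model["relationships"]}
    statements = []
    for entity, attrs in model["entities"].items():
        cols = [f"{entity.lower()}_id INTEGER PRIMARY KEY"]
        cols += [f"{a} TEXT" for a in attrs]
        if entity in fks:  # child side of a 1:N relationship gets a foreign key
            parent = fks[entity].lower()
            cols.append(f"{parent}_id INTEGER REFERENCES {parent} ({parent}_id)")
        statements.append(
            f"CREATE TABLE {entity.lower()} (\n  " + ",\n  ".join(cols) + "\n);"
        )
    return "\n\n".join(statements)

print(to_ddl(conceptual))
```

A real physical model needs proper types, constraints, and naming standards, which is exactly where your review and stakeholder collaboration come back in.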
Some people in data modeling are opposed to using AI, particularly because of hallucinations. This is where it helps to pay attention and not let AI do all of the work. Like any technology, AI is a tool. But it’s not a substitute for taking your craft seriously.
Regardless of the hesitations and opposition, AI is here to stay, so be pragmatic. Understand AI’s limits: it’s a probabilistic machine (humans are, too). Even with its shortcomings, the benefits far outweigh the downsides. As long as you trust but verify, and know the basics of how LLMs work, treat AI as you would any other tool: understand its utility and limitations, and use it where it fits your situation.
It’s still very early days with this latest incarnation of AI. Currently, AI models are improving every week. Whatever I write here is subject to change, especially in discussions of specific AI capabilities. In the span I’ve been writing this book, the conversation has evolved from LLMs and AI to agentic AI and vibe coding. There’s a lot of interest in using AI for data modeling, and this is fresh territory you can explore and contribute to. Data professionals have a once-in-a-generation opportunity to steer their craft and unlock possibly unbelievable efficiencies and approaches to data modeling.
What AI tools do you use for recordings and diagrams?
Interesting piece on AI’s role in accelerating data modeling workflows.
Curious if you plan to dig deeper into the underlying techniques—particularly around Named Entity Recognition (NER)?
I’d love to hear your thoughts on the use of domain-specific NER models (like those from John Snow Labs, etc.) in extracting and structuring business concepts from stakeholder conversations and documentation. This feels like a powerful unlock for bridging unstructured inputs with conceptual and logical data models—especially in complex enterprise contexts.
Would be great to see a follow-up piece that unpacks this layer a bit more.