22 Comments
author

I’m wondering if this deserves a different term than “data model”? The risk of using “data model” is that it overloads an already well-worn term. Thoughts?


Getting more and more excited about the book 🫶 I really like the agnostic nature of the definition between operational and analytical use cases. I think this has been missing.

On "a structured representation that organizes ... data" I also see "a representation of structure" and how it connects to the idea in systems theory and systems thinking that the structure of a system largely determines its behavior (e.g., Conway's law).

For me, this makes the MMA idea especially powerful. There are different archetypes for structuring data and data systems with inherent tradeoffs. Instead of fighting about which modeling approach is best, I find it extremely useful to understand which archetypes exist, their tradeoffs and how they affect the behaviors of the system so that we can choose the one best suited for each particular use case.


My 2 cents: when people talk about data modeling, they immediately think "gold layer". For me, data modeling begins with the organization of directories/topics in the raw layer, and I see too little literature on this subject. I would love to read what you think about that.
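To illustrate what modeling at the raw layer can look like, here is a minimal sketch of a directory naming convention. The layout (`raw/<source>/<entity>/ingest_date=...`) and the function name `raw_path` are assumptions chosen for illustration, not something proposed in the post.

```python
from datetime import date


def raw_path(source: str, entity: str, ingest_date: date) -> str:
    """Build a raw-layer directory path (hypothetical convention).

    The structure itself encodes modeling decisions: which system the
    data came from, which entity it describes, and when it was ingested.
    """
    return f"raw/{source}/{entity}/ingest_date={ingest_date.isoformat()}/"


example = raw_path("crm", "customers", date(2024, 1, 21))
print(example)  # raw/crm/customers/ingest_date=2024-01-21/
```

Even this small convention is a structural choice with tradeoffs: partitioning by ingestion date makes reprocessing a single day cheap, but it does not help queries that filter on business dates.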

Jan 21 · Liked by Joe Reis

I love everything about this definition, Joe...well done!

It encompasses both operational and analytical, accommodates both humans and machines, and allows the flexibility to choose the appropriate technique for each situation. I really think this definition is elegant in both its simplicity and its nuance.

Feb 7 · Liked by Joe Reis

Thank you for the well-thought-out definition; it adds value to the data modeling approach by considering both humans and machines. I'm currently working on a model for a business domain, and here is my layered approach.

Conceptual layer - We apply DDD techniques to understand business processes and identify data products at the conceptual level, per domain.

Logical layer - We break down each conceptual data product further by identifying key entities, data elements with definitions, and relationships between the entities within the data product boundary. Design patterns such as dimensional modeling are applied here.

Physical layer - This is the implementation of the logical layer and focuses on efficient use of storage, compute, security, and extensibility. This layer is key to the experience and should also support agility.

In my view, all three layers serve humans by helping them understand and use data, while the physical layer also supports machine consumption. I would like to get your thoughts on my approach.
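The three layers described above can be sketched in a few lines. This is only an illustration under stated assumptions: the "sales" domain, the `Customer` and `Order` entities, and the SQLite DDL are all hypothetical and are not taken from the commenter's actual model.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual layer: business concepts per domain (hypothetical example).
CONCEPTUAL = {
    "domain": "sales",
    "data_product": "orders",
    "concepts": ["Customer", "Order"],
}


# Logical layer: entities, attributes, and relationships,
# still independent of any storage engine.
@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str


@dataclass(frozen=True)
class Order:
    order_id: int
    customer_id: int  # each Order belongs to exactly one Customer
    amount: float


# Physical layer: engine-specific implementation (SQLite here),
# where keys, constraints, and indexes are decided.
PHYSICAL_DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL
);
CREATE INDEX idx_order_customer ON customer_order(customer_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(PHYSICAL_DDL)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)  # ['customer', 'customer_order']
```

The conceptual and logical definitions are mainly for people to read and discuss, while the DDL is what the database engine actually consumes, which mirrors the human/machine split in the comment.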


I need examples of how a data model can:

1) enable and guide human and machine behavior;

2) inform decision-making;

3) facilitate actions.

I asked ChatGPT 3.5, but the results do not make sense to me.

I moved the question here.


What's your timeline for the book, Joe?


MMA 😂 I love it! I didn’t have machines in my definition, but it makes perfect sense.

Do you see training for data modeling as having a mostly analytics focus even though, as you said, it’s not constrained to analytics?

Application data models are different, but they’re still models. And data flows between models as applications --> warehouses --> ML models --> AI systems, these days.

As a data engineer (speaking for myself), I’m loving learning backend engineering, both to support apps and MLOps. The domain and tactical decisions differ, but the modeling practice seems mostly the same, no?


Hey Joe, thanks for starting this conversation. As someone who comes from a non-data background, has spent only ~5 years in the data space, and has become extremely passionate about simplifying anything that can be simplified (most things can), I love it when folks come together to discuss and define terminology.

Also, incidentally, I just finished chapter 4 of my book which happens to be titled "Deconstructing Data Modeling" and I would love to share my definition here as well:

👉 "An analytical data model represents a set of rules that, once applied to one or more tables in a database, creates a modified view of an existing table or a new table altogether."

I mention "analytical" because I'm also covering "application data modeling" which precedes "analytical data modeling."

I thought it made more sense to separate the purpose from the definition. The purpose is as follows:

👉 The purpose of an analytical data model is to transform data without modifying the source data. When a model is executed, the resulting table is easier for humans to read and analyze, as well as for machines to interpret and process.

Does this make sense to you? Do keep in mind that I'm writing for semi-technical folks, many of whom are totally new to the data space. Hence, I want to keep things as simple as I can.
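As a rough illustration of that definition, the sketch below applies a set of rules to a source table and exposes the result as a view, leaving the source untouched. The table `raw_orders`, the view `customer_revenue`, and the sample rows are all made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Source table (hypothetical sample data).
conn.execute(
    "CREATE TABLE raw_orders ("
    "order_id INTEGER, customer TEXT, amount_cents INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [
        (1, "acme", 1250, "paid"),
        (2, "acme", 800, "cancelled"),
        (3, "globex", 4300, "paid"),
    ],
)

# The "analytical data model": rules (filter, convert units, aggregate)
# expressed as a view over the source table.
conn.execute(
    """
    CREATE VIEW customer_revenue AS
    SELECT customer, SUM(amount_cents) / 100.0 AS revenue
    FROM raw_orders
    WHERE status = 'paid'
    GROUP BY customer
    """
)

modeled = conn.execute(
    "SELECT customer, revenue FROM customer_revenue ORDER BY customer"
).fetchall()
source_rows = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]

print(modeled)      # [('acme', 12.5), ('globex', 43.0)]
print(source_rows)  # 3, the source table is unchanged
```

A view is only one option; the same rules could instead materialize a new table, which matches the "or a new table altogether" part of the definition.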
