If you model data, how do you approach it?

Jan 10, 2024

Trying to get a read on what you are all doing with data modeling. Two questions, and feel free to answer one or the other, or both.

If you work somewhere, do you model data?

If so, how do you approach data modeling?

Thanks,

Joe

101 Comments

John Gilmore

Jan 11, 2024

Before I even think about a data solution, I must first understand the data from a business perspective.

This means building a single conceptual data model; that is, a business-focused model of the data.

From here, I can start thinking about data solution design. In effect, this means building logical data models, which typically differ from each other depending on their purpose.

Think about it: a logical model for a dimensional solution will be very different to that for an OLTP solution. Yet they will both reference the same conceptual model.

Again, when we consider implementation, we might have multiple physical models for a particular logical model. These typically differ depending on the chosen platform. Think Oracle versus SQL Server for example.

So I end up with a hierarchy of models, each with a distinct purpose. There will only be one conceptual model, but it will have one or more logical models, each of which will have one or more physical models.

My feeling is that too many people jump straight in at the logical level without even thinking about the conceptual. What they’re doing is design, not modelling.

Expand full comment

Reply (2)

Joe Reis

Jan 12, 2024

"My feeling is that too many people jump straight in at the logical level without even thinking about the conceptual. What they’re doing is design, not modelling."

Truth bomb.

I also see people jump straight to physical modeling, ignoring the conceptual and logical steps. You can imagine what happens from there.

Expand full comment

Robert Sanderson

Jan 11, 2024

> What they're doing is design, not modelling.

I need this on a t-shirt! So true, and so well put.

Expand full comment

Phil Maddocks

Jan 11, 2024

I do model data.

We typically look towards the data vault methodology to combine data from multiple sources around business domains. We'll speak to the business to understand how they see the data which helps us understand what our hubs may be and how they link to one another (John Giles book Elephant in the Fridge is a fantastic read on this approach).

From there we'll create a high level model connecting these hubs to each other using links. Then we'll look at the source data and look to bottom fill this top down designed model. On top of that warehouse, we'll build reports, fact/dims, and OBTs if required.

Expand full comment

Reply (4)

Joe Reis

Jan 12, 2024

John's books are fantastic! Was actually just chatting with him the other day. Wonderful bloke.

Expand full comment

Juha Korpela

Jan 11, 2024

Wholeheartedly agreed with John Giles' book! Probably the most practical book on Data Vault, and it's a fun read!

Expand full comment

Mohamed Souare

Jan 11, 2024

We do follow exactly the same pattern. I would say that data is first modelled in a first layer using 3NF before getting decomposed using the value method.

Expand full comment

Daniel Rothamel

Jan 11, 2024

Yes! Elephant in the Fridge is fantastic!

Expand full comment

Donald Parish

Jan 11, 2024Edited

Star Schema aka dimensional modeling using the Microsoft Power BI stack. Didn't actually study Kimball until about 10 years after starting. Wish I had, since I was missing Slowly Changing Dimensions (SCD) and Accumulating Snapshot Fact tables. I certainly drink the Kimball Kool-Aid. Found this interview last night: https://archive.org/details/sim_dbms_1994-07_7_8/sim_dbms_1994-07_7_8

Expand full comment

Practical Data Modeling

If you model data, how do you approach it?