45 Comments
Abhinav Goyal's avatar

I wouldn't call it geography-based, I would call it culture-based. There is the fast food, "Bias for Thoughtless Action" culture that looks only at what's good in the 6 month horizon. Who cares about 2 years? We'll be in a different place by then! Then there are people looking to build long-term value. These two orgs might be right across the street from each other.

Of course, this is true unless regulatory/compliance standards step in. If they will need to pay for today's mistakes in 7 years, then they are forced to think/plan ahead.

Joe Reis's avatar

I thought a lot about the culture part. How would you describe various data modeling cultures? Very interesting idea you're bringing up.

Abhinav Goyal's avatar

Thanks, Joe. You're very kind. I can see why you have such a popular podcast. :)

As with all cultures, there are the religious fanatics who are happy to fight over ideologies (Inmon vs. Kimball, for example), and then there are the practical ones who look at what's needed for the applications that the data needs to support.

What do you think? You'd have much deeper insights and data on this than I do. :)

Shagility's avatar

I have deffo seen the same thing as you.

Anecdotally Europe seems to have a much larger use of Data Vault modeling compared to the USA.

Which always intrigued me, given Data Vault was invented in the USA.

I asked Hans Hultgren that very question on my podcast episode with him:

https://agiledata.io/podcast/agiledata-podcast/the-patterns-of-data-vault-with-hans-hultgren/

I asked:

"If I look globally, I would say that the US is star schema or Kimball centric; the majority of the work done in the US is around dimensional. If I look at Europe, it’s heavily ensemble or Data Vault centric. Is that true?"

Hans replied:

"Yeah. I gotta say surprised that is still true and I think that you’re right to observe that"

"Europe and Nordics pretty much lead that charge as far as adopting these techniques. Definitely, Netherlands, Sweden, probably top those charts and then all surrounding areas in Western Europe and Nordics seem to be doing quite a bit of it."

Joe Reis's avatar

Matches almost perfectly with what I'm seeing. I joke that Data Vault is the David Hasselhoff of the data world - made in America, popular in Europe.

timo dechau 🕹🛠's avatar

Don't ruin my Hasselhoff childhood.

Shagility's avatar

Nah Data Vault is George Clooney, damn sexy.

Carlos fernando's avatar

In my experience, I think LatAm will be more aligned with the US, but it depends on how universities or market trends talk about these topics. I have friends at universities and in industry who haven't heard of ensemble data modeling and have heard only a little about Data Vault, but more about dimensional and normalized approaches.

My perspective is that they know about the classic Kimball and Inmon data modeling approaches but not about OBT or unified star schema, for example.

Johnny Winter's avatar

I wonder if the UK aligns more with US? I haven't ever come across much DV usage in blighty. I've only really had exposure to Kimball, but we have a few European customers who seem interested in pursuing DV

Jim Ryan's avatar

The general principles of data modeling should be the same regardless of geography and company. Procedures can differ, but the theory behind them shouldn't.

Joe Reis's avatar

Agreed...and I wish this happened

Johnny Winter's avatar

I think the realisation that there's a need to data model can be a maturity thing. Also, for young companies, does the shaping of their products and services make modelling too early a futile exercise as they're effectively building on sand. I'm hypothesising though. My career in data has always been as a hire into an existing data team, so the recognition that that skillset is required has already happened.

Joe Reis's avatar

From what I've seen, young and immature companies are focused on delivering features and product. One could argue, especially if the company is tech focused, that data modeling is super critical at this early stage. It's also like arguing that a 5 year old should start saving for retirement. The tech and data debt is far lower if modeling is done earlier, but that also takes precious time away from building the biz. So, hard to say. Tradeoffs all the way down.

Annie Azarian's avatar

Does it really take precious time away? How is "precious" quantified? On the basis of the short or long term? It doesn’t have to take too much time if skilled people are hired.

Ben Labenski's avatar

I think that the common denominator is culture, which could be correlated w/ both company size and geography.

Data teams - to me, are naturally behind. Let's take a startup as an example. At what point do they make their first data hire? Likely, that individual comes aboard well after data "exists", and right at the point where the business decides that it's worthwhile to start getting value out of the data (e.g. analytics).

... hopefully that first hire is a Data Engineer, but that's an entirely different can of worms...

This Data Team Of One will have their plate full right from the get-go, and likely have pressure to deliver insights yesterday. It seems like Data Modelling is commonly the first casualty of prioritizing velocity over accuracy / scalability. And it works! You can deliver insights at velocity, but what does "query driven data modelling" look like at scale?

I'll add though, that, one can make legitimate arguments for prioritizing velocity at a startup: let's say Data Team Of One estimates that it'll take 6-9 months to build a robust data model. Does the company even have that runway in their budget / funding?

So to come full circle:

- small company = smaller + more-behind data team = more likely to cut corners (ie data modelling) to deliver insights based on demands / velocity

- larger company = larger + less-behind data team (if they've caught up) = more likely to have realized at some point that query driven data modelling isn't sustainable, and - at some point, prioritized building a scalable data model

The geography aspect is interesting... the only thing that I could infer here would be that perhaps there are more startups in NA? So perhaps more of these smaller companies?

Joe Reis's avatar

Very good insights Ben. I'll need to run the numbers of startups vs other countries, but my hunch is you're spot on. Anecdotally, even large companies in the US are struggling with data modeling in ways that European companies would laugh at. This might also be due to mature US companies being held to account on a quarterly basis, whereas Euro companies tend to move slower and have very tight labor laws. In the US, if you're not moving fast, you get fired. In the EU, good luck firing anyone. Thanks!

Noorali Raeeji Yaneh Sari's avatar

Data modeling varies geographically and by company type due to regulatory, cultural, and operational differences. Geographically, compliance with data protection laws like GDPR and regional business practices shapes models. Culturally, variations in language, addressing, and date representations impact design. Operationally, diverse workflows influence how data is structured. The scale and complexity of a company, whether global or local, affect the granularity of data models. Technology infrastructure availability and strategic business goals further shape data modeling decisions. In essence, data modeling adapts to legal, cultural, operational, and strategic contexts unique to each geographic location and company type.

Joe Reis's avatar

Agreed. The regulation part is interesting. Where I live (the US), we are implicitly subject to GDPR and the EU AI Act because ignoring it means potential penalties. Compliance and regulation cut across borders very easily, especially for bigger companies.

Great thoughts.

Kalle Bylin's avatar

Working with companies in the Nordics where I live, I usually feel that there is more time to think (sometimes even too much 😄) and invest time into proper data modeling.

I also wonder, how does data modeling differ "within" companies around the world? For example, between platform/product/application teams and data teams. How is it similar?

I have worked with a couple of teams in Europe and South America that intentionally or not created a robust conceptual data model during initial product discovery/delivery without a data person which then could be extended for other purposes, features and applications.

Not saying this is how it should be, but compared to other cases I spent much less time trying to figure out how data from different source systems should be integrated and how the data lake / warehouse should be designed.

In general, I tend to find information about data modeling either for transactional or analytical use cases, but not so much on how data modeling is done well across these systems.

Robert Sahlin's avatar

Would you say that the data layer in the source systems influence the data modeling in the analytical system? Ex, do you see a difference in techniques depending on internal systems built on RDBMS or NoSQL, monolith or micro services, a company growing through acquisition or organically (ie heterogeneous vs homogeneous sources)? I usually find the analytical system heavily influenced by the operational source systems, but my hypothesis is very biased by the few examples I have practical experience of

Joe Reis's avatar

I've been thinking about this, sort of a Conway's Law for data modeling. Do the architecture and systems influence data models? I'm starting to think it inevitably does. And those systems are reflections of how companies communicate, per Conway's Law.

Juha Korpela's avatar

This is an interesting point. One very specific example: I bet every European data person has had to deal with SAP. I'm not sure if it's that prevalent in the States? SAP has its own, extremely complicated internal data model, where all the tables are named after 5- or 6-letter abbreviations of German words and there are literally thousands of them... Dealing with that basically forces the data engineer (or analyst) to remodel the data in some way at least, because there's no way you can utilize that directly. Even SAP's own "data access layers" are insanely complex.

Joe Reis's avatar

SAP's popular in the US too. If you're big enough, you're using SAP, Oracle, or MS for your ERP.

timo dechau 🕹🛠's avatar

I definitely see this a lot - the sources are dominating the early versions of the model and it often then grows organically from there.

Robert Sahlin's avatar

☝️ regarding type of company

Annie Azarian's avatar

The enterprise conceptual data model should follow the Target Operating Model (TOM) and use the same terms and concepts identified in the TOM. That way the data model can be measured and validated against the TOM. Unfortunately, not many organisations develop a TOM. Which is crazy!!! That’s the role of Business Architects. Again, not many organisations have a business architect.

bryangoodrich's avatar

I work at a local municipality, so “government” in California.

Our small, and not that mature, data team does emphasize data modeling. Mostly because me (lead engineer) and my manager (previously lead DBA on infrastructure side) emphasize it.

Not sure where my experience fits, but figured I’d share 🤷‍♂️

Joe Reis's avatar

Interesting. If you're young and immature as an organization, you could probably punt on data modeling. Why are you emphasizing it?

bryangoodrich's avatar

The bureaucracy of government roles. We fear being audited, by leadership or external reviews. And we struggle to retain talent so having a standard approach enables easier maintenance and onboarding.

For instance, all my builds could be virtually automated like dbt style if we were a Python shop (had it done in Hadoop until we were audited). All my database objects have standards for backend objects, views, procedures and delivery to presentation layer. Now using SSDT database projects, SSIS, and CICD through Azure (Microsoft shop).

Some legacy databases were “built fast” by people no longer on our team, tons of data with inconsistent pipeline logic, data locations, cross database dependencies that aren’t always clear. And deployment usually meant the developer running stuff manually in prod, now very inconsistent with what’s in test 😂

Now yes, this involves more than just data modeling, and I think we could instill development practices without concern for the modeling efforts, but since they grew out of my standard Kimball architecture, it does seem that’s probably the real reason for the emphasis! 😖

Pipeline to Insights's avatar

How about Australia? What are your thoughts about AUS? :)

Martin Chesbrough's avatar

Australia has a history of “punching above its weight” in data modeling (I hope the phrase works, means performing better than expected). Way back in the annals of history Aussies like Clive Finkelstein and Graeme Simsion pioneered the idea of “model-driven development”. I would say in the 80s and 90s pretty well every major corporate in Australia was putting effort into data modeling.

I’m not the data vault expert but I see a fair bit of it around with corporates, perhaps 50% or more of corporates have done data vault at some point. Digital businesses and the mid market sector are more aligned to dimensional modeling as far as I have seen. Startups generally don’t have the resources, expertise or time to do any data modeling, other than building ML models off flat tables or document stores (it’s a form of implied data modeling).

There are a few really good data modellers in Australia - you know John Giles of course. I don’t know Shane Gibson but judging by his output I’d say he knows a thing or two.

Joe Reis's avatar

I've seen similar. I think Australia is a strange Galapagos Island for data modeling. More innovative than the US, I think.

Juha Korpela's avatar

And let's not forget New Zealand (where Shane lives I believe) - very good modelers in Kiwiland as well!

Shagility's avatar

Yup New Zealand and Australia are a hot bed of Data Vault models compared to some other countries.

Martin Chesbrough's avatar

Yeah did not mean to imply Shane was anything but a proud kiwi

Adam Andrus's avatar

I have worked for a handful of companies that are a patchwork of acquisitions and mergers. You tend to face very different struggles when it comes to data modeling, since you have to try to reconcile all of the similar but different applications and data platforms piled on top of each other. There are also a lot of politics involved in these situations, based on which of the original companies "won" the merger... I have seen the scenario a few times where a better data model lost to something inferior or to the anarchy of a "why bother with modeling" approach. I have not done much research to verify if this is valid, but I suspect the EU is more interventionist when it comes to mergers than the US, which has been a free-for-all for the past decade or so.

Curt Lansing's avatar

Just to add to your observations, in my many years of database development across many US companies, I have seen very little to no emphasis on data modeling. Personally, I've worked for both large and small companies and haven't noticed a difference in regard to data modeling emphasis.

Chris Papenfuss's avatar

We definitely see a lot of customers do proper data modeling across Europe. Data Vault especially is very popular at the moment.

timo dechau 🕹🛠's avatar

Some people mentioned regulation here. At least from my experience, I can't see an impact on data models due to GDPR in the EU. Yes, we make sure that on entry specific data gets filtered out, anonymised or tokenized, but this does not change how the whole model works.

German Goni's avatar

My impression is that this is highly correlated with attention to regulatory standards

Joe Reis's avatar

I see your point. I'd argue that regulations cut across borders (GDPR affects US companies), and data modeling is unarguably weaker in the US, compared with Europe.

What are you seeing where you live?

German Goni's avatar

In the absence of a local regulation, there are a lot of discussions about which standard to adhere to.

Annie Nelson's avatar

Well, apparently I need to do a better job at my pre-reads, because reading this article makes me realize I don't have a good personal definition for data modeling. Where is the line between having and using a database, and 'doing data modeling'?

Annie Nelson's avatar

... And it's the title of just a few articles forward 👍

Marton Horvath's avatar

In my xp it really comes down to the company context and skills in place. If you have a complex system landscape (like you mentioned SAP) or you are working with complex products / services (eg banking) you must model to break down and externalize complexity. In simpler environments you may keep everything in mind / code... etc. for a while. On the other hand, I also think that everyone does modelling (likely w/o erd diagrams) even in a simple excel file, but oftentimes they don't have the skillset / awareness about the topic to be effective (externalize by drawing), causing cognitive overload and stress in the long term without understanding the root cause.
