Highly recommend Lawrence Corr's Agile Data Warehouse Design book in case anyone hasn't read it already. It's very much a Just Enough Design Upfront (JEDUF) approach that encourages iteration.
It’s gold
Well said Joe. To add to this topic - I think it is a misunderstanding that the whole enterprise / department / subject area needs to be fully modelled before starting implementation.
With data modelling patterns like Data Vault, Anchor, Focal or any other form of Ensemble modelling, you are able to start implementing a small part of the model while the modelling is ongoing. These parts can also be tested and used for reporting and analysis without having to redo all the work when other parts are added. This has to do with the separation between identifiers, relationships and context (Unified Decomposition).
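To make that identifier / relationship / context split concrete, here is a minimal, hypothetical sketch in Python (the entity and column names are invented for illustration and not taken from any specific Data Vault tooling; a real implementation would typically add hash keys):

```python
from dataclasses import dataclass
from datetime import datetime

# Hub: identifiers only - one row per business key, nothing else.
@dataclass(frozen=True)
class HubCustomer:
    customer_bk: str          # business key from the source system
    load_ts: datetime
    record_source: str

# Link: relationships only - ties hubs together, carries no context.
@dataclass(frozen=True)
class LinkCustomerOrder:
    customer_bk: str
    order_bk: str
    load_ts: datetime
    record_source: str

# Satellite: context only - descriptive attributes that change over time.
# New satellites can be added later without touching the hub or link.
@dataclass(frozen=True)
class SatCustomerDetails:
    customer_bk: str
    name: str
    email: str
    load_ts: datetime
    record_source: str
```

Because reports join a hub to whichever satellites exist today, adding another satellite (say, customer addresses) in a later sprint does not force a rework of what has already been loaded.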
When I discovered this way of modelling it really made a big difference in progress, perception and communication.
"...start implementing a small part of the model while the modelling is ongoing. These parts can also be tested and used for reporting analysis without worrying to redo all the work when other parts are added."
This is so key
Agree. Don't have to boil the ocean or the data lake. 80%+ of the time we are not building the entire house, maybe only adding an addition or a covered porch. Very sprint-able. Or in the other analogy, not drawing up the town plan, just building a house within the town plan.
Of course, it is much easier and faster to build an addition onto a well-architected house with a blueprint!! ;)
If we are talking about speed, we need to think reuse. A decent modeling tool will build up a library of domains (Address, Amount, ID, Quantity, Percent, Name, etc.) and a library of specific attributes that can be reused across tables, as well as a repository of standardized names, descriptions, comments, data types, etc. This library is what makes data modeling go faster and be more consistent. Capitalize on the one-to-many relationship. Heck, in transactional data models 25% of the columns are repeated across tables; in analytical models it is probably 50%+, especially when you throw in medallion architecture. So, net-net, doing the activities of data modeling can speed up the creation of models, standardize them, and improve quality and usability.
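As a rough sketch of what that reuse looks like (hypothetical names, not tied to any particular modeling tool), assume a shared domain library that every table definition draws from:

```python
# Hypothetical domain library: define a physical type and rules once,
# then reuse them across every table that needs that kind of column.
DOMAINS = {
    "ID":      {"type": "BIGINT",        "nullable": False},
    "Name":    {"type": "VARCHAR(200)",  "nullable": False},
    "Amount":  {"type": "DECIMAL(18,2)", "nullable": True},
    "Percent": {"type": "DECIMAL(5,2)",  "nullable": True},
}

def column(name: str, domain: str) -> str:
    """Render one column definition from the shared domain library."""
    d = DOMAINS[domain]
    null_sql = "NULL" if d["nullable"] else "NOT NULL"
    return f"{name} {d['type']} {null_sql}"

# Two different tables reuse the same domains, so names and data types
# stay consistent without re-deciding them table by table.
customer_ddl = ",\n  ".join([
    column("customer_id", "ID"),
    column("customer_name", "Name"),
])
order_ddl = ",\n  ".join([
    column("order_id", "ID"),
    column("order_amount", "Amount"),
    column("discount_pct", "Percent"),
])

print(f"CREATE TABLE customer (\n  {customer_ddl}\n);")
print(f"CREATE TABLE orders (\n  {order_ddl}\n);")
```

The one-to-many payoff is exactly this: one domain definition feeding many columns across many models.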
And, per Joe's "AI to the Rescue" post, make use of AI chats/agents to help save time and speed up productivity. My caveat here is getting the requirements and the current business landscape in that domain well understood. And that takes the right questions and a back-and-forth conversation. Get the foundational definitions of most every entity and attribute. For AI to work better: accurate metadata, metadata, metadata. Find the right person, what used to be a business analyst in the "olden" days, to capture the business glossary info - it is the foundational one-to-many of data modeling. Better than the "Jelly of the Month" club membership...that's the gift that keeps on giving the whole year - Cousin Eddie, Christmas Vacation. :)
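A minimal, hypothetical example of the kind of glossary metadata that gives an AI assistant something accurate to work from (the structure and field names are invented for illustration, not from any specific glossary tool):

```python
# Hypothetical business glossary entries: one definition reused by
# every model, report, and AI prompt that references the term.
GLOSSARY = [
    {
        "term": "Customer",
        "definition": "A party that has placed at least one order.",
        "attributes": ["customer_id", "customer_name", "email"],
        "steward": "Sales Operations",
    },
    {
        "term": "Order Amount",
        "definition": "Total charged to the customer, net of discounts, in USD.",
        "attributes": ["order_amount"],
        "steward": "Finance",
    },
]

def glossary_context(glossary: list[dict]) -> str:
    """Format glossary entries as context to prepend to an AI modeling prompt."""
    lines = [f"- {g['term']}: {g['definition']} (steward: {g['steward']})"
             for g in glossary]
    return "Business glossary:\n" + "\n".join(lines)

prompt = glossary_context(GLOSSARY) + "\n\nDraft a dimensional model for order reporting."
print(prompt)
```

The point is the one-to-many again: a single, agreed definition feeding every downstream model, report, and AI conversation.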