What are your thoughts on including examples of different data modelling techniques? Going into the intricacies of something like Data Vault, for example, would be out of scope (fair), but a high-level explanation with an example might be cool.
And perhaps a section on what *not* to do ("How to avoid chaos"). Otherwise, can't wait!
Stay tuned
yummy
How to buy this book?
The launch date is still TBD. Probably Q1 2025.
In the meantime, you can get early draft chapters on this Substack.
Every now and then I see tools trying to provide a semantic layer on top of their somewhat proprietary modeling standards (LookML, dbt). Is that something you will talk about in your book as well?
I'm trying to keep the book as technology/tool agnostic as possible. If I mention tools, they'll be in the context of "as of the time I write this." That said, semantic layers will be covered.
Please include data modeling for data products as well.
Yep, that's going to be an undercurrent and theme throughout the book.
That’s great! Please include me in one of your podcasts if you can, so that we can discuss data products 😀. I am actually working on data products.
Looks good. Especially interested in the History chapter and the Analytical modeling chapters.
I like the outline for Part 3 (chapters 11, 12, 13, and 14). I am looking forward to more as it becomes available!
Some thoughts come to mind in the context of the outline:
I'm having an ongoing discussion [debate] about data acquisition strategy for analytics data modeling. Application architecture believes data for analytics can be sourced (or integrated, to use their words) from enterprise events (i.e., EDA, originating from business processes). The argument goes that a canonical data model can be relevant for all downstream data needs (application integration, analytics, etc.), and who is better placed than the application owner to define what data should be published to an event stream? They also argue that making all raw data available for analytics data modeling is a non-starter: much of the data is never needed, so moving all of it into analytics acquisition wastes resources.
My POV is different:
1. Defining a canonical data model that satisfies application integration and analytics data modeling is challenging. It will get overly bloated and complicated. I instead think of analytics data modeling as a different use case, perhaps with some overlap with application data modeling.
2. We typically acquire all the source data for analytics using the most robust, cost-effective approach with minimal effort, making it available for data modeling where value creation occurs.
3. Application owners usually don't have the experience to assess the data needed for analytics data modeling, let alone machine learning.
4. Not all valuable application data can be sourced from a business process-related activity (i.e., EDA).
I see potential for a Venn diagram that shows a slight overlap between what I call data events (i.e., CDC; primarily for analytics use cases) and business process events (i.e., EDA; primarily for application integration use cases).
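To make the Venn diagram idea concrete, here is a minimal Python sketch. Everything in it is hypothetical and only for illustration: the `orders` table, its columns, the `OrderShipped` event, and the class and function names are assumptions, not taken from any real system or tool. It compares the fields carried by a row-level data event (CDC style) with those in a curated business process event (EDA style).

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Set


@dataclass(frozen=True)
class DataEvent:
    """Hypothetical row-level change captured from a source table (CDC style)."""
    table: str
    op: str  # "insert", "update", or "delete"
    before: Optional[Dict[str, Any]]
    after: Optional[Dict[str, Any]]


@dataclass(frozen=True)
class BusinessEvent:
    """Hypothetical curated event published by the application owner (EDA style)."""
    name: str
    payload: Dict[str, Any]


def changed_columns(event: DataEvent) -> Set[str]:
    """Return the columns whose values differ between the before and after images."""
    before = event.before or {}
    after = event.after or {}
    return {k for k in before.keys() | after.keys() if before.get(k) != after.get(k)}


# Illustrative data only: a made-up "orders" row and a made-up "OrderShipped" event.
cdc = DataEvent(
    table="orders",
    op="update",
    before={"order_id": 1, "status": "pending", "discount_code": None, "updated_by": "batch_job"},
    after={"order_id": 1, "status": "shipped", "discount_code": "SPRING", "updated_by": "agent_7"},
)
biz = BusinessEvent(
    name="OrderShipped",
    payload={"order_id": 1, "status": "shipped", "carrier": "ACME"},
)

cdc_fields = set(cdc.after or {})
biz_fields = set(biz.payload)

overlap = cdc_fields & biz_fields   # the middle of the Venn diagram
only_cdc = cdc_fields - biz_fields  # visible to analytics only through CDC
only_biz = biz_fields - cdc_fields  # context that exists only in the business event

print("overlap: ", sorted(overlap))
print("CDC only:", sorted(only_cdc))
print("EDA only:", sorted(only_biz))
print("changed: ", sorted(changed_columns(cdc)))
```

In this made-up case the discount code change never appears in the business event, which is the kind of gap point 4 describes, while the carrier only exists in the business event, so neither stream fully replaces the other.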
Thoughts? Experiences to share? Challenges?