Hey all! It’s been a bit. I hoped to get a lot of stuff published in March, but life had other plans for me. But here we are in mid-April, and I’m back at it. Thankfully, I haven’t been idle. I've been writing a lot for the book, and I will start posting draft sections and chapters, starting with this one. And my travel is settling down for a bit, so you’ll see much more content here.
This is Part 1 of 2 covering event data modeling. Here, events refer to the types of data you might work with in a streaming system, message queue, or similar. One can argue that all data starts as an event, but I won’t get too pedantic. Part 1 covers the fundamental patterns and ideas around events, and Part 2 will look at how to model events.
As always, comment where you wish. If you find errors, please submit them here.
Thanks,
Joe
Everything in the universe begins with an event. Something happens and creates new events. Other times, an event triggers other events. Each event is immutable and happens in an orderly fashion. The historical record remains unchanged. The past is what it is.
Events have always been the building blocks of data, but data now moves faster than ever. You encounter events every day. Events drive stream processing, real-time analytics, machine learning (ML), artificial intelligence (AI), and even the transactional databases you're familiar with. In your daily life, events power mobile devices, video games, cars, planes, factories, and everything else we rely on. While the architecture of event processing and storage has received the majority of attention, the practice of data modeling for events and their patterns has received far less focus.
As a reminder, we previously learned that “In data, everything starts as an event, representing something that occurred at a specific time. An event is immutable, meaning it cannot be changed once it happens. Events capture the moment-in-time nature of an action or occurrence.” In that chapter, we covered events at a surface level. Let’s explore them more deeply in the context of data modeling. First, we’ll discuss some core components of events, then build upon these ideas into a first-principles mental framework for modeling them.
Events ARE The Data Model
In most examples in this book, we model data as a side effect or byproduct of another process. Some pejoratively refer to this as “data exhaust.” When modeling events, we must shift our perspective from viewing them as byproducts to treating the events themselves as the data model. I’ll be upfront: this may require a significant mental adjustment.
Traditional data modeling typically captures the current state—"What is. Right now." Event data modeling, however, preserves the entire journey. With event data modeling, we’re modeling the flow of data along its lifecycle. We’re capturing both "what happened" and how those events shaped "what is now." Instead of storing a record's current state as a technical artifact in a database (which may change), the event serves as the single source of truth. An atomic event is the data model itself.
Events capture naturally occurring business activities, allowing us to model data to reflect sequences of real-world operations. This differs from traditional data models, which often abstract time and sequence to focus solely on the present state. When events act as your data model, you can address not only 'What is?' but also 'Why is it that way?' and 'How did it get there?'—increasingly valuable questions in analytics, decision-making, and ML/AI.
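To make the contrast concrete, here is a minimal sketch of deriving "what is" from "what happened." The `CartEvent` class, its field names, and the `ProductRemovedFromCart` event type are assumptions made up for illustration, not a prescribed schema. The current state is never stored; it is computed by replaying immutable events in order.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: an event cannot be modified after creation
class CartEvent:
    event_type: str      # hypothetical event names, chosen for illustration
    customer_id: str
    product_id: str
    occurred_at: datetime


def current_cart(events):
    """Replay events in time order to derive the present state ("what is")."""
    cart = set()
    for event in sorted(events, key=lambda e: e.occurred_at):
        if event.event_type == "ProductAddedToCart":
            cart.add(event.product_id)
        elif event.event_type == "ProductRemovedFromCart":
            cart.discard(event.product_id)
    return cart


def history_for(events, product_id):
    """Answer "how did it get there?" for a single product."""
    ordered = sorted(events, key=lambda e: e.occurred_at)
    return [e.event_type for e in ordered if e.product_id == product_id]


def at(hour, minute):
    # Helper for building sample timestamps (illustrative data only)
    return datetime(2025, 4, 1, hour, minute, tzinfo=timezone.utc)


EVENT_LOG = [
    CartEvent("ProductAddedToCart", "cust-1", "prod-A", at(9, 0)),
    CartEvent("ProductAddedToCart", "cust-1", "prod-B", at(9, 5)),
    CartEvent("ProductRemovedFromCart", "cust-1", "prod-A", at(9, 7)),
]

print(current_cart(EVENT_LOG))           # the state right now
print(history_for(EVENT_LOG, "prod-A"))  # the journey of one product
```

A state-only model would keep just the final cart contents. The event log also records that prod-A was added and then removed, which is the kind of detail the 'Why?' and 'How?' questions depend on.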
Events are modeled to reflect real business concepts using the language of the business. When modeling events, emphasize what they signify rather than starting with entities and attributes.
Let’s start with how we conventionally model entities.
A Customer entity with attributes like name, email, and address.
A Product entity with attributes like name, price, and inventory.
An Order entity with attributes like status and total amount.
In an event, these entities and attributes come together in a single record, typically serialized as JSON. When a customer adds a product to their cart, this creates a 'ProductAddedToCart' event with immutable properties like the timestamp, product ID, and customer ID.
Here’s what this looks like as an event.
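As a minimal sketch, assuming hypothetical field names (`event_id`, `event_type`, `timestamp`, `customer_id`, `product_id`) and made-up sample IDs, a 'ProductAddedToCart' event could be built and serialized to JSON like this. Only the event name and the timestamp, product ID, and customer ID properties come from the description above; everything else is an illustrative assumption.

```python
import json
import uuid
from datetime import datetime, timezone


def product_added_to_cart(customer_id, product_id, occurred_at=None):
    """Build a 'ProductAddedToCart' event as a plain, JSON-serializable dict."""
    occurred_at = occurred_at or datetime.now(timezone.utc)
    return {
        "event_id": str(uuid.uuid4()),         # assumption: unique ID per occurrence
        "event_type": "ProductAddedToCart",    # named in the language of the business
        "timestamp": occurred_at.isoformat(),  # when the action happened
        "customer_id": customer_id,            # reference to the Customer entity
        "product_id": product_id,              # reference to the Product entity
    }


# Sample values are invented for illustration
event = product_added_to_cart(
    "cust-123",
    "prod-456",
    datetime(2025, 4, 15, 12, 30, tzinfo=timezone.utc),
)
print(json.dumps(event, indent=2))
```

The event references the customer and product by ID instead of embedding their full attributes. That is one possible design choice, not the only one; some teams include a snapshot of relevant attributes, such as the price at the time, so the event stands on its own.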