This is Part 1 of a two-part series on the relational model. In Part 1, we cover the theory behind the relational model. Part 2 covers the motivation for and process of normalization, along with some broader considerations about the relational model.
Part 1 will be for subscribers only. Part 2 will be free for all users, as I feel like normalization is a dying art among people who work with data.
The relational model will be treated as a single chapter in the application modeling section of the upcoming data modeling book. It currently has around 40 pages! Other chapters in this section will include non-relational and event stream modeling. Stay tuned!
Also, this is a draft and might change. Thanks for understanding.
Thanks,
Joe
Fundamentals of the Relational Model
When Edgar F. Codd introduced the relational model in 1970 in “A Relational Model of Data for Large Shared Data Banks,” he approached data modeling from the first principles of data independence and mathematics. As you learned earlier, data independence means decoupling the logical data model from the underlying software application or database system.
The relational model builds upon fundamental primitives. We’ll examine how data is organized into relations, tuples, and attributes. Then, we’ll explore some core relational operations, how data is related through keys and joins, and how data is constrained.
Primitives: Relations, Tuples, Attributes
As with many things in data modeling, a little historical context will help you better understand the original motivation for the relational model. Codd opens the introduction of his 1970 paper with, “This paper is concerned with the application of elementary relation theory to systems which provide shared access to large banks of formatted data.”
First off, what is relational theory? Relational theory studies the relationships between sets of elements. It has deep roots in mathematics, predating its use in database theory and systems. A brief dive into the mathematical foundations of the relational model will help us understand why Codd chose this particular approach.
So that’s relational theory, but what is a relational model? The data in a relational model is a collection of relations, attributes, and tuples. In your work or studies, you’ve likely encountered tables. As a first approximation, you can think of a relation as a table: each row is a tuple, and each column is an attribute. As you’ll see, the two don’t line up perfectly, so let’s first look at the relational model from first principles, then return to how it maps to tables.
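To make the rough "relation as table" picture concrete, here is a minimal Python sketch. It is only an illustration, not a definition: the `Employee` type, its attribute names, and the sample values are all hypothetical, and modeling a relation as a Python `set` of named tuples is an assumption chosen to highlight two set-like properties (no duplicate tuples, no inherent row order).

```python
from typing import NamedTuple


class Employee(NamedTuple):
    """Heading of a hypothetical 'employee' relation: one field per attribute."""
    employee_id: int
    name: str
    department: str


# The body of the relation, modeled as a set of tuples.
# A set has no duplicates and no inherent ordering of its members.
employees = {
    Employee(1, "Ada", "Engineering"),
    Employee(2, "Grace", "Engineering"),
    Employee(3, "Edgar", "Research"),
    Employee(1, "Ada", "Engineering"),  # exact duplicate; the set keeps one copy
}

# Attributes correspond to the "columns"; tuples correspond to the "rows".
attributes = Employee._fields
tuple_count = len(employees)

print(attributes)
print(tuple_count)
```

Notice that the duplicate tuple collapses into a single member. A typical spreadsheet or SQL table would happily store both copies, which is one of the places where tables and relations don't line up perfectly.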
To appreciate the relational model, let’s first explore the theoretical and mathematical underpinnings of relations, tuples, and attributes. The first building block of the relational model is the relation. Admittedly, this term can be confusing. What is a relation?