The Relational Model, Part 2

Joe Reis

Jan 6

Normalization and Data Design

Read →

29 Comments

Sam

Jul 21

From the 3NF example: "Category depends on the ProductID, and can be split into its own table."

I absolutely get that this needs to happen, but I don't understand how this is described by the 3NF rule ("no non-key attribute is transitively dependent on the primary key") - is Category transitively dependent on the ProductID? It seems like Category is functionally dependent on the Product ID, no? Is there a 3NF rule that helps to describe this more clearly?

Expand full comment

Reply (1)

Joe Reis

Jul 21

thanks, let me have a look. This chapter likely needs a lot of clarification in quite a few areas.

Expand full comment

Ramona C. Truta

Jan 12

I have a few questions on the 3NF and BCNF examples.

1. Is Supplier_Capabilities part of the 3NF decomposition of Suppliers? There are attributes here that were not in the original.

2. Similarly, in Supplier_Shipping_Capabilities there's a new attribute, ShippingCapabilityID that is new.

3. I'm wondering if having a range (2-3) for EstimatedDays does not contradict the 1NF indivisibility.

Do you intend to include the decomposition algorithms? Or, perhaps, elaborate more on how the decomposition was obtained?

Here's a tip for checking the decomposition: performing the union of the attributes of the decomposed relations should produce the set of attributes of the original relation.

Expand full comment

Reply (1)

Joe Reis

Jan 12

Good questions. Let me have a look this week. Thanks Ramona

Expand full comment

Ramona C. Truta

Jan 12Edited

I don't agree with transitive dependencies as a separate category - but knowing the theory I understand why one can argue that they are a separate subcategory. I think you should include the Armstrong's axioms, which transitivity is part of. These 3 axioms govern how we can infer other FDs, and they are a sound and complete set of rules. Based on these axioms we can compute the closure of a set of FDs.

I also want to caution on the language on the definition "Dependencies are *often* denoted as X -> Y, where X is a set of attributes that *might* determine the value of Y. ":

1. they are denoted (remove the "often" - that's the notation), and

2. it does determine (remove the "might").

More to come as I read.

Expand full comment

Reply (1)

Joe Reis

Jan 12

It's true that transitive dependencies are also functional dependencies. I'm trying to balance what people need to know in the field against endless axiomatic examples (trust me, I've seen Armstrong and many others) that scare people from learning relational modeling.

Expand full comment

Reply (2)

Joe Reis

Jan 12

This is the "if you're gonna fight combat or MMA, this is what you need to know", versus "if you want to get a phd in this topic."

Expand full comment

Reply (1)

Ramona C. Truta

Jan 12

Good point - providing the min set of info, I want to call it the candidate info as a reference to candidate key :)

Expand full comment

Ramona C. Truta

Jan 12

No doubt. It makes you wonder, isn't it? When things are done by "ear" rather than following a systematic set of rules...

Expand full comment

Reply (1)

Joe Reis

Jan 12

it's mostly by ear today. Let's make it better

Expand full comment

Donald Parish

Jan 7

Also, for https://practicaldatamodeling.substack.com/i/154215972/third-normal-form-nf, if the Suppliers table is in 3NF, it's not the best design perhaps? Not sure what the key is?, but should be on SupplierID for 3NF? Hope that helps, not trying to be too critical. :)

Expand full comment

Reply (1)

Joe Reis

Jan 7

actually, the suppliers table should be split further. There's a dependency I overlooked. Thanks.

Expand full comment

Donald Parish

Jan 7

In section https://practicaldatamodeling.substack.com/i/154215972/boyce-codd-normal-form-bcnf, I don't see why the Supplier Name and Supplier Phone tables are necessary for BCNF. I'd leave them with Supplier table. That sort of break out looks more like 5NF to me, with the binary type relations. A bridge too far, in my opinion. But I'm certainly not an expert. I just try to read (and re-read C. J. Date) :)

Expand full comment

Reply (2)

Joe Reis

Jan 7Edited

yeah, agreed. They can be combined and be bcnf compliant. Thanks

Expand full comment

Joe Reis

Jan 7

and actually, shipping zones can also be further fixed, as it's not bcnf as is

Expand full comment

Gabriel

Jan 6

"Normal Forms Are Hierarchical"

I wouldn't say it like this.

For me, a hierarchy includes some many-to-one relationships.

I would not say that any NF is a "parent" of another one in a hierarchy.

What I would say is that a NF is a predecessor of another one in the sequence / chain.

So: "Normal Forms are Sequential" (from "weaker" to "stronger") instead of "Hierarchical".

Expand full comment

Reply (1)

Joe Reis

Jan 6Edited

"Normal Forms are Sequential" (from "weaker" to "stronger") instead of "Hierarchical".

That's literally a hierarchy you just described...

Also, if you read the historical record of the relational model, the progression of 1NF to any other form is deliberately called a hierarchy.

Expand full comment

Dave

Jan 6

That is the challenge and I see as a core issue. The PDM is actually not what describes the data from a business perspective. NoSQL, Streaming are technical solutions to C-R-U-D the business data.

Expand full comment

Reply (1)

Joe Reis

Jan 6

This will be a tough thing to change. Engineers either don’t acknowledge data modeling at all, or think they do it when they cram data into a stream. Definitely a huge motivator for this book.

Expand full comment

Clivado

Jan 6

Nice article, Joe. It might be worth mentioning that 6NF is also known as Graph Normal Form, as there are emerging graph-relational tech companies using this NF that readers might come across.

Expand full comment

Reply (2)

Joe Reis

Jan 6

How common do you think thjs will be in 5 years?

Expand full comment

Reply (1)

Clivado

Jan 6

I think relational always seems to win out, and the RM + SQL will adapt in the future to absorb new ideas just as it has in the past (e.g. JSON and Property Graph Query in SQL 16 and 23). So I think SQL systems could evolve to incorporate graph features, instead of them being separate database systems as they are now, meaning this could be come much more prevalent. I have no idea just how common that might be though!

Expand full comment

Joe Reis

Jan 6

Good call

Expand full comment

Marco Sollie

Jan 6

Hi Joe,

I think you linked the "SQL and Relational Theory" to one of your other articles rather than to the Date's book link. Great article. Thanks!

Expand full comment

Reply (1)

Joe Reis

Jan 6

Thanks Marco. Fixed.

Expand full comment

Thirtha

Feb 25

Hi Joe,

As per the explanation of transitive dependency the columns ‘ProductId’, ‘ProductName’, ‘Price’ are transitively dependent right in terms that if we know the Id and name we can determine the price?

Expand full comment

Dave

Jan 6

Hi Joe. Great content on part 2. Remember that normalization is started within an LDM and not a PDM. So columns do not exist. Attributes do. As an example, I can design and document all the reference entities (“Code Tables) and show all the relationships in a 3NF view. At the physical level, I can create one “Code Table” table to capture and manage all the data within the many logical reference entities. This is not 3NF anymore. I believe this is what is continually skipped and jumping to the end result of a PDM / physical database implementation is a critical issue we are facing. “Transformation” of the LDM into a PDM is not always a one to one view. Components of LDM can be implemented in various normalization levels depending on various constraints and implementation decisions.

Expand full comment

Reply (2)

Joe Reis

Jan 6Edited

That said, like the relational model, CDM and LDM are dying arts as well. Most people start and end with PDM. Much of what I cover in this book acknowledges this reality, and hopefully educates people on the broader implications of only focusing on PDM. But many of the upcoming examples and chapters (NoSQL, Streaming) recognize that there are certain physical systems that exist for a reason. Sadly, part of this book is about meeting people where they are, which is mostly at the physical layer these days.

Expand full comment

Joe Reis

Jan 6

Thanks and great call-out. This will be covered in the CDM/LDM/PDM section, which should be up around next month.

Expand full comment

Practical Data Modeling

The Relational Model, Part 2