25 November 2014

Why Not Be Normal? [update]

There is a recent thread on simple-talk dealing with stored procedures as the repository (execution environment) for business logic. One commenter took the opportunity to assert that such a horrible mistake gave rise to NoSql; maybe, but not entirely. Implied, of course, is the inferiority of SQL databases, normal forms, and the RM paradigm. Bollocks, again. NoSql is yet another attempt (a failed one, one would dearly hope) by client-side coders to return to the thrilling days of yesteryear, when a lone ranger coder siloed all the data management in his code. And so forth.

The argument against 5NF boils down to: joins are too slow. One might add that most application developers are too dense to understand normalization, and that they prefer the employment security inherent in client-side data management. In other words, it ain't tech, it's politics.

The point of these endeavors has always been that tech as it has emerged over the last half-decade or so gives us the tools to implement the RM as Dr. Codd intended. And, of course, that we'd be idiots not to seize the opportunity.

As node sizes have dipped toward 10nm, we get multi-billion transistor CPU/GPU implementations, and motherboards with multiple sockets and four or more memory slots. For under $10K, one can build an awful lot of server. That same $10K will support about one month of a client-coding developer. If that doesn't conjure up an image of Jabba the Hutt on one end of the seesaw and Bruce Lee on the other, here's another reason.

Intel has released details of its next 3D NAND; terabytes are now on offer. Fast terabytes. There's just no excuse for all the mess and heartache caused by flat earth folks.

The comments remind me of the core irony of the coders' hostility toward NF data: the OO crowd always talks about two principles that they hew to, no matter what. The first is DRY (don't repeat yourself). The second, related, is that objects and methods must deal with as limited a scope as can be conceived. One Java coder of my recent acquaintance used to say that his methods were never "more than a few lines long", as if this were the ultimate virtue. "Do one thing, and do it right" is what we are told. No bloat. No repetition. No duplication. It isn't much of a stretch to see that both attitudes, applied to data, must yield NF schemas and the DRI (declarative referential integrity) which results. Yet they insist on silos of flat-file (or XML, which amounts to the same thing) data.
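To make the DRY-to-NF point concrete, here is a minimal sketch, not anything from the thread itself. It assumes SQLite through Python's standard sqlite3 module purely for convenience, and the customer and purchase tables are hypothetical. Each fact is stored once, the foreign key is declared rather than coded, and the engine refuses an orphan row on its own.

```python
import sqlite3


def build_schema(conn):
    """Create two hypothetical tables; the customer's name lives in one place only."""
    # SQLite leaves foreign-key enforcement off unless asked, per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE purchase (
            purchase_id  INTEGER PRIMARY KEY,
            customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
            amount_cents INTEGER NOT NULL
        );
        """
    )


def orphan_rejected(conn):
    """Try to record a purchase for a customer that does not exist."""
    try:
        conn.execute(
            "INSERT INTO purchase (customer_id, amount_cents) VALUES (?, ?)",
            (999, 500),
        )
    except sqlite3.IntegrityError:
        return True  # the declared constraint did the work, not client code
    return False


def names_on_purchases(conn):
    """Join back to the single copy of each name."""
    rows = conn.execute(
        "SELECT c.name FROM purchase p JOIN customer c USING (customer_id) "
        "ORDER BY p.purchase_id"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase (customer_id, amount_cents) VALUES (1, 1200)")
conn.execute("INSERT INTO purchase (customer_id, amount_cents) VALUES (1, 800)")

# One UPDATE, one row touched; every purchase sees the new name via the join.
conn.execute("UPDATE customer SET name = 'Ada L.' WHERE customer_id = 1")
print(names_on_purchases(conn))
print(orphan_rejected(conn))
```

In a flat-file layout the name would be repeated on every purchase record, and the rename would be a search-and-replace that client code has to get right every time.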

My Pappy used to tell the old wheeze about the miner, the mule, and the eastern fop. A miner comes into Busted Flats to stock up on supplies. He ties up his horse and the mule to the rail. Or tries to. The mule stops dead in the muddy street. Standing on the walk by the general store is an eastern fop, just off the train. The miner pulls at the mule for a bit, then tires of the effort. He reaches into the pack, pulls out a 2 by 4 and whacks the mule upside the head.
Fop: "You can't do that!!!"
Miner: "Why not? I gotta get his attention somehow."
The mule walks up to the rail, gentle as you please.


Anonymous said...

Probably won't get any application developers to see the light by calling them dense. Depends on the outcome you are after.

Robert Young said...

At some point, after all these decades of obstructionism from the flat earth people, calling a spade a spade is our last resort. That crew refuses to acknowledge that set theory even exists. And so forth.

Tough love will flip a few; the rest haven't been converted by the RM/SQL folks' willingness to genuflect to their blindness, so continuing that tack won't get any more results now than it has in the past.

Peter Row said...

I'm an app dev and I absolutely acknowledge that set theory exists and hate the very idea of using an ORM (at all really) to do something when I know that I could do the same thing in a single query in an SP.

However, this tropical paradise that people/you keep imagining, whereby you have a DBA who develops the schema and related DB objects whose results your application/product then uses, doesn't exist anywhere I have seen.

I may not come up with 5NF database schemas (normally 3NF) but is that really needed in all cases? Typically the answer to most SQL questions is "it depends" as I suspect the case is with this.

And as Anonymous said (29/11/14), patronising people won't get you anywhere.

Robert Young said...


It's a matter of perspective and history. Kowtowing to the flat earth and hierarchy folks by sugar-coating the error of those data structures, over the last 30 years, hasn't done any good. The flat earthers in particular, who are (nearly) entirely client-side coders, are merely seeking hegemony for the purpose of perpetual employment.

The notion that Organic Normal Form™ schemas can't be built is retained by the coding set due to the history of spinning rust. We don't have to do that any more. The notion that only client-side code can manage data integrity is even older: at least to COBOL before CICS. That's a long time ago. Dr. Codd provided a true data model (the network and hierarchical 'models' were post-hoc arguments), not just an arbitrary structure.

Two tales are instructive. Allen Holub's "Bank of Allen" series discusses the problems with doing data management on the client, using ATM ROMs as the specific case. The other, more recent, is Sarah Mei's experience with MongoDB. In both cases the conclusion is that, if the data matters, data integrity only works smoothly when it resides with the data.
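A minimal sketch of what "integrity resides with the data" can look like, loosely in the spirit of the ATM case but not taken from either source. It assumes SQLite via Python's standard sqlite3 module; the account table and the withdraw helper are hypothetical. The client below never checks the balance, yet the overdraft still fails, because the rule is declared on the table.

```python
import sqlite3


def open_bank():
    """Create a hypothetical account table whose rule lives in the schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE account (
            account_id    INTEGER PRIMARY KEY,
            balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0)
        )
        """
    )
    conn.execute("INSERT INTO account VALUES (1, 10000)")
    conn.commit()
    return conn


def withdraw(conn, account_id, cents):
    """A deliberately careless client: it does no balance check of its own."""
    try:
        with conn:  # commits on success, rolls back if the statement fails
            conn.execute(
                "UPDATE account SET balance_cents = balance_cents - ? "
                "WHERE account_id = ?",
                (cents, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint refused the overdraft


def balance(conn, account_id):
    row = conn.execute(
        "SELECT balance_cents FROM account WHERE account_id = ?", (account_id,)
    ).fetchone()
    return row[0]


bank = open_bank()
print(withdraw(bank, 1, 4000), balance(bank, 1))
print(withdraw(bank, 1, 7000), balance(bank, 1))
```

Every client, careful or not, written this year or ten years from now, gets the same rule without carrying its own copy of it.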

Modern hardware removes the performance issue with normal forms. Since this performance issue is the titular justification for client-side data management, even the client coders have to admit, were they honest, that data management no longer is best served on the client. QED.