27 July 2011

Ohm's Law

I've gotten to enjoy Christophe Pettus' postings linked from the PostgreSQL site. He does a neat presentation. This is his latest. Note especially pages 50 and following. While he's a Python/Postgres kind of person, and I'm currently exploring RoR again (long story), he does say things the way I do. I'm not quite as famous as he is, of course. In the database is truth. Note in particular his observations with regard to "cloud" I/O; it's what I've always suspected. It's your data; don't treat it like a red-haired stepchild. The SSD is the future of normalized, i.e. fast, data. The "data explosion" is largely the result of bad (non-existent?) data modeling. Cloud is all about minimalist/commodity parts which are easily re-assignable. If anything kills off the RM, it will be public clouds. Coders get infinite employment, and the profession relives the 1960s. Sniff.

So far as that goes, what he's saying about coders abusing the database from Django is about what I've seen with coders abusing the database from RoR; maybe more so, given David's attitude toward data. The problem with ORMs is that they seek to solve a problem created by OO coders, but which doesn't exist in the Real World. Such coders refer to the problem as Impedance Mismatch, which is merely an assumption that objects can't be populated with data from the RM. But it's just an assumption. What they steadfastly (shades of Tea Baggers, what?) refuse to acknowledge is that BCNF databases allow for construction of arbitrarily complex data structures, unlike the hierarchic/IMS/XML approach, which is locked in to a parent/child structure. Change that structure, and all the application code which manages it has to change. Well, unless you've written a bare-bones RM engine into your application. Don't laugh; I've lived through folks doing just that.
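To make the "it's just an assumption" point concrete, here is a minimal sketch of plain objects populated straight from normalized tables, with no ORM in between. The `orders` and `order_line` tables, the `Order` and `Line` classes, and the sample rows are all hypothetical, invented for illustration; SQLite stands in for whatever engine you actually run.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Line:
    sku: str
    qty: int


@dataclass
class Order:
    order_id: int
    customer: str
    lines: list = field(default_factory=list)


def make_demo_db():
    """Build a throwaway in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL
        );
        CREATE TABLE order_line (
            order_id INTEGER NOT NULL REFERENCES orders(order_id),
            sku      TEXT    NOT NULL,
            qty      INTEGER NOT NULL,
            PRIMARY KEY (order_id, sku)
        );
        INSERT INTO orders (order_id, customer) VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO order_line (order_id, sku, qty)
        VALUES (1, 'A1', 3), (1, 'B2', 1), (2, 'A1', 7);
    """)
    return conn


def load_orders(conn):
    """Populate Order objects (each holding its Line objects) from one join."""
    orders = {}
    rows = conn.execute("""
        SELECT o.order_id, o.customer, l.sku, l.qty
        FROM orders o
        JOIN order_line l ON l.order_id = o.order_id
        ORDER BY o.order_id, l.sku
    """)
    for order_id, customer, sku, qty in rows:
        # Create the parent object the first time its key appears.
        order = orders.setdefault(order_id, Order(order_id, customer))
        order.lines.append(Line(sku, qty))
    return list(orders.values())


if __name__ == "__main__":
    for order in load_orders(make_demo_db()):
        print(order)
```

The nesting lives in the query and the loader, not in the stored data, so a different object shape (say, lines grouped by SKU across customers) is a different query over the same tables.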

The world isn't hierarchic, no matter what OO/XML folks want to assert. I've worked lots of places, small to huge, and the archetype for the hierarchic structure doesn't actually exist. That structure is the Org Chart. In the hypothetical world, each worker bee has one, and only one, supervisor. The real world is run on Matrix Management: one has a supervisor du jour, never the same one each day, who varies by project/location/assignment/foobar. The real world is relational; connections come and go, in vivid multiplicity. The relational model stores such natively. From this structure can be built any set of connections which arise. By *not predefining* the connections, only the absolute identities of each type/rule, one can create new relationships simply by naming new foreign keys (cross-reference tables, by various names, for many-to-many relations).
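The supervisor du jour can be sketched with one cross-reference table. This is a minimal illustration under stated assumptions: the `worker`, `project`, and `supervision` tables, the names in them, and the helper functions are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def build_db():
    """Create a hypothetical matrix-management schema in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE worker (
            worker_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL UNIQUE
        );
        CREATE TABLE project (
            project_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL UNIQUE
        );
        -- Cross-reference table: who supervises whom, on which project.
        -- Nothing here limits a worker to a single supervisor.
        CREATE TABLE supervision (
            worker_id     INTEGER NOT NULL REFERENCES worker(worker_id),
            supervisor_id INTEGER NOT NULL REFERENCES worker(worker_id),
            project_id    INTEGER NOT NULL REFERENCES project(project_id),
            PRIMARY KEY (worker_id, supervisor_id, project_id),
            CHECK (worker_id <> supervisor_id)
        );
        INSERT INTO worker (worker_id, name)
        VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cyd');
        INSERT INTO project (project_id, name)
        VALUES (1, 'Billing'), (2, 'Warehouse');
        INSERT INTO supervision (worker_id, supervisor_id, project_id)
        VALUES (1, 2, 1),   -- Bob supervises Ann on Billing
               (1, 3, 2);   -- Cyd supervises Ann on Warehouse
    """)
    return conn


def supervisors_of(conn, worker_name):
    """Return (supervisor, project) pairs for one worker, ordered by project."""
    rows = conn.execute("""
        SELECT s.name, p.name
        FROM supervision x
        JOIN worker  w ON w.worker_id  = x.worker_id
        JOIN worker  s ON s.worker_id  = x.supervisor_id
        JOIN project p ON p.project_id = x.project_id
        WHERE w.name = ?
        ORDER BY p.name
    """, (worker_name,))
    return rows.fetchall()


if __name__ == "__main__":
    print(supervisors_of(build_db(), "Ann"))
```

A new kind of connection, say supervision by location, would be another cross-reference table beside this one; the `worker` and `project` tables, and every query already written against them, stay as they are.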

One can also add new data without (if one has been moderately smart with the DDL/SQL) clobbering any existing SQL or (heaven help us all) application code which directly queries the DB. Existing queries can ignore, if desired, new columns and new tables; so long as one avoids 'Select * from ...', of course. You would never do that, right?
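A small sketch of that claim, assuming a hypothetical `customer` table and a later-added `region` column and `invoice` table (none of these names come from anywhere but my head). The query that names its columns returns the same rows before and after the schema grows; the `SELECT *` version changes shape under whatever code consumes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Acme')")

NAMED = "SELECT customer_id, name FROM customer"  # names its columns
STAR = "SELECT * FROM customer"                   # takes whatever is there

before_named = conn.execute(NAMED).fetchall()
before_star = conn.execute(STAR).fetchall()

# The schema grows: one new column, one new table.
conn.execute("ALTER TABLE customer ADD COLUMN region TEXT")
conn.execute("""
    CREATE TABLE invoice (
        invoice_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      NUMERIC NOT NULL
    )
""")

after_named = conn.execute(NAMED).fetchall()
after_star = conn.execute(STAR).fetchall()

print("named columns unchanged:", before_named == after_named)
print("star width before/after:", len(before_star[0]), len(after_star[0]))
```

Any code that unpacks each `SELECT *` row into two variables breaks the day the third column arrives, which is the whole argument for spelling the columns out.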

1 comment:

Anonymous said...

> (shades of Tea Baggers, what?)

Where *I* get an impedance mismatch is your playing the Krufty Konservative to the witless kiddie coders, then sneering on politics like a 20-year-old sophomoric Redditor ...

(Also, please bring back the old 1950s profile pic - the imitation Pratchett, much as I like his writings, isn't that appealing.)