Dr. Codd Was Right: But, I can see Russia from Alaska

Lisa Murkowski, Swamp Critter

The world is not linear.
-- Dr. McElhone/1974

Power tends to corrupt; absolute power corrupts absolutely.
-- Lord Acton/1887

Officials who use their public positions for private gain threaten the integrity of our most important institutions. Greed makes governments — at every level — less responsive, less efficient and less trustworthy from the perspective of the communities they serve.
-- Justice Ketanji Brown Jackson/2024 [the MAGAnauts get even more aggressive when their perfidy is exposed]

I think we are on the verge of losing vaccines for this country, from this country. And the reason is that Robert F. Kennedy Jr. will hold up a paper, in the next four or five months, that says it's aluminum in vaccines that are causing a whole swath of problems, including autism. I think he is about to destroy vaccines in this country. I do.
-- Dr. Paul Offit/2025 [may the MAGA and MAHA be with you]

There's not a single example of things working out for the appeaser.
-- Nicolle Wallace/2024 [like this? the next extortion is on the way]

I have had to explain and re-explain and re-explain and re-explain, you know, how relational databases work, what is an eigenvector, what is dimensionality reduction.

-- Christopher Wylie/2018

... but Flash-based storage has such a different performance profile from rotating media, that I suspect that it will end up having a large impact on filesystem design. Right now, most filesystems tend to be designed with the latencies of rotating media in mind.

-- Linus Torvalds/2007

I believe quite strongly that, if you think about the issue at the appropriate level of abstraction, you're inexorably led to the position that databases must be relational.

-- Chris Date/2009

This Week's thought

D.O.J. has the full power of the federal government behind it. And under the guise of election integrity, they could end up using their unique tools to introduce new vulnerabilities to the system.
-- Dax Goldstein/2025 [the Office of Data Integrity will leverage all of D.O.J. to steal every election; Paramount just caved to extortion]

See you next week in a brand new show^{©Heckle and Jeckle}

Therefore:

In a time of SSD, multi-core/processor, two terabyte memory and Optane App Direct Mode (RIP) machines, there is no reason not to build from BCNF data. Time to do what Dr. Codd demonstrated. Technology has finally caught up with the maths.

07 June 2009

But, I can see Russia from Alaska

As Sarah said, you can see Russia from Alaska. With regard to SSD and relational databases, a view is going to motivate another major change to how we build database applications.

During this down time, I have been keeping track of the producer of SSD, STEC (both its name and stock symbol; it started out as Simple Technology), and they continue to announce agreements with large systems sellers, HP being the newest. A notion has begun to bubble in my brain. Said notion runs along these lines. The major sellers of hardware have gotten the SSD religion. Absolute capacity of SSD will not, in a reasonable timeframe, catch up with HDD. This puts the hardware folk in the following situation: they have these machines which can process data at unheard of speeds, but not the massive amounts of data that are the apple of the eye of flat file KiddieKoders. What to do, what to do?

The fact that STEC and Intel are making available SLC drives, with STEC targeting big server and near mainframe machines, and the machine makers taking them up on the deal means that there really is a paradigm shift in process. Aside: Fusion-io is targeting server quality drives on PCIe cards. That puzzled me for a bit, but the answer seems clear now. Such a drive is aimed squarely at Google, et al. The reason: the Google approach is one network, many machines. A disk farm is not their answer. Fusion-io can sell a ton of SSD to them. Alas, Fusion-io is private, so no money for the retail investor. Sigh.

The paradigm shift is motivated by the TCO equation, as much as blazing speed. SSD gulps a fraction of the power of HDD, and generates a fraction of the heat and thereby demands a fraction of the cooling. So, total power required per gigabyte is through the floor compared to HDD. There is further the reduced demand to do RAID-10, since the SSD takes care of data access speed in and of itself. So, one can replace dozens of HDD with a single SSD, which then sips just a bit of juice from the wall plug. The savings, for large data centres, could be so large that, rather than just wait to build "green fields" projects on SSD/multi machines as they arise in normal development, existing machines should be just scrapped and replaced. Ah bliss.

But now for the really significant paradigm shift. My suspicion is that Larry Ellison, not Armonk, has already figured this out. You read it here first, folks. There was a reason he wanted Sun; they have been doing business with STEC for a while.

The premise: SSD/multi machines excel at running high NF databases. The SSD speeds up flat file applications some, but that is not the Big Win. The Big Win is to excise terabytes from the data model. That's where the money is to be made. The SSD/multi machine with RDBMS is the answer. But to get there, fully, requires a paradigm shift in database software. The first database vendor to get there wins The Whole Market. You read that here first, too, folks.

There have been two historical knocks on 4/5NF data models. The first is that joins are too slow. SSD, by itself, slays that. With the SSD/multi machine, joins are trivial and fast. These models also allow the excision of terabytes of data. The second is that vendors have, as yet, not been able (or, more likely, willing) to implement Codd's 6th rule on view updating. Basically, only views defined as a projection of a single table are updatable in current products. This is material in the following way. In order to reap the maximum benefit of the SSD/multi machine database application, one needs the ability to both read and write the data in its logical structure, which includes the joins. In the near term, stored procedures suffice; send the joined row to the SP, which figures out the pieces and the logical order to update the base tables.

But what happens if a database engine can update joined views? Then it's all just SQL from the application code point of view. Less code, fewer bugs, faster execution. What's not to love? The pressure on database vendors to implement true view updating will increase as developers absorb the importance of the SSD/multi machines, and managers absorb the magnitude of cost saving available from using such machines to their maximum facility.

Oracle will be the first to get there just because IBM seems not to care about the relational aspects of its databases. The mainframe (z/OS) version is just a COBOL file handler, and the LUW version is drinking at the xml trough. Microsoft doesn't play with the big boys, despite protestations to the contrary. With the Sun acquisition, Oracle has the vertical to displace historical IBM mainframes. Oracle has never run well on IBM mainframe OSs. Oracle now has the opportunity to poach off z/OS applications at will. It will be interesting.

3 comments:

PaulM said...: Nice article. I am surprised noone has commented already.
It has always been a physical implementation issue i.e. people bag relational databases and the relational model is throw out as well.; June 8, 2009 at 7:36 AM
Anonymous said...: This comment has been removed by a blog administrator.; June 8, 2009 at 12:42 PM
Robert Young said...: Well, I'm not as famous as Date or CELKO, although I had a usenet chat with him about SSD and databases. But there is a comment a couple of posts back that is amusing.; June 8, 2009 at 12:45 PM

Dr. Codd Was Right

Lisa Murkowski, Swamp Critter

About

Shameless Plug

Extended Pieces

Good Stuff

Followers

Blog Archive

07 June 2009

But, I can see Russia from Alaska

3 comments: