18 January 2020

One Classy Dame - part the second

So, the SCM piece got me musing, again, on the notion of RDBMS with SCM. It would appear to be a perfect fit. But, rather than musing in the dark, I let my fingers do the walking through the Yellow Googles. Turns out there are real products, or at least ones within an inch or so of escaping vapourware. Here's some bling on Oracle:
Exadata X8M uses Xeon SP CPUs, Optane DIMMs and RoCE (Remote Direct Memory Access across Converged Ethernet) over 100GbitE. RoCE enables Oracle's database to directly access persistent memory, thus bypassing the OS, network, and IO software stack.

RDBMSs which do file management under the OS are not new, so Oracle's not breaking ground with that part. It also means Oracle doesn't have to wait for Linux to support SCM directly, in the sense of treating SCM as something other than just another file.

There remains conflict over how to use SCM: either as a direct data store (row store, in RDBMS terms) or as an 'intermediate' filesystem surrogate, as in this paper:
... replacing hard drives with SCMs often forces either major changes in file systems or suboptimal performance, because the current block-based interface does not deliver enough information to the device to allow it to optimize data management for specific device characteristics such as the out-of-place update.

As you might expect, I'll vote for something called an 'object' store, or 'persistent buffer store', or the like. The thrust is to eliminate all that 'impedance mismatch' that some coding folks like to throw at RM advocates. All industrial-strength RDBMSs know how to do transactions within their buffers; some manage the subsequent I/O to disk themselves, while others let the OS's filesystem handle it. But both are doing the translation from 'row object' to 'file'. Why? Because most OSs (AS/400 et seq. possibly excepted) store data as files. It's also worth noting that the original System/360, and its successors, were not file oriented in the sense of *nix and its successors. Their disks are based on the CKD (count key data) protocol, which, if you twist your neck just right, can be viewed as a row store.
It is a self-defining format with each data record represented by a Count Area that identifies the record and provides the number of bytes in an optional Key Area and an optional Data Area. This is in contrast to devices using fixed sector size or a separate format track.
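For the row-store squint, here's a minimal sketch of that self-describing record in C. The field names are mine, illustrative rather than IBM's channel-program gospel, but the count/key/data shape is the real thing:

    #include <stdint.h>

    /* A sketch of a CKD record's self-describing layout.
     * Field names are illustrative, not IBM's official format. */
    struct ckd_record {
        /* Count area: identifies the record and sizes what follows. */
        uint16_t cylinder;
        uint16_t head;
        uint8_t  record_number;
        uint8_t  key_length;     /* 0: no key area */
        uint16_t data_length;    /* 0: end-of-file record */
        /* An optional key area (key_length bytes) and an optional
         * data area (data_length bytes) follow, variable length --
         * rather like a primary key and its non-key columns. */
    };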

There is at least one book addressing the question. Here's a snip from the Amazon page:
Existing DBMSs are unable to take full advantage of this technology because their internal architectures are predicated on the assumption that memory is volatile. With NVM, many of the components of legacy DBMSs are unnecessary and will degrade the performance of data-intensive applications.

In their earlier paper, the authors hit the nail on the head (and confirm the notion that sent my fingers walking through the Yellow Googles in the first place):
Consider a transaction that inserts a tuple into a table. A DBMS first records the tuple's contents in the log, and it later propagates the change to the database. With NVM, a DBMS can employ a logging protocol that avoids this unnecessary data duplication. The reason why NVM enables a better logging protocol than WAL is two-fold. The write throughput of NVM is more than an order of magnitude higher than that of an SSD or HDD. Further, the gap between sequential and random write throughput of NVM is smaller than that in SSD and HDD. Hence, a DBMS can flush changes directly to the database in NVM during regular transaction processing [15, 14, 12, 64, 40, 62, 80].
[links active in the cite]
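A back-of-the-envelope sketch of the difference, using PMDK's libpmem. The tuple layout and names here are mine, purely illustrative, not the paper's code:

    #include <libpmem.h>
    #include <string.h>

    /* An illustrative tuple; not from the paper. */
    struct tuple { long id; char payload[48]; };

    /* Conventional WAL writes the tuple twice: once to the log (then
     * fsync'd), and later propagated to the table's data pages. With
     * byte-addressable NVM the engine can flush the change straight
     * into the durable table heap instead. 'heap' is assumed to be
     * NVM already mapped into the address space via pmem_map_file(). */
    void insert_tuple_nvm(struct tuple *heap, size_t slot,
                          const struct tuple *t)
    {
        memcpy(&heap[slot], t, sizeof *t);     /* ordinary CPU stores */
        pmem_persist(&heap[slot], sizeof *t);  /* cache-line flush; no
                                                  second, logged copy */
    }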

The crux of the matter: the community has built the notion of transaction on 'slow' disk drives and 'fast' memory, with the transaction happening in memory but only durable when flushed to disk. Crossing that boundary layer costs so much that many/most/all industrial-strength RDBMSs have offered the choice to do all the I/O under control of the engine, ignoring the OS facility. In the *nix world this is referred to as 'raw I/O'.
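For the uninitiated, raw I/O on Linux looks something like this: open the device (or file) with O_DIRECT so the kernel's page cache steps aside and the engine's own buffer pool is the only buffering in play. The device path is, of course, illustrative:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        /* O_DIRECT bypasses the kernel page cache entirely. */
        int fd = open("/dev/sdb1", O_RDWR | O_DIRECT);
        if (fd < 0) return 1;

        /* O_DIRECT demands sector-aligned buffers, offsets, lengths. */
        void *page;
        if (posix_memalign(&page, 4096, 4096) != 0) return 1;

        /* The engine, not the OS, now decides what gets cached and
         * when anything becomes durable. */
        if (pread(fd, page, 4096, 0) < 0) return 1;

        free(page);
        close(fd);
        return 0;
    }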

Here's another take on the meaning/purpose of SCM:
This is good, of course, but it got me wondering whether it followed from requirement A — no rewrite of applications — that the solution B automatically follows — that the likes of Optane persistent memory must reside in I/O space. After all, such persistent memory was created to be directly attached to the processor chips, and be byte addressable just like RAM. Think paradigm busting, outrageously fast commits of data to persistent storage. Said differently, can a processor complex be created with directly attached persistent memory and where the typical use of that system does not require changes to the applications?
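And here's what 'byte addressable just like RAM' looks like in practice, as a hedged sketch using PMDK's libpmem; the mount point and size are mine:

    #include <libpmem.h>
    #include <stddef.h>

    int main(void)
    {
        size_t mapped_len;
        int is_pmem;

        /* Map a file on a DAX-mounted pmem device straight into the
         * process's address space. The path is illustrative. */
        char *buf = pmem_map_file("/mnt/pmem/scratch", 4096,
                                  PMEM_FILE_CREATE, 0600,
                                  &mapped_len, &is_pmem);
        if (buf == NULL) return 1;

        buf[0] = 'x';   /* an ordinary CPU store, no I/O call in sight */

        /* Make it durable: cache flush if it's real pmem, msync if not. */
        if (is_pmem)
            pmem_persist(buf, 1);
        else
            pmem_msync(buf, 1);

        pmem_unmap(buf, mapped_len);
        return 0;
    }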

And, surprise surprise, this author remembers AS/400!
Another key — and here very applicable — concept basic to the IBM i operating system is that of single-level storage. Even decades ago with the System/38, SLS meant that when your application used a secure token as an address to access data, it did not matter whether that data was first found on disk or in RAM. Even after a system restart — say, one occurring due to a power failure — you restarted using exactly the same address token to address the same data.

Finally, this author, likely not by intention, stabs MVCC in the gut (fine by me):
See the difference, along with the impact on throughput as a result? The locks, required in any case since time is passing, are held for a minimum of time. The probability of any subsequent transactions seeing these locks decreases significantly. Subsequent transactions don't as often need to wait, and when they do their wait time is far less. In our train metaphor used earlier, a train doesn't even get built anywhere nearly as often. Life is good.

Die MVCC, DIE!!!
