02 June 2018

A Hop, A Skip, and A Jump - part the second

Do you remember that Date, at least in the last edition of "Introduction..." (I re-gifted the previous ones along the way, so I can't vouch for them), included a section called "ACID dropping" in chapter 16? I wasn't convinced way back in 2003 when the edition was published. With hardware prior to Optane, updates are a serial process, thus the need for ACID.

With what's now being named "Persistent Memory", of which Optane is Intel's version, we have to re-think transaction semantics. Let's assume, for the sake of argument, that 4TB Optane machines with a slot or two of DRAM and an SSD for OS and applications storage (herein, Godzilla) are the norm in Server World. Who are the losers? I'll argue that ACID and MVCC are the most obvious candidates for being no longer relevant.

With HDD as permanent storage, the hop, skip, jump process is required to process updates. ACID is the mechanism to manage concurrency in a locking protocol (MVCC still locks, just late rather than early). With Godzilla as the platform for RDBMSs, Codd's "all at once" is natural; no hop, skip, jump. Will it be faster than hop, skip, jump in all cases? I've no idea. We don't yet have on-the-record speeds for Optane. But my suspicion is that most of the code in relational/SQL engines goes away, particularly for schemas that are in Organic Normal Form™. Explicit locking in the engine won't be needed, since the OS takes care of memory locks, and those are the only locks left. Since updates become visible essentially immediately, the lock avoidance of MVCC isn't needed either.

Godzilla won't be birthed for a while. *nix/Windows and datastore-based applications will need to be modified, or more likely forked, to support/exploit Persistent Memory. The file system no longer matters; objects (in the older sense) are all that exist. Search can, and should, be relational! No more silly one-way-to-the-bottom hierarchies. And that's not far-fetched, since such machines existed in the past (not sure if the current progeny are the same), called AS/400, itself a development from earlier machines. The OS included an embedded RM-ish object engine, which got renamed DB2/400 for a while. In one incarnation of my employment, an AS/400-hosted application lost to one on an early RS/6000 and Progress. Worked out OK, since that company is still running the code.
Unlike the "everything is a file" feature of Unix and its derivatives, on IBM i everything is an object (with built-in persistence and garbage collection).
Remember, the hierarchical file system was concocted to support a server/terminal-hosted word processor: Unix. Since Bell, like most any corporation, was built as a top-down org chart, so too was its word processor. All of this was before Codd had released his first version, and when the state-of-the-art datastore was IBM's mainframe IMS. Yes, the iconic hierarchical "database".
