15 April 2016

Cog in a Gear

This endeavor started out both as a reaction to my being rejected in my efforts to build Organic Normal Form™ databases at my then immediately former employer, and to the fact that Fabian Pascal had gone missing from the InterTubes. I believed then, as now, that Codd had defined the sole data model. Previous, and au courant, data structures are merely that; they're not models of data. Moreover, the NoSql and xml and sundry flatfile offerings were just re-hashes of pre-Codd engines. In particular, the fascination with hierarchy as implemented in xml is just IMS without an engine. And IMS, with its horrid control structures and coding black holes, was precisely the motivation for devising the Relational Model. Alas, Armonk wasn't pleased, since the RM arrived only a couple of years after IMS's release, and allowed one of the IMS crowd to manage SQL development. Dr. Codd was right, damn it, and somebody should stand up and say so!

Over time, my interests wandered back to where I had spent my initial career: stats and quant. Not least because it was clear early on that The Great Recession had been caused by incompetent and corrupt quants. Some insist, still, that the quants were only following orders, and thus not to blame. The fact is, though, that Countrywide's quants had a major role in devising the toxic mortgages that came to be securitized. Within the last week, both Wells and Goldman have admitted fudging the disclosures regarding those securities (though some underwriters, née quants, created the ratings), so one can argue that it wasn't just the quants.

Then along came Watson, and what IBM now calls Cognitive Analytics. It seems, at this juncture, to be a branding effort by IBM, although other sites do appear in search. The Wiki doesn't yet have a page on that term. Go to it. (Oddly, the term was trademarked, but is now listed as abandoned?)

What's of interest is that Date's statement is more true than ever:
I believe quite strongly that, if you think about the issue at the appropriate level of abstraction, you're inexorably led to the position that databases must be relational.
-- Chris Date/2009

There's some controversy, still, about the notion of relations in data. The RM, on the one hand, makes relations the province of the data designer, to be specified a priori: an order line must have a resolved foreign key to an order table. However, if normality/orthogonality is followed from the start, changes to schema are transparent to existing data and code. The RM/RDBMS is the only data store that provides that flexibility.

The quant (Big Data maven), on the other hand, proposes to discover, in a probabilistic manner, correlations in the data. Whether those correlations are, de jure, relations is the crux. I say not, but then I've always been a tad rebellious (but a Blue Yankee). All of the other data structures (whether network, or graph, or ...) follow from hierarchy: the structure specifies the one optimal access path through the data. Get the structure wrong, and you're up a creek. The ignorant complain that the RM is confining, not realizing (or refusing to admit) that in the RM, connections among "records" are expressed in data, which is fungible, while in hierarchy the connections are explicit in the tree (not fungible without some off-line effort), which the designer has pre-specified.
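The access-path point can be made concrete with a toy sketch (the customers, order numbers, and nesting are made up for illustration):

```python
# In a hierarchy, the designer pre-specifies the one path: here, orders
# are reachable only through their owning customer.
tree = {"alice": [101, 102], "bob": [103]}

# Going with the grain (customer -> orders) is a direct lookup...
print(tree["alice"])  # [101, 102]

# ...but the reverse question means walking the whole tree, because the
# connection lives in the structure, not in the data.
owner = next(cust for cust, order_ids in tree.items() if 103 in order_ids)
print(owner)  # bob

# Expressed relationally, the same connection is just values in rows,
# and either direction is the same symmetric predicate over data.
orders = [(101, "alice"), (102, "alice"), (103, "bob")]
print([c for o, c in orders if o == 103])      # ['bob']
print([o for o, c in orders if c == "alice"])  # [101, 102]
```

Changing the question in the tree means restructuring the tree; changing it in the relation means writing a different predicate over the same rows.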

Which brings us to Watson and Cognitive Analytics. What's the goal? Near as I can tell, it's needle hunting in haystacks higher than Everest. Those had better be very gold needles. After all, one benefit of the RM is to be able to infer new facts from the specified relations based on data.

Somewhat ironic, at least to me, are the following sentences in the Wiki article:
Are intelligent machines dangerous? How can we ensure that machines behave ethically and that they are used ethically?

As if humans routinely behave ethically! With the rise and vengeance of the 1%, ethics don't matter much. After all, it's just business.

The notion and practice of artificial intelligence, computerized division, is generally credited to McCarthy and LISP. Prolog came a bit later. Neither has made much of a dent in IT, although Watson is reported to use some bits of Prolog. Will Watson itself make a dent? IBM is said to be betting on it. That link, if you don't go there, is from two years ago.

Watson's only real value-add is the ability to observe, then discard, 99.9967853% of the text it "sees" in very short time spans. In other words, "Jeopardy!"-style infotainment and outright entertainment. In a clinical context, that amounts to ER settings, where diagnostic time does matter. The machine is far too expensive to install in hospitals, so the cloud version shared among hundreds of ERs might help. Humans, experts in their fields, know not to bother with that 99.9967853% in the first place. Watson is House on steroids, which is ironic, I suppose, since the whole point of "House" the TeeVee show was that House the character was entirely drug-addled. For day-to-day research, not so much. Collaboration with real humans is more important.

Finally, IBM is co-opting its own history. Thomas Watson, Sr. invented IBM's signature meme, "THINK", while at NCR. Externally, that was supposed to tell potential clients that IBMers used their heads for something besides a hat rack. Internally, the purpose was to motivate IBMers to devise ever more clever and lucrative ways to separate clients from their money. Now the meme is "outthink". We'll see. Maybe Watson will succeed in being a very nice suit of the Emperor's New Clothes?
