Dr. Codd Was Right: Frenemies

Lisa Murkowski, Swamp Critter

The world is not linear.
-- Dr. McElhone/1974

Power tends to corrupt; absolute power corrupts absolutely.
-- Lord Acton/1887

Officials who use their public positions for private gain threaten the integrity of our most important institutions. Greed makes governments — at every level — less responsive, less efficient and less trustworthy from the perspective of the communities they serve.
-- Justice Ketanji Brown Jackson/2024 [the MAGAnauts get even more aggressive when their perfidy is exposed]

I think we are on the verge of losing vaccines for this country, from this country. And the reason is that Robert F. Kennedy Jr. will hold up a paper, in the next four or five months, that says it's aluminum in vaccines that are causing a whole swath of problems, including autism. I think he is about to destroy vaccines in this country. I do.
-- Dr. Paul Offit/2025 [may the MAGA and MAHA be with you]

There's not a single example of things working out for the appeaser.
-- Nicolle Wallace/2024 [like this? the next extortion is on the way]

I have had to explain and re-explain and re-explain and re-explain, you know, how relational databases work, what is an eigenvector, what is dimensionality reduction.
-- Christopher Wylie/2018

... but Flash-based storage has such a different performance profile from rotating media, that I suspect that it will end up having a large impact on filesystem design. Right now, most filesystems tend to be designed with the latencies of rotating media in mind.
-- Linus Torvalds/2007

I believe quite strongly that, if you think about the issue at the appropriate level of abstraction, you're inexorably led to the position that databases must be relational.
-- Chris Date/2009

This Week's thought

D.O.J. has the full power of the federal government behind it. And under the guise of election integrity, they could end up using their unique tools to introduce new vulnerabilities to the system.
-- Dax Goldstein/2025 [the Office of Data Integrity will leverage all of D.O.J. to steal every election; Paramount just caved to extortion]

See you next week in a brand new show (© Heckle and Jeckle)

Therefore:

In a time of SSD, multi-core/processor, two terabyte memory and Optane App Direct Mode (RIP) machines, there is no reason not to build from BCNF data. Time to do what Dr. Codd demonstrated. Technology has finally caught up with the maths.

03 January 2013

Frenemies

There's that old saw: "the enemy of my enemy is my friend". I figured that this was due to Shakespeare, but the wiki says no, the adage originated either in Arabia or China. Makes sense: both cultures were way ahead of England by the time Shakespeare came around.

Each day I get an update from sqlservercentral. They're one part of the Goliath organization which published the Triage piece, so although I'm not currently doing much with SQL Server, it just seemed right. Today's feed included this link: "All Flash". Yhawza!! My little heart goes pit-a-pat. Then I scan the first paragraph, and go off to the link, and these are NoSql kiddies!! Arrgh!

The piece starts off reasonably, making the case that the cost of the short-stroked HDD arrays needed to match the IOPS of a Samsung 840 is orders of magnitude greater than the cost of the 840 itself. A couple of problems with that, though. The 840 (not the 840 Pro) is a TLC, read-(almost)-only part. AnandTech tore it up, and then again. While not an egregious part, it isn't by any stretch a server part. Consumer for sure; prosumer not so much.
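To put a shape on the "orders of magnitude" claim, here is a minimal sketch of the arithmetic. Every input is a hypothetical placeholder (the function name, the IOPS figures, and the prices are mine, not the article's and not measurements of any real drive); plug in real quotes before drawing conclusions.

```python
import math

def hdd_array_vs_ssd(target_iops, iops_per_hdd, cost_per_hdd, cost_per_ssd):
    """Return (drives needed, array cost, array cost / SSD cost).

    Assumes IOPS scale linearly with spindle count, which is optimistic
    for the HDD array.
    """
    drives = math.ceil(target_iops / iops_per_hdd)
    array_cost = drives * cost_per_hdd
    return drives, array_cost, array_cost / cost_per_ssd

# Hypothetical placeholder inputs, NOT measured values.
drives, array_cost, ratio = hdd_array_vs_ssd(
    target_iops=50_000,   # assumed SSD random IOPS
    iops_per_hdd=250,     # assumed short-stroked HDD random IOPS
    cost_per_hdd=200.0,   # assumed price per HDD
    cost_per_ssd=200.0,   # assumed price per SSD
)
print(drives, array_cost, ratio)
```

The ratio is driven almost entirely by the IOPS gap, so the conclusion is only as good as the IOPS figure assumed for the SSD under a server workload.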

The piece does get contradictory, however:

"Flash is 10x more expensive than rotational disk. However, you'll make up the few thousand dollars you're spending simply by saving the cost of the meetings to discuss the schema optimizations you'll need to try to keep your database together. Flash goes so fast that you'll spend less time agonizing about optimizations."

This is the classic mistake: assuming that flat-file access is scalable. Of course, it isn't with consumer flash drives, and that's why the NoSql crowd find themselves in niche applications. The advantage of the relational model (RM), and the subsequent synergy with SSD, is that the RM defines the minimal data footprint. Since random I/O is already the norm on multi-user servers, and SSD handles random reads far better than rotating media does, normalization carries no added I/O penalty.
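A toy illustration of the footprint point, as a sketch only: the rows, the column layout, and the helper names below are invented for the example, and the byte count ignores delimiters, indexes, and any engine overhead.

```python
# Flat-file style: customer details repeated on every order row.
# All data here is made up for illustration.
orders = [
    # (order_id, customer_name, customer_city, item)
    ("1", "Acme Widgets Inc", "Springfield", "bolt"),
    ("2", "Acme Widgets Inc", "Springfield", "nut"),
    ("3", "Acme Widgets Inc", "Springfield", "washer"),
    ("4", "Bolt Brothers LLC", "Shelbyville", "bolt"),
]

def size(rows):
    """Total bytes of all fields, ignoring delimiters and storage overhead."""
    return sum(len(field.encode("utf-8")) for row in rows for field in row)

def normalize(rows):
    """Split the flat rows into a customer table and an order table."""
    customers = {}   # (name, city) -> surrogate customer id
    order_rows = []
    for order_id, name, city, item in rows:
        cust_id = customers.setdefault((name, city), str(len(customers) + 1))
        order_rows.append((order_id, cust_id, item))
    customer_rows = [(cid, name, city) for (name, city), cid in customers.items()]
    return customer_rows, order_rows

customer_rows, order_rows = normalize(orders)
flat = size(orders)
normalized = size(customer_rows) + size(order_rows)
print(f"flat: {flat} bytes, normalized: {normalized} bytes")
```

The gap widens as the number of orders per customer grows, since each customer's details are stored once rather than once per order.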

"When a flash drive fails, you can still read the data."

I don't know where the author gets this from. Each SSD controller has its own method of controlling data writing on the NAND, unlike HDD, which follow standards and largely use Marvell parts (if CMU doesn't bankrupt them), so data recovery is iffy. Most SSD failures to date have been in the controller's firmware, not NAND giving up the ghost, and they frequently lead to bricked parts. So, no, you can't remotely depend on simple recovery of data from SSD. While I've not seen definitive proof, SSD failure should be more predictable, which by itself is an advantage. One simply swaps out a drive at some percentage of the write limit. Modulo those firmware burps, you should be good to go until then.
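The swap-out rule is simple enough to sketch. The function names and the 80% default threshold are assumptions for illustration; how you obtain bytes written (typically from the drive's SMART counters) and the rated endurance varies by vendor and isn't shown here.

```python
def wear_fraction(bytes_written, rated_endurance_bytes):
    """Fraction of the rated write limit already consumed."""
    if rated_endurance_bytes <= 0:
        raise ValueError("rated endurance must be positive")
    return bytes_written / rated_endurance_bytes

def should_replace(bytes_written, rated_endurance_bytes, threshold=0.8):
    """True once the drive has used `threshold` of its rated writes.

    The 0.8 default is an arbitrary illustrative choice, not a vendor
    recommendation.
    """
    return wear_fraction(bytes_written, rated_endurance_bytes) >= threshold

# Hypothetical drive: rated for 100 units of writes, 90 already used.
print(should_replace(90, 100))   # past the threshold
print(should_replace(50, 100))   # still within it
```

This only covers NAND wear; it does nothing about the firmware failures described above.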

"Importantly, new flash technology is available every year with higher durability, such as this year's Intel S3700 which claims each drive can be rewritten 10 times a day for 5 years before failure."

Well, sort of. The S3700's consistency isn't due to NAND durability, but to controller magic. It is well known that as geometry has shrunk, the inherent durability of NAND has dropped. And that will continue. As I've mused before, we will reach a point where the cost of the gymnastics controllers need to compensate for falling program/erase (P/E) cycles in NAND will exceed the cost savings of smaller geometries. This is particularly true of the Samsung 840, with which the article begins.
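For scale, the quoted claim (10 full rewrites a day for 5 years) works out as below. The 400 GB capacity is a hypothetical input chosen for the example, and the 365-day year ignores leap days.

```python
def total_drive_writes(dwpd, years, days_per_year=365):
    """Number of full-drive rewrites implied by a drive-writes-per-day rating."""
    return dwpd * years * days_per_year

def rated_endurance_bytes(capacity_bytes, dwpd, years):
    """Total bytes the rating allows over the stated period."""
    return capacity_bytes * total_drive_writes(dwpd, years)

writes = total_drive_writes(dwpd=10, years=5)
print(writes)  # 18250 full rewrites

capacity = 400 * 10**9  # hypothetical 400 GB drive
print(rated_endurance_bytes(capacity, dwpd=10, years=5) / 10**15, "PB")
```

That figure is a drive-level rating. It says nothing about how many P/E cycles the underlying NAND cells survive, which is the point above.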

"Over time, flash device firmware will improve, and small block writes will become more efficient and correct..."

It's going the other way, alas. As geometries shrink, page size and erase block size have increased, not decreased. DRAM caching on the drive is the common way to reduce write amplification, i.e. to support writes smaller than the page size. Firmware can only work around increasing page/erase block size.
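A deliberately simplified model of why bigger pages hurt small writes: assume each write must program whole pages, with no DRAM coalescing and no garbage collection. Real controllers do better than this worst case, and the page sizes below are illustrative inputs, not specs for any particular part.

```python
import math

def worst_case_write_amplification(write_bytes, page_bytes):
    """Bytes programmed to NAND divided by bytes the host asked to write.

    Simplified model: every host write is rounded up to whole pages.
    Ignores caching, garbage collection, and erase-block effects.
    """
    pages = math.ceil(write_bytes / page_bytes)
    return pages * page_bytes / write_bytes

host_write = 4096  # a 4 KiB host write
for page in (4096, 8192, 16384):
    print(page, worst_case_write_amplification(host_write, page))
```

Under this model the same 4 KiB write costs twice as much NAND programming on an 8 KiB page and four times as much on a 16 KiB page, which is the direction the hardware is heading.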

What the author misses, of course, is that organic normal-form (NF) relational databases implement minimum byte footprint storage, and get you a TPM (transaction processing monitor), lots of DRI (declarative referential integrity), and client agnosticism in the process. So, on the whole, it's a half-right article.
