12 November 2014

Chasing Unicorns

Josh Berkus, Postgres factotum, has a new post up asking whether there should be a web/PG benchmark app, and what it might look like if there were. I noted that web performance in the normal HTTP world doesn't have much to do with RDBMS-specific performance, anyway. He wasn't impressed. Not that I mind. The notion did get me ruminating, though.

This overview is from 2010, but confirms my experience: smart code fiddling dumb data remains the paradigm. Gad.
"Most of the social networks also use Sun's MySQL database management system to organize their users' messages and status updates."

They, obviously, would rather keep writing code to keep track of things because, well, they get paid to write code. So, write code they do. There is no try, there is only code or no code.

PG has long been the "Open Source database that isn't MySQL and sort of relational", so why would the PG inner sanctum want to toss that all away by jiggering PG into just another SQL database which is good at storing flat files? Josh mentions that such a benchmark would target social networking. Again, this is the ken of those who are certain that flat-file storage is the bee's knees, at which MySQL 3.0 was very good; current releases can be run that way, too. I strongly suspect they are.

Here's a chronicle (or, maybe, a superbly written indictment) of a social network app that started life on MongoDB. I find it interesting. The horse punches out Mongo, this time. Maybe the (social networking/startup/real) world is actually relational, not flat or hierarchical, after all (Date is right, in other words)? Why chase them? Not that I've been harping on it, much.
"You can also see why this is dangerous. Updating a user's data means walking through all the activity streams that they appear in to change the data in all those different places. This is very error-prone, and often leads to inconsistent data and mysterious errors, particularly when dealing with deletions."
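For the kiddies who need to see it typed out, here is a rough sketch of what that walk looks like. The document shape (streams, items, an embedded actor copy) and the rename_user helper are my own made-up illustration, not the app's actual data model.

```python
# Hypothetical denormalized documents: every activity stream embeds
# its own copy of the user who acted. Names and shapes are assumptions.
streams = [
    {"owner": "alice", "items": [
        {"actor": {"id": 1, "name": "Bob"}, "verb": "liked"},
        {"actor": {"id": 2, "name": "Carol"}, "verb": "posted"},
    ]},
    {"owner": "dave", "items": [
        {"actor": {"id": 1, "name": "Bob"}, "verb": "commented"},
    ]},
]


def rename_user(streams, user_id, new_name):
    """Walk every stream and patch each embedded copy of the user.

    Returns the number of copies touched. Miss one stream (or crash
    halfway through) and the data is inconsistent.
    """
    touched = 0
    for stream in streams:
        for item in stream["items"]:
            if item["actor"]["id"] == user_id:
                item["actor"]["name"] = new_name
                touched += 1
    return touched


touched = rename_user(streams, 1, "Robert")
```

One logical fact, two physical copies, and the count only grows with every stream the user shows up in.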

Kiddies spend too much time learning multiple languages, and not enough time reading Date and Weikum & Vossen.
"I learned something from that experience: MongoDB's ideal use case is even narrower than our television data. The only thing it's good at is storing arbitrary pieces of JSON."

Just for context, the author made a reference to Romulans to some of the kiddies she was dealing with, and got nuthin'. If they don't even know about recent history... Kiddies keep making the mistakes our grand pappies made; inventing square wheels. Gad.
"Schema flexibility sounds like a great idea, but the only time it's actually useful is when the structure of your data has no value."

She's nicer about it than I am: the flexibility of XML, and its -ish cousins, is just pretty pink bows on chaos, aka illusory. The coders have to work around it, since a hierarchy is rigid (while adding a layer, from a typist's point of view, is simple...), and changing it requires walking all the code which deals with the data from the point of change on down (at least). They don't mind, as it gives them more code to type. They like to type code. Real flexibility comes from Organic Normal Form™ schemas; since the data is stored in an orthogonal manner, changes to the schema are transparent to any existing data and code that have no interest in the change. With the cheapness of white box *nix machines and prosumer SSDs (Intel has just upped the ante, by a tad), joins are basically free. High 5™s all around.
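A minimal sketch of the point, using Python's sqlite3 as a stand-in for a real database (PG would do the same job, only better). The users/activity tables and the feed query are hypothetical, mine and not anyone's production schema. The user's name lives in exactly one row, the feed is a join, and a later schema change is invisible to a query that never asked about it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE activity (
        id INTEGER PRIMARY KEY,
        stream_owner INTEGER NOT NULL REFERENCES users(id),
        actor INTEGER NOT NULL REFERENCES users(id),
        verb TEXT NOT NULL
    );
    INSERT INTO users VALUES (1, 'Bob'), (2, 'Carol'), (3, 'Dave');
    INSERT INTO activity (stream_owner, actor, verb) VALUES
        (2, 1, 'liked'), (3, 1, 'commented'), (3, 2, 'posted');
""")

# The feed is a join; nothing about the actor is copied into activity.
FEED = """
    SELECT u.name, a.verb
    FROM activity a JOIN users u ON u.id = a.actor
    WHERE a.stream_owner = ?
    ORDER BY a.id
"""

before = conn.execute(FEED, (3,)).fetchall()

# Renaming the user is a one-row update, however many streams he is in.
conn.execute("UPDATE users SET name = 'Robert' WHERE id = 1")

# A schema change the feed query knows nothing about, and need not.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

after = conn.execute(FEED, (3,)).fetchall()
conn.close()
```

The same FEED text runs before and after both changes; no code had to be walked, which is rather the point.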

Why chase unicorns?
