15 March 2022

In The Mail

For better or worse, I keep getting come-ons for silly 'database' stuff. Got two in a row so far this week, and they both relate to 'relational' databases. Dr. Codd would be so depressed.

The first is from O'Reilly (the Tech site, not auto parts). It tries to make a case for a cross-SQL database stored proc language, although the author doesn't phrase it quite that way.
So database management systems and SQL, the language that is used to interact with them 99 percent of the time, were one of the very first applications of computers.
[it gets sillier from there]
That, as any reader here knows, is total bullshit; but kind of expected from the Kiddie Korp. To review: first there were sequential files, then there were indexed sequential files, then there were 'relative' files (all IBM jargon), then there was the quasi-relational database IDMS (the implementation of the network data model), then there was IMS (the IBM response with a purely hierarchical data model, damn you xml), then there was the Relational Model. That last was first implemented, in a commercial setting, not by IBM but by what came to be known as Oracle. It was nowhere near a complete or faithful implementation of the RM, and it didn't do all that much, but it was very different from hierarchy. In typical Rightwing fashion, the first release was named 'V2'. Not kidding.

He then spends the rest of the interview lobbying for 'portable' non-SQL syntax; as if that were feasible. SQL, while far from what Dr. Codd expected of a database language (see: most anything by Chris Date on the issue), has had, from the beginning, such a malleable spec that most anything any vendor called SQL qualified. The first 'standard' did exactly that: it grandfathered any product that asked. Moreover, so far as Codd was concerned, he didn't give a flying fuck how vendors implemented the RM, and said so. Not with the flying fuck syntax so far as I know. The point of the RM was that it is a user-level logical model of data, not a spec for engine construction. And, very importantly, the data language would work with any engine. That part hasn't worked out quite so well. Part of the problem is that most data management tasks are outside of the SQL committee purview, and are engine specific.

Over time, vendors saw demand for, and attempted to implement, additional data manipulation algorithms. The author gnaws on analytic functions, but there are bunches. Eventually, he makes mention of the fact that SQL database engines are proprietary entities; even the open source ones. And there remains the bifurcation between locking engines and MVCC engines, which affects how non-SQL procedures get implemented.
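For anyone who hasn't met an analytic function, here is a minimal sketch of one, run through Python's bundled sqlite3 module. The `sales` table and its numbers are made up for illustration, and it assumes the underlying SQLite is 3.25 or later, which is when window functions arrived there. The `SUM(...) OVER (...)` syntax is in the standard; what any given engine does under the covers is another matter.

```python
import sqlite3

# In-memory database; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 10), ("east", 2, 20), ("west", 1, 5), ("west", 2, 7)],
)

# A running total per region: each row sees the sum of its own
# partition up to and including its month.
running = conn.execute(
    """
    SELECT region, month,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales
    ORDER BY region, month
    """
).fetchall()

for row in running:
    print(row)
```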

Toward the end, he allows:
Well, open source code was an absolute revolution in software development, so the same thing could happen for SQL developers — it could be a catalyst.
Of course, there's a vast difference between an open source coding language, like C, and an open source SQL engine; of which there are essentially two, and so far as I can tell, they don't interoperate all that well. Adding user-level syntax is doable, if not very fast, by offering it to the ANSI committee and, maybe, in a few years, it is added to the standard. The notion of C-level analytic functions, as an example, that can be called by any SQL engine is absurd. On the other hand, the syntax of any desirable user-level analytic function, as an example, can be codified. It's just not a walk in the park.

The other piece of mail was a job advert, again, so it said, for a 'Data Scientist / Data Analyst'. It mentions SQL rather prominently:
SQL expertise (including complex joins, grouping, aggregation, nested subqueries, and cursors)
Nested subqueries and cursors (damn your eyes) are, more or less, tied to the engine in question. Which, naturally, the blurb doesn't mention. And, really, would any self-respecting relationalist cower before cursors?? C'mon, man!! The Hell of row by agonizing row?
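To make the row-by-agonizing-row complaint concrete, here is a small sketch, again with Python's sqlite3. Everything in it is an assumption for illustration: the `orders` table, the 10 percent discount rule, and the function names. SQLite has no server-side DECLARE CURSOR, so the loop below only imitates what a cursor-driven stored proc does: fetch a row, fiddle with it, write it back, repeat. The set-based version says the same thing in one statement and leaves the how to the engine.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory table of hypothetical orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER, discounted INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (id, amount) VALUES (?, ?)",
        [(1, 100), (2, 250), (3, 40)],
    )
    return conn


def row_by_agonizing_row(conn):
    """Cursor style: pull each row to the client, compute, issue one UPDATE per row."""
    rows = conn.execute("SELECT id, amount FROM orders").fetchall()
    for order_id, amount in rows:
        new_amount = amount * 90 // 100 if amount >= 100 else amount
        conn.execute(
            "UPDATE orders SET discounted = ? WHERE id = ?", (new_amount, order_id)
        )


def set_based(conn):
    """Relational style: one statement over the whole set, no loop."""
    conn.execute(
        """
        UPDATE orders
        SET discounted = CASE WHEN amount >= 100 THEN amount * 90 / 100
                              ELSE amount END
        """
    )


def result(conn):
    """Return (id, discounted) pairs in id order."""
    return conn.execute("SELECT id, discounted FROM orders ORDER BY id").fetchall()


looped = make_db()
row_by_agonizing_row(looped)

declared = make_db()
set_based(declared)

print(result(looped) == result(declared))
```

Same answer either way on three rows. The difference shows up when the table has a few million of them and every trip through the loop is a separate statement.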
