26 November 2013

Life and Death on the Nile

One might ask, "why is it that CMS would opt for a twelve-year-old, still-private, still VC-slurping outfit to 'run' healthcare.gov?" Why would such a tactic make sense? I mean, they could go belly-up at any moment, although having insinuated themselves into CMS might get them to Too Critical To Fail status. Well, the answer appears to be inertia.

More than a couple of decades ago I worked for Optimed Medical Systems (it no longer exists separately, and last I knew had been inhaled by a competitor after stops elsewhere). This was a Progress shop, and was known for the Cadillac pre-qualification software of the time: VT-X00s connected to *nix, with the Progress "database" run C/S-in-a-box style. We made fitful attempts at a sort of AI module. Prolog would have been the proper language, but that's another episode. HL7 was then relatively recent, and beginning to be widely adopted.

What I found odd, and never reconciled, was the medical community's love affair with hierarchy. The embodiment of this affair is HL7. The Wiki article is reasonably comprehensive. Here's a cautionary tale. IBM flowchart support diagrams (an example on page 29 looks just like a pyramid!), dating from the 1950s, are perhaps the earliest existing example of hierarchy blessed by The Smartest Guys in the Room. Dr. Codd faced significant opposition, and IBM paid him back by appointing an IMS cowboy (Chamberlin) to devise the query language.

As with any hierarchy-biased datastore, finding *relations* among all that data is hard. And it's relations that we care about. R folks, and quants generally, take a probabilistic view of data relations. X% of people buy diapers and whiskey together, so let's use an animated baby in our ad for Uncle George's Fine Moonshine. And get Uncle George stocked in grocery stores, and diapers in liquor dispensaries. RDBMS folks view relations as deterministic: order lines *will have* an associated order. The pyramid folks get all bent out of shape with this kind of assertion, blithely ignoring that their hierarchical datastore forces all data into *but one* deterministic structure. Some people...
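For the code-minded, here's a rough sketch of the two views side by side, using Python's bundled sqlite3. Everything in it is made up for illustration: the `orders` and `order_lines` tables, the sample rows, and the helper names `build_db`, `orphan_line_rejected` and `share_with_both` are my assumptions, not anybody's production schema. The foreign key is the deterministic part (an order line without an order is simply refused); the join is the quant's question, asked across orders rather than down one pyramid.

```python
import sqlite3


def build_db():
    """Create an in-memory database with hypothetical orders and order lines."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when asked to, per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY
        );
        CREATE TABLE order_lines (
            line_id  INTEGER PRIMARY KEY,
            order_id INTEGER NOT NULL REFERENCES orders(order_id),
            product  TEXT NOT NULL
        );
    """)
    # Illustrative sample data: four orders, two of which hold both products.
    baskets = {
        1: ["diapers", "whiskey"],
        2: ["diapers"],
        3: ["whiskey", "bread"],
        4: ["diapers", "whiskey"],
    }
    with conn:
        for order_id, products in baskets.items():
            conn.execute("INSERT INTO orders (order_id) VALUES (?)", (order_id,))
            conn.executemany(
                "INSERT INTO order_lines (order_id, product) VALUES (?, ?)",
                [(order_id, p) for p in products],
            )
    return conn


def orphan_line_rejected(conn):
    """Deterministic view: a line pointing at a nonexistent order must fail."""
    try:
        with conn:  # rolls back automatically if the insert raises
            conn.execute(
                "INSERT INTO order_lines (order_id, product) VALUES (?, ?)",
                (999, "moonshine"),
            )
    except sqlite3.IntegrityError:
        return True
    return False


def share_with_both(conn, product_a, product_b):
    """Probabilistic view: fraction of all orders containing both products."""
    (both,) = conn.execute(
        """
        SELECT COUNT(DISTINCT a.order_id)
        FROM order_lines AS a
        JOIN order_lines AS b ON a.order_id = b.order_id
        WHERE a.product = ? AND b.product = ?
        """,
        (product_a, product_b),
    ).fetchone()
    (total,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
    return both / total


if __name__ == "__main__":
    db = build_db()
    print("orphan line rejected:", orphan_line_rejected(db))
    print("share with diapers and whiskey:", share_with_both(db, "diapers", "whiskey"))
```

Note that the same two flat tables answer both questions. In a store where each order is its own little tree, the second one generally means visiting every tree.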

From a diagnostician's point of view, the IBM 1401 flowchart (or its railroad equivalent) establishes a reasonable model for decision making: is the patient breathing? And so forth. As a datastore "model", not so much. The medical community had a serviceable hammer, and made all else into nails.

So, in a world of folks who insist that all data be pyramids, forcing MarkLogic on the developers isn't all that surprising. As usual, policy beats the crap out of data when there's a disagreement.
