26 September 2009

We've Seen This Movie Before

Many of us have seen this movie before, at least those of us of a certain age (or who had instructors who are). It's a cross between "The Return of Frankenstein" and "Groundhog Day". The theme has arisen with some frequency in the last few weeks, on Artima in particular.

The movie is scripted: it's all well and good for you to talk about SSD changing the rules, but we've still got to write our applications for the normal HDD environment; not all clients will have gotten the SSD revelation. I knew this sounded familiar, but it took me a while to put my finger on it.

In the 1960s, and more so the 1970s, the magnetic disk subsystem (IBM terminology then) began to proliferate. But it didn't instantly replace the 9-track tape drive. In the IBM world, COBOL was the language of choice (along with some Assembler and that mutant PL/I), and had been in use for a decade by the time the 370 took over. The I/O calls had been written for sequential access (the three-tape sort/merge update was the paradigm) from time immemorial.

The result was that COBOL continued to be written to a sequential access method, even though random access was the whole point of the disk drive. Files on disk were imaged as if they were on tape. The reason was simply convenience for COBOL maintenance coders. Even new applications tended to do the same; inertia is a powerful force.
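For anyone who never lived through it, here is a minimal sketch of the difference, in Python rather than COBOL. The record layout, the function names, and the in-memory "disk file" are all hypothetical, made up for illustration; nothing here comes from a real application of the period. It only shows that a file of fixed-length records can be read tape-style or disk-style, and what each costs in reads.

```python
import io
import struct

# Hypothetical fixed-length record: 4-byte account id, 8-byte balance in cents.
RECORD = struct.Struct("<Iq")


def build_file(n):
    """Write n records, ids 0..n-1, into an in-memory stand-in for a disk file."""
    buf = io.BytesIO()
    for i in range(n):
        buf.write(RECORD.pack(i, i * 100))
    return buf


def read_sequential(f, wanted_id):
    """Tape-style access: start at the front and read until the record turns up."""
    f.seek(0)
    reads = 0
    while True:
        chunk = f.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return None, reads  # ran off the end of the "tape"
        reads += 1
        rec_id, balance = RECORD.unpack(chunk)
        if rec_id == wanted_id:
            return balance, reads


def read_direct(f, wanted_id):
    """Disk-style access: compute the byte offset and seek straight to it."""
    f.seek(wanted_id * RECORD.size)
    chunk = f.read(RECORD.size)
    if len(chunk) < RECORD.size:
        return None, 1  # offset is past the end of the file
    _, balance = RECORD.unpack(chunk)
    return balance, 1
```

Both functions return the same balance; the second element of the result is the number of records read to get there. Treating the disk as tape means paying for every record in front of the one you want.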

Hardware paradigm shifts are often captive to software inertia. SSD is not the only one now. The multi-core/processor machine presents problems to coders, and of greater magnitude than SSD does. Here's Spolsky's recent rumination. The money quote:

Sure, there's nothing officially wrong with trying to write multithreaded code in C++ on Windows using COM. But it's prone to disastrous bugs, the kind of bugs that only happen under very specific timing scenarios, because our brains are not, honestly, good enough to write this kind of code.

Whether multi-core/processor machines will ever be useful to application programs, by which I mean guys writing the 1,023,484th General Ledger for Retail Sales, remains up in the air. The guys writing operating systems and database engines will have far fewer issues; they've been in the multi-threaded world for decades. We should let them go ahead and do their thing. Let the database engine do all that heavy lifting, and leave us mere mortals to decide which widget to use and what the schema should look like.

On the other hand, making the transition to SSD store and BCNF schemas will require slapping down hidebound application coders who wish to remain in the far past. I see a future where applications which have limped along, structurally unchanged since the 70's, are finally replaced with small, highly normalized databases. It will be just too cheap not to. A system based on a few dozen SSDs will replace those geriatric pigs with thousands or more HDDs. The ongoing cost difference (TCO, as they say) will easily be greater than the amortization of build costs.

Those geriatric pigs which were built somewhat more recently, and built around stored procedures rather than application SQL, will have a better chance of survival. All these codebases will need is the schema refactored and the stored procs updated. The client application code wouldn't change; the proc still returns the same bloated data, alas.
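A rough sketch of why the client never notices, using Python's sqlite3 module. SQLite has no stored procedures, so two plain Python functions stand in for the "before" and "after" versions of a proc; the table names, column names, and sample rows are all invented for illustration. The point is only that the flat table and the normalized pair of tables hand back identical rows.

```python
import sqlite3

# Invented sample data: (store name, store city, sale date, amount).
ROWS = [
    ("Downtown", "Hartford", "2009-09-01", 1200),
    ("Mall", "Stamford", "2009-09-01", 800),
    ("Downtown", "Hartford", "2009-09-02", 950),
]


def build_flat(conn):
    """The old schema: one wide table, store details repeated on every sale."""
    conn.execute(
        "CREATE TABLE ledger_flat "
        "(store_name TEXT, store_city TEXT, sale_date TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO ledger_flat VALUES (?, ?, ?, ?)", ROWS)


def build_normalized(conn):
    """The refactored schema: each store stored once, sales reference it by key."""
    conn.execute(
        "CREATE TABLE store (id INTEGER PRIMARY KEY, name TEXT UNIQUE, city TEXT)"
    )
    conn.execute(
        "CREATE TABLE sale "
        "(store_id INTEGER REFERENCES store(id), sale_date TEXT, amount INTEGER)"
    )
    for name, city, sale_date, amount in ROWS:
        conn.execute(
            "INSERT OR IGNORE INTO store (name, city) VALUES (?, ?)", (name, city)
        )
        store_id = conn.execute(
            "SELECT id FROM store WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO sale VALUES (?, ?, ?)", (store_id, sale_date, amount)
        )


def sales_report_flat(conn):
    """Stand-in for the original proc: reads the wide table directly."""
    return conn.execute(
        "SELECT store_name, store_city, sale_date, amount "
        "FROM ledger_flat ORDER BY sale_date, store_name"
    ).fetchall()


def sales_report_normalized(conn):
    """Stand-in for the updated proc: joins the tables back into the same rows."""
    return conn.execute(
        "SELECT s.name, s.city, x.sale_date, x.amount "
        "FROM sale x JOIN store s ON s.id = x.store_id "
        "ORDER BY x.sale_date, s.name"
    ).fetchall()
```

Build each schema in its own in-memory connection, call the matching report function, and the two result lists compare equal. The client code that consumes them has nothing to change.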

If you go to the concession stand, I'd like a large popcorn and Dr. Pepper. Thanks.
