31 October 2009

Dahling, You Have to Read This

Shooting fish in a barrel, and criticizing xml, are congruent. The difference is that shooting fish is a one-step process, while dealing with the zits of xml is a lifelong ordeal. Thus, I've only made a few random comments. Building a new coherent jeremiad simply didn't feel worth the effort.

Then I ran across this. I think he gives xml too much credit, in that "documents" intended to be consumed into my beloved BCNF database really don't need the structure (metadata information). A csv file will do. Pascal wrote about that years ago. He was right then, and he's still right.
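To make the csv point concrete, here is a minimal sketch, assuming a feed of bare comma-separated values and a schema that already lives in the database. The `customer` table, its columns, and the sample rows are hypothetical, made up for illustration; SQLite stands in for whatever engine you actually run.

```python
import csv
import io
import sqlite3

# Hypothetical feed: bare values only, no embedded metadata.
feed = io.StringIO("1,Acme,NY\n2,Globex,CA\n")

conn = sqlite3.connect(":memory:")

# The structure (names, types, key) is declared once, in the schema,
# rather than being repeated around every value in the feed.
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " state TEXT NOT NULL)"
)

# Each csv row maps positionally onto the declared columns.
conn.executemany(
    "INSERT INTO customer (id, name, state) VALUES (?, ?, ?)",
    ((int(cid), name, state) for cid, name, state in csv.reader(feed)),
)
conn.commit()

rows = conn.execute("SELECT id, name, state FROM customer ORDER BY id").fetchall()
```

The database's catalog carries the metadata; the file only has to carry the data.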

What I find mind boggling is this quote:

It's pretty popular these days to kick XML, all the cool kids are doing it and they don't seem to discriminate between its 'good' purposes and 'bad' uses.


I guess I'm not lucky enough to live in his world, 'cause where I've been, xml-ness is still au courant. XML Spy, ACORD, and the like remain in control. I need a new life!!

29 October 2009

Double, Double Toil and Trouble

There's been a hurricane of consternation on the Yahoo! (and, I strongly expect, other) message boards for STEC, our resident poster child of high performance SSD. The share has cratered from its high of ~$40 to ~$20 as of yesterday. Today it's up a bit. Message boards have some interest, since a small fraction of the stream is intelligent. No, I don't hold any STEC, nor do I care whether they get rich. I do want them, or a company doing what they do, to continue.

One piece of intelligence is a reference to this blog, at IBM. It doesn't navigate very well, and I'm sure not going to regurgitate it here, except to say that IBM has, if this fellow speaks for the company, backed off somewhat from the PCIe (Fusion-io) approach and reverted to straight SSD; STEC's, according to what he has written. The posting of interest talks about why PCIe went away, and SAS drives came back.

OK, one quote, from the 18 Sep entry:

Another interesting side note can be seen when you add the areal density of silicon, which to this day has tracked almost scarily to Moores Law. If for example, the GMR [giant magnetoresistive] head had not been invented by Stuart Parkin, then we'd probably have had mainstream solid state drives in the mid 90's. If nothing else comes along to push spinning rust back to the heady days of 65-100% CAGR[compound annual growth rate], then by 2015 solid state density will overtake magnetic density - in a bits per square inch term.


Now, this is important. The transition from HDD to SSD has changed, based on public pronouncements from the vendors, since January of this very year. Up until then, the notion was that HDD arrays would be replaced by (smaller unit count) SSD (mirrored?) arrays. The higher cost of SSD would be mitigated by the lower unit count resulting from not requiring striping (and possibly, mirroring) units in the array. (Aside: and by refactoring databases to BCNF, thus jettisoning at least an order of magnitude of those bytes.) Now, the notion being promoted is "disk caching", with some small number of SSD fronting the existing HDD array. I'm still not convinced this makes much sense, but there you are. The major impact of this approach is that it simply doesn't deal with data bloat, which is how one gets maximum benefit from SSD; it settles for "good enough" improvement.

If we are headed to a crossover in data density (but not necessarily cost/bit), then preparing for, and building now, pure SSD systems isn't implausible. While I don't relish the "disk caching" approach as the end game, if it serves to jam a size 12 brogan in the door, I'll take it.
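The crossover in the quote is just compound-growth arithmetic. A minimal sketch, assuming each technology's areal density grows at a constant CAGR; `crossover_years` is a hypothetical helper name, and the inputs below are placeholders for illustration, not figures from the IBM posting.

```python
import math


def crossover_years(d_lagging, d_leading, cagr_lagging, cagr_leading):
    """Years until the lagging density catches the leading one.

    Solves d_lagging * (1 + cagr_lagging)**t == d_leading * (1 + cagr_leading)**t
    for t. Returns 0.0 if the "lagging" side is already ahead, and None if it
    does not grow faster than the leader (so it never catches up).
    """
    if d_lagging >= d_leading:
        return 0.0
    if cagr_lagging <= cagr_leading:
        return None
    gap = math.log(d_leading / d_lagging)
    closing_rate = math.log((1 + cagr_lagging) / (1 + cagr_leading))
    return gap / closing_rate


# Placeholder inputs (assumptions): solid state starts at a quarter of the
# magnetic density, and grows at 60% a year against 30% for spinning rust.
years = crossover_years(d_lagging=1.0, d_leading=4.0,
                        cagr_lagging=0.60, cagr_leading=0.30)
print(f"crossover in about {years:.1f} years")
```

Plug in real density and growth numbers and the same function says whether 2015 is plausible; the answer is very sensitive to the gap between the two growth rates.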

27 October 2009

Swimming the Amazon Upstream

News from the jungle. Amazon is now offering an explicit relational database cloud service, RDS. It is MySQL 5.1, so calling it "relational" is a bit of a stretch, but it is some good news.

I had the temerity to suggest that they offer SSD storage as an explicit option, so that real database wonks can build real BCNF applications. Got back a personalized auto reply, which said that "An AWS representative will reach out as soon as possible to address your questions about cloud computing and Amazon Web Services." I am on tenterhooks.

23 October 2009

Follow the Yellow Brick Road

An update on where I think this SSD/multi-machine/BCNF database journey is going. The independent storage vendors are adopting a "tiered storage" paradigm, with SSD/HDD machines and embedded controllers aimed at "enterprise" clients; large servers (quasi mainframes) and mainframes. My conclusion is that their clients don't want to refactor anything, just keep that 30 or 40 year old code and make it run a bit faster. Such machines adopt a disk caching approach; I'm not convinced that there is much point to having live data on multiple varieties of persistent storage. (Aside: in the *nix world of databases, it is understood that the database machine should have *nix file buffering turned off so that the database can do its thing, faster. The same applies to persistent storage. They'll all find out, eventually.) So, who's going to adopt SSD/BCNF machines outright?

Well, back in the late 1980's, the VAR (value added reseller) emerged as a conduit for *nix database companies. The archetypes were Progress, Uniface, and Informix (PowerBuilder in the M$ world). They provided a "relational" database and a 4GL to manage it. In the case of Informix, it was/is a real database with proper SQL access. The others stressed their own 4GL over SQL. They aimed at developers with business expertise to build replacement applications for hoary minicomputer (mostly; some mainframe) file based ones. Since these folks weren't burdened with legacy "stuff", they had a reasonably clean sheet.

This venue is where I see the entrepreneurial force to accept the "new" idea of declarative data, minimal code. The VARs I've dealt with (or worked for) always supplied the machine, too. That was a source of profit, but it also met a strategic need: support of clients is *so* much easier if you already know more about the full system than the clients. Shipping out the full system with a couple of STEC drives (or Fusion-io, or...) is not a client decision. That, in fact, has always been true. I still keep getting calls to work on COBOL code that's decades old, when it is clear that such organizations are desperately trying to keep from sinking in the tar pits (like the metaphor?), without having to do any real thinking or engineering. The other area is a gawd awful lot of SQL Server. I don't get that at all. Not all change embodies progress, but all progress requires change. You can't make an omelet without breaking some eggs. Doing the same thing over and over again, and expecting a different result, is a definition of insanity. The first step in fixing a problem is to admit that the problem exists.

Now, I just have to find one. And, no, I don't have in mind any application that I just have to create; I care about the tech, not the application.

16 October 2009

Shift in Focus?

In a few short months, there has been a shift in focus with regard to Enterprise SSD. I don't yet have a conclusion whether this is a good thing or bad thing, but I suspect the latter from the point of view of this endeavor, the BCNF relational database. What has emerged is the "disk cache" notion, with a few SSD (STEC or similar) fronting the HDD array. The Sun/Oracle FlashFire is the most recent example.

The controllers (SSD and HDD) work in concert to move data from the HDD to the SSD, just in time. The downside, from my point of view, is that this bandaid measure is a sop to those existing bloated file based (read: xml) applications. From the desire for immediate gratification, I suppose this is OK. But doing so squanders the big value of SSD. The vendors would be just as well off, if not better off, using static RAM caches.

We'll just have to wait and see how SSD plays out. If the "disk cache" approach becomes the norm, then the enterprise SSD (and its effect on the relational database) shrinks into niche status. That would break my heart, tender young thing that it is.

12 October 2009

Can Somebody Please Make Up His Mind?

A double dip day.

This has gotten out of hand. First, I read that the Sun/Oracle FlashFire gizmo is a superDuper SSD. And I pass that on, with my Value Added commentary and glee. Then Larry says, "No, it isn't." It's just a bunch of NAND as cache. And I fess up to an error and slink back to my lair.

So, today The Register runs a story with the details... wait for it. FlashFire is a superDuper something like an SSD; although not called that exactly. But it is populated with STEC parts. My head is spinning, lights flash, getting sooooo dark.

Whatever. The salient fact is that SSD and databases are taking over the world, just as I've been predicting. I want to be First Chancellor of Normal Form. No database can leave the crib without a proper bris; all extraneous fluff removed.

Not A Cloud Was in The Sky

I've been saying all along that The Cloud is a Crock. Well, here's the latest in the saga. You should go and read the story; I won't cut-n-paste it here. I will gloat, however. Imagine what's going to happen when the BigMegaCorp leaves its data in the hands of Microsoft. Same thing.

Well, maybe one quote:

Microsoft said in an emailed statement that the recovery process has been "incredibly complex" because it suffered a confluence of errors from a server failure that hurt its main and backup databases supporting Sidekick users.


Going to The Cloud is a dereliction of duty, pure and simple. Losing one's phone numbers is one small step for ineptitude, but it should not be one great fall of mankind. Your data is yours. Do not give it away to gain a penny here or a penny there.

04 October 2009

Da, Comrade, All You Need is Black

I was just over at the Yahoo! STEC message board, attempting to bring some sense to those folks. A heavy burden, but someone has to do it. There was a thread from August, which got restarted, which tried to find justification for SSD, STEC's in particular, in cloud computing.

I disabused the poster, pointing out that cloud is all about scads of plain vanilla resource; disc in this case. A cloud provider may not even tell the client what resource types are being provided, only CPU seconds, gigabytes of disc, and the like.

Then it hit me, the light went on, the Red Sea parted. The cloud is the implementation of the Soviet Model: "What, you want a suit that's not black? You don't need blue. Black will do". It's such delicious irony; the titans of Capitalism implementing Soviet-era Communism. I'll sleep better tonight, secure in the belief that American corporations are content to be told what to do by their vendors. Ah, sweet justice. Can the pogroms be far behind?