28 December 2010

Scotty's Wisdom

Message boards can actually be useful to the exercise of figuring out where an industry is going.  STEC is the principal publicly traded Enterprise SSD vendor, so it is the public bellwether with respect to "Enterprise SSD".  They've been segueing from an SLC dominant to an MLC dominant product mix, which ends up being a topic of discussion, especially recently.  A thread is running now about the qualification of an MLC version of STEC's gold standard "Enterprise SSD" (ZeusIOPS).  I was moved to contribute the following:

"I canna change the laws of physics"

That will be true in the 23rd century and is true now.  The number of erase cycles of MLC is fixed by the NAND tech, controller IP can only work around it, usually by over-provisioning (no matter what a controller vendor says).  Whether STEC's controller IP is smarter (enough, aka, at the right price) is not a given.  As controllers get more convoluted, to handle the decreasing erase cycles (what?  you didn't know that the cycle count is going down?  well, it is, as the result of feature size reduction), SLC will end up being viable.  Cheaper controllers, amortized SLC fabs. 
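For the arithmetic minded, here is a back-of-the-envelope sketch of why over-provisioning is the usual workaround.  Every number in it is made up for illustration (the cycle counts, the write amplification, the daily write load), and the function name is mine, not any vendor's.  It also holds write amplification fixed, which flatters nobody: in real drives more spare area tends to lower it.

```python
def lifetime_years(user_gb, over_provision, pe_cycles, write_amp, host_gb_per_day):
    """Rough years until the rated erase cycles are used up.

    user_gb         -- capacity the host sees
    over_provision  -- spare flash as a fraction of user capacity (0.28 = 28%)
    pe_cycles       -- rated program/erase cycles per cell (hypothetical figure)
    write_amp       -- NAND bytes written per host byte written (assumed constant)
    host_gb_per_day -- host write load
    """
    raw_gb = user_gb * (1.0 + over_provision)    # flash actually on the board
    nand_budget_gb = raw_gb * pe_cycles          # total writes the flash can absorb
    host_budget_gb = nand_budget_gb / write_amp  # what the host gets out of that
    return host_budget_gb / host_gb_per_day / 365.0


if __name__ == "__main__":
    # Hypothetical parts: same drive, shrinking cycle count, then more spare area.
    cases = [
        ("5000 cycles, 28% spare", 0.28, 5000),
        ("3000 cycles, 28% spare", 0.28, 3000),
        ("3000 cycles, 100% spare", 1.00, 3000),
    ]
    for label, spare, cycles in cases:
        years = lifetime_years(100, spare, cycles, 2.0, 200)
        print(f"{label}: {years:.2f} years")
```

The controller can't add cycles; all it can do is spread the same wear over more cells, and the extra cells cost money.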

If STEC (or any vendor) can guarantee X years before failure, then the OEMs will just make that the replacement cycle.  It would be interesting to see (I've not) the failure distribution functions of HDD and SSD (both SLC/MLC and STEC/Others).  Failure isn't the issue, all devices fail.  What matters is the predictability of failure.  The best to have is a step function:  you know that you have until X (hours, writes, bytes, whatever), so you replace at X - delta, and factor that into the TCO equation.

I think the failure function (in particular, whether and to what extent it differs from HDD) of SSD does matter, a lot.  Consumer/prosumer HDD still show an infant mortality spike.  Since they're cheap, and commonly RAIDed, shredding a dead one and slotting in a replacement isn't a big deal.  Not so much for SSD, given the cost.

I found this paper, but I'm not a member.  If any reader is, let us know.  The précis does have the magic words, though:  Gamma and Weibull, so I gather the authors at least know the fundamentals of math stat analysis.  If only there were an equivalent for SSD.  It's generally assumed that SSDs are less failure prone, since they aren't mechanical; but they are, at the micro level.  Unlike an HDD, which writes by flipping the flux capacitor (!!), the SSD write process involves physical changes in the NAND structure; which is why they "wear out".  Duh.  So, knowing the failure function of SSD (and knowing the FF for NAND is likely sufficient, to an approximation) will make the decision between HDD and SSD more rational.  If it turns out that the FF for SSD moves the TCO below equivalent HDD storage (taking into account short stroking and the like to reach equivalent throughput), SSD as primary store becomes a value proposition with legs.  Why the SSD and storage vendors aren't pumping out White Papers is a puzzlement.  Maybe their claims are a tad grandiose?
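Since Weibull got a mention, a small sketch of why the shape of the failure function matters more than the failure itself.  The parameters are invented, not fitted to any HDD or SSD data, and the function names are mine.  A Weibull shape below 1 gives the infant mortality spike; a large shape approaches the step function, where "replace at X - delta" wastes very little of the device's life.

```python
import math


def weibull_survival(t, shape, scale):
    """Probability a device is still alive at time t."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate at time t (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def replace_at(shape, scale, target_survival):
    """Time at which survival has dropped to target_survival.

    Obtained by solving exp(-(t/scale)**shape) = target_survival for t.
    """
    return scale * (-math.log(target_survival)) ** (1.0 / shape)


if __name__ == "__main__":
    scale_years = 5.0  # hypothetical characteristic life
    for label, shape in [("infant mortality", 0.7), ("near step function", 20.0)]:
        t = replace_at(shape, scale_years, 0.99)
        print(f"{label}: pull the device at {t:.3f} years to keep 99% survival")
```

With the step-like shape you get to use most of the characteristic life before pulling the part; with the infant mortality shape a 99% guarantee is gone almost immediately, which is why those devices get RAIDed and shredded instead of scheduled.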

The ultimate win will happen when MRAM (or similar) reaches mainstream.  Very Cool.

23 December 2010

Mr. Fusion Powered Database

I keep track of my various database interests with a gaggle of sites and blogs.  For PostgreSQL, I've signed up for the Performance sublist.  It's mostly about fixing things in parts of the engine in response to questions like:  "my query is slower in PG than it is in SQL Server; how come?", and such.

A thread started today that's of interest to this endeavor.  It started out with one person wondering why a Fusion-io drive runs so fast, but PG doesn't run any faster.  Then another chimed in to say he was setting up PG with Fusion-io drives.  Looks to be an interesting discussion.  Now, if only my employer would let me buy some of those Fusion-io drives!  BCNF to the rescue.

Here's the list.

The thread is titled:  concurrent IO in Postgres?

22 December 2010

Django Played Jazz

The PostgreSQL site has been linking to this blog a bit recently, and he's refreshing.  I'm going to spend some time looking into it.  It could be there's some intelligent life out there after all.

Here's the start of today's entry, if you didn't slide off immediately:

Don't retrieve a whole row just to get the primary key you had anyway. Don't iterate in the app; let the database server do the iteration for you.

And he signs off with this (my heart went pit-a-pat):

but far better is to make the database do all the work

It is shocking how many coders still insist on their for loops in code.  I mean, Dr. Codd made that obsolete, in the sense of providing an abstract declarative data model, in 1969/70 (the year depends on whether you were inside or outside IBM then).  In a few years, Ingres and Oracle were live.  I've concluded that MySQL, PHP, Java, and the web generally are what motivated the regression to COBOL/VSAM paradigms (that is, data is just a lump of bytes which can only be accessed through bespoke source code).  One didn't need to know much, and frankly most webbies didn't and don't, about data to build some snappy web site that just saves and moves gossip.  I suppose that's OK for most of the juvenilia that passes for web stuff, but not for the grown ups.
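To make the complaint concrete, here is a minimal sketch of the two styles, using Python's bundled sqlite3 so it runs anywhere.  The tables, names, and figures are made up for the purpose; this is not from the blog being quoted.  The first function is the coder's beloved loop, one query per customer; the second hands the iteration to the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount INTEGER NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 10), (2, 1, 15), (3, 2, 7);
""")


def totals_by_loop(conn):
    """Iterate in the app: fetch customers, then query orders once per customer."""
    totals = {}
    customers = conn.execute("SELECT id, name FROM customer").fetchall()
    for cust_id, name in customers:
        total = 0
        rows = conn.execute(
            "SELECT amount FROM orders WHERE customer_id = ?", (cust_id,)
        )
        for (amount,) in rows:
            total += amount
        totals[name] = total
    return totals


def totals_by_sql(conn):
    """Let the database iterate: one declarative statement, one round trip."""
    rows = conn.execute("""
        SELECT c.name, SUM(o.amount)
        FROM customer c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.name
    """)
    return dict(rows.fetchall())


if __name__ == "__main__":
    print(totals_by_loop(conn))
    print(totals_by_sql(conn))
```

Same answer both ways; the difference is who does the work, and how many trips to the server it takes when there are two million customers instead of two.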

16 December 2010

An Ounce of Prevention

Andy Lester is a Perl coder, and I loathe Perl, so there has to be a good reason for me to mention him.  And that reason is a posting of his, linked from Artima, which contains the following:

This person was one of those programmers who tried for the premature optimization of saving some typing. He forgot that typing is the least of our concerns when programming. He forgot that programmer thinking time costs many orders of magnitude more than programmer typing time, and that the time spent debugging can dwarf the amount of time spent creating code.

Now, when I was young and impressionable, the notion that a developer is paid to think, and not to type, was widely accepted.  But I've certainly noted that in recent years, java perhaps the culprit, lots o typing is now the metric.  LOC rules, even if most are worthless.  Moreover, development by debugging is also normative.  Ick.

What might this have to do with the point of this endeavor, you may be asking?  Simply that declarative data is so much lazier than typing, and that a BCNF schema is easy to modify (since there aren't covariances to worry about).  It does require some forethought, what Spolsky calls BDUF (you should look it up, if it's foreign), but that forethought isn't carved in stone, rather a strategic battle plan which permits many tactics.  The "Agile" meme appears to have eaten its children, in that its zealots really, really do believe that all projects can be built from daily hacks by masses of coders.  Ick; double Ick. 

15 December 2010

Pundit for a Day

I've been reading Cringely for decades, and especially, along with most who read him it turns out, his annual predictions.  Since leaving his PBS gig, he hasn't been doing them.  Sniff.  But today he announced that he would do another, and invited his readers to contribute same.  Well.  Not one to turn my nose up at the possibility of 15 seconds of fame (he allowed that any reader predictions would be printed with attribution, which he sort of has to do) I offered up what follows.

Just one, sort of.

I've been banging a drum for SSD for a number of years, at least since Intel released their flash version (in true Enterprise, Texas Memory has been shipping DRAM parts for decades, but that's another story).

When STEC, Violin, et al started to build "Enterprise" flash SSD those few years ago, the notion they promoted was that SSD would replace HDD, byte for byte.  That didn't happen, largely IMO because the storage vendors (SSD makers and storage OEMs) couldn't develop a value story.

There always was a story:  the Truly Relational RDBMS (not the flatfile dumps common in Fortune X00 companies which moved their COBOL/VSAM apps to some database) is (so far) the only thing which exercises the real strength of the SSD:  random IOPS.  But to get that benefit, you have to have a BCNF (or better) database, and join the shit out of it.  The COBOL/VSAM and java apps devs don't think that way; they love their bespoke written loops.

So, what we've got now is SSD as front end cache to HDD arrays.  And SSD as game machine and laptop speed up.  Enterprise hasn't yet bought SSD as primary storage.  Hmmm.

In 2011, we will see that.  My guess is Oracle will be the lead.  It works this way.  Larry wanted Sun, not for java or MySql, but the hardware stack.  What Larry needs is that last group of holdouts:  IBM mainframe apps.  To do that, he needs a credible alternative to the z machine ecosystem. 

He has that now, but it ain't COBOL.  He needs a value story to get those COBOL/VSAM apps.  Whether you buy that Oracle is the best RDBMS or not, Larry can make the case, particularly since his competitors (save IBM) have adopted the MVCC semantic of Oracle. 

Pushing highly normalized databases, with most/all of the business logic in the database (DRI and triggers and SP) running on SSD makes for a compelling story.  But you've got to spend some quality time building the story and a POC to go with it.  Larry's going to do it; he hasn't any choice.  And it makes sense, anyway.

Remember, a RDBMS running on SSD is just an RCH from running an in-memory database.  You don't need, or want, lots of intermediate caching between the user screen and the persistent store.  Larry's got the gonads to do it.

Robert Young

09 December 2010

Hand Over Your Cash, or I'll Shoot

The recent events led me to consider, yet again, the SSD landscape.  The point of this endeavor is to promote the use of SSD as sole repository/persistence for BCNF databases.  The reasons have been written to a great extent.

Since beginning this endeavor, there has been a clear shift in storage vendors' marketing of SSD; whether this shift was proactive or reactive, I do not know.  These days, there is much talk of tiering and SSD as cache, less talk of SSD as whole replacement of HDD.  Zsolt, over at StorageSearch, still promotes the wholesale replacement angle, but he seems to be in the minority.  I stopped over to copy the URL, and there's an interesting piece from 7 December worth reading, the column headed "MLC inside financial servers new interview with Fusion-io's CEO" (the way the site works, the piece will likely be hard to find in a couple of weeks, so don't tarry).

So, I reveried into the middle of a thought experiment:  what difference, if any, does it make whether an SSD is used as sole/primary store or as cache?  Well, I concluded that as cache, dirt cheap SSDs are just as good as Rolls-Royce SSDs (i.e., STEC, Violin, Fusion-io, and the like) from one angle, for the simple reason that the data on the SSD is really short-term.  From another angle, those cache SSDs had better be really high quality, for the simple reason that the data is churned like a penny stock boiler room, and SSDs need robust design to survive such a pounding.

The OCZ contract announcement leans toward the first answer; $500 doesn't buy much (if any) STEC SSD.  With error detection and hot swapping in the array, just pull them as they die and toss 'em in the shredder.  I'd unload any STEC shares real soon now.  There'll still be full-blown SSD storage for my kind of databases, but the American Expresses are more likely to go the front-end caching route (they've no stomach for refactoring that 1970's data), and for that implementation, commodity (this soon???) SSD is sufficient.

Like a Rolling Stone

One Hit (To The Body), STEC's and Compellent's that is, appears to have happened today.  And, I'll 'fess up, I never saw it coming.  OCZ, which had looked like a mix of prosumer/consumer SSD builder, is now shipping Enterprise parts.  Who knew???  And the stated price is $300-$500.  Either some OEM is willing to take a really big chance, or the cost of Enterprise SSD just went over a cliff.

Here's hoping for the latter.  Do you get it?  SSD for only 2 to 3 times the cost up front!!  Not the 10 times (or more) it has been.  And if the buyer is EMC????  STEC's corporate sphincter just got puckered.

The devices are the Deneva line, which they only announced back in October.  They run SandForce's SF-2000 controllers, and will be shipping by February!!  "Watson, the game's afoot."

As I was composing this missive, came word of the Compellent smash, and it's not a technical problem.  Compellent is an early adopter of STEC SSD.  Months ago, you may remember, a rival, 3PAR, was the hockey puck in a takeover game.  Compellent's shares rose, a lot, in sympathy, and kept going.  Today's story has it that the company is going to Dell, but for substantially less than the share price bid up by all those plungers.  Irrational exuberance strikes again.

08 December 2010

Simple Simon Met a Pie Man

Another case of an interesting thread and an interesting post (mine, of course).  And, once again, it's from Simple Talk; on this thread.

Since you mentioned it, I'll beat the drum, yet again, for the necessary paradigm shift (in many places, anyway) which small keyboardless (in practice, even when one exists) devices demand.

- Mostly, it was seeing that the best existing tablet and smartphone apps do simple, intuitive things, using simple intuitive interfaces to solve single problems.

As I've said here and elsewhere for some time, by (re-)factoring databases to high normal form (narrow tables, specifically), one gains a number of advantages.

1) such schemas are inherently candidates for UI generation, due to DRI
2) they're inherently robust, due to DRI
3) they're likely most of the way to being "pickable", which is what tablets do
4) given the ability to host high normal form databases on SSD, then building them to such a UI is feasible
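A minimal sketch of points 1 and 3, assuming SQLite and a made-up warehouse schema (the table names and the pick_lists helper are mine, for illustration only).  The declared foreign keys are read back from the catalog, and each one yields the list of rows a tablet user would pick from instead of typing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE bin (
        id INTEGER PRIMARY KEY,
        warehouse_id INTEGER NOT NULL REFERENCES warehouse(id),
        label TEXT NOT NULL
    );
    INSERT INTO warehouse VALUES (1, 'East'), (2, 'West');
""")


def pick_lists(conn, table):
    """Map each foreign-key column of `table` to the parent rows a user may pick.

    `table` is interpolated into the SQL text, so it must come from trusted
    code, never from user input.
    """
    lists = {}
    # PRAGMA foreign_key_list rows are:
    # (id, seq, parent_table, from_column, to_column, on_update, on_delete, match)
    for fk in conn.execute(f"PRAGMA foreign_key_list({table})").fetchall():
        parent_table, child_column = fk[2], fk[3]
        rows = conn.execute(f"SELECT * FROM {parent_table} ORDER BY 1").fetchall()
        lists[child_column] = rows
    return lists


if __name__ == "__main__":
    # Each key is a column on the entry screen; each value is its pick list.
    print(pick_lists(conn, "bin"))
```

Nothing in the helper knows about bins or warehouses; the DRI in the schema carries all of it, which is the point.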

Tablets have a long history, in fact; the iPad is nothing new, except for its venue.  Those doing VAR systems that work in warehouses have been writing to RF tablets for a couple of decades, and designing to high normal form (or, as often, working around its absence).

07 December 2010

Pink Floyd

For those following that other story, what's Larry up to with Sun, I've been in the it's-about-taking-down-the-mainframe camp from the first nanosecond.  It's been kind of a small camp, amongst the Usual Pundit Suspects.  But today comes a bit of news along that line of thinking.

It's becoming clearer that Larry wants a database machine that can slurp up all those renegade IBM mainframe folks.  He knows he's got to get them off COBOL, somehow too, but first he needs a credible stack.  Another brick in the wall.

04 December 2010

Tin Man

I've met the Tin Man.  Whilst looking for some MVCC/Locker debate I happened onto a Sybase Evangelist blog.  Kind of like what I do, but he gets paid for it.  Sigh.  Maybe soon.

Anyway, this post is his paean to SSD, and this:
How big is your database?? **light bulb** Those same 10 SSD's get you a whopping 300-600GB of storage. You could just put the whole shooting match on SSD and forget the IO problems. Rep Server stable queue speed issues - vaporized.

Be warned, he makes the leap from Amazon sourced SSD to enterprise database storage (he doesn't mention STEC or Violin, for instance, but appears to be aware of earlier Texas Memory DRAM units); not really going to happen, so his arithmetic is off by a decimal point.  But otherwise, he and I are on the same page, especially with skipping the "cache with SSD" silliness, and just storing to SSD.  Schweeet.  And he knows from schmere.

Now, hold Dorothy's hand.

02 December 2010

Thanks for the Memory

AnandTech has an article about 25nm flash today.  Well worth the read.  I'm not sure how this affects this endeavor.  On the one hand, the physics of smaller feature size makes flash less worthy of enterprise storage.  On the other, the increased density supports greater over-provisioning to solve, maybe, the problems.  A classic engineering problem.

I stopped by Unity Semiconductor to see if there's any news on shipping of their "new" device.  They include this article.  If it works, SSD has some calm seas and wind at its back.