Dr. Codd Was Right: October 2010

21 October 2010

Soft Core Porn [UPDATED]

You are the Apple of my eye, although I'm no fan of Stevie Wonderful. There's been miles of type devoted to Steve's recent announcement that the next version of Mac portables will run only flash storage (which may or may not be packaged as a SSD). Here's the Journal version.

Not that I'm much of a fan of the Journal, of course, but the story carries this wonderful quote:
"The market is moving from hard drives to flash much faster than it was expected six months ago," he says.
He, in this case, is Philippe Spruch, chief executive of LaCie.

Follow that Yellow Brick Road.

[UPDATE]:

Here's a Forbe's article. The answer is as I expected:

Not surprising, however, is the fact the SSD isn't really an SSD per se--it is simply NAND Flash on the primary circuit board paired with a "SSD controller." This is one of the points I made last year in a post addressing Stec Inc. (STEC). Due to the fact that it is much more economical and takes less space to "roll your own" SSD (put Flash chips on a board with a controller), there will eventually be a very limited market for SSDs that duplicate the form factor of a HDD - PC companies and even enterprise storage companies will simply buy SSD controllers and NAND Flash chips. The point here is the controller will be the differentiating chip and that takes us to Marvell Technology (MRVL).

18 October 2010

sTuple, Part the First

I've had an abiding affinity to ERP/MRP database applications for rather a long time. They're the prototypical application which will benefit from BCNF datastores, and thus SSD infrastructure. Too bad, too; they've mostly been around since the 1990's (and earlier, if you count System 38/AS 400/iSeries/whatever-it's-called-now) and their owners aren't much interested in rebuilding. But there are possible exceptions, a few open source ERP applications exist. One that's stateside is xTuple; and, although not Open Source as the Community prefers to use the term, there is a semblance of a codebase/database from which an application can be implemented. The source is available here.

xTuple is Open Core/dual license as MySql, so I won't be addressing the full ERP horizon, just the OS version. Some regular readers may remember a post or two when the iPad was released, dealing with the notion of tablet based applications making new headway. Tablet computers have been around in the ERP/distribution world for decades, but as rather expensive specialized devices. The iPad (and what is asserted to be a soon-to-be flood of similars) opens up application development based on picking-not-typing to a much wider world. How many of these application developers actually embrace the freedom (some may think of it as a strait-jacket) of picking to re-factor their databases, rather than attempting to just stuff existing screens into a smaller form-factor remains to be seen. (Aside: when Windows took hold, the term Screen Scraping was born, thus ushering in myriad DOS applications, nicely pixilated and creating the adage "Lipstick on a Pig", foisted on an ignorant public.) As it happens, an xTuple affiliate, Paladin Logic (gotta love that name), has modified xTuple for the iPad. Since I don't have an iPad, or any need to run an ERP application, I haven't bought iTuple; thus I don't know what tack was taken.

In any case, Paladin Logic has demonstrated that some degree of morphing is possible; although not likely by a lone developer. xTuple's sorta-kinda rules don't permit modifications to the base database, without those changes being approved by xTuple for inclusion in the base product. As a result, my interest in xTuple hasn't been reciprocated, since my interest in signing on as a consultant would have been to work on transforming to a BCNF form on SSD. xTuple runs on PostgreSQL, which while a much more worthy open source database than any other (certainly more so than MySql), it doesn't have the knobs and switches present in a true strength industrial database. My preference is DB2.

So, in the spirit of xTuple and iTuple, I begin a journey to sTuple: xTuple on SSD. I'm not going, by any stretch of the imagination, to attempt anything more than a one-off POC. But what the heck, you're getting it for free.

There are a number of reasons to prefer DB2 (the server version, most often termed DB2/LUW; the LUW stands for what you think it does) for pure client/server OLTP applications, which is what xTuple is. For eCommerce/net types of applications, a case can be made for MVCC semantics databases being superior. Oracle and PostgreSQL are the two main proponents of MVCC. I'll note that SQL Server and DB2 have made stabs at imitation. In the SQL Server case, Microsoft has implemented Snapshot Isolation, which is MVCC-lite; while IBM has added some "compatibility mode" syntax munging but still on top of its locker engine.

So, what I've been considering is whether it's feasible to implement xTuple on SSD with DB2. From a cost point of view, for the clientele targeted by xTuple, it's likely a wash. IBM makes available a fully functional, up to date, DB2/LUW. It lacks some of the more arcane stuff, like LBAC, but is otherwise complete. There is a two core/2 gig resource limit; again, for the SMB world, and a BCNF (which is to say, minimized) schema, that shouldn't be a problem. Support is available for the same ballpark cost as PostgreSQL consultants' support.

The last bit is the most iffy. xTuple, and the main reason I got interested, implements the "business logic" in the database. Because this is PostgreSQL, for various reasons, this means lots o stored procedures, and some triggers. Well. One of the reasons given by coders for ignoring 90% of the facilities of a RDBMS (treating it as a file system store) is that it is impractical to port from one database to another, especially where triggers and functions and stored procs are used. Not so fast, buckaroo.

Some years ago, whilst toiling in a Fortune 100 (well, an out of the way minor group) one of those coders asked me about moving from DB2 to Oracle; was there any help for that? Turns out there was, and is. The company/product is called SwisSQL. It's been around since the mid 90's, and its purpose in life is to provide a translation application. It translates schemas and database code. It's not open source, no surprise there, but it does offer a limited time evaluation download. Since that's 30 days, I'll hold off on getting it until I'm ready to go. According to the site, it doesn't offer a PostgreSQL to DB2 code translation, but does for Oracle to DB2. We'll see just how close to Oracle PostgreSQL has gotten; it said to be quite nearly identical. The biggest issue, from that page, is outer join syntax, but that should no longer be an issue since Oracle has supported ANSI syntax since (from memory) v10, thus current SwisSQL should recognize the syntax.

I plan on looking for one, or perhaps, two cases of gross denormalization, and refactoring to DB2 and testing. For that, I'll skip the schema/data translation offered by SwisSQL, which does include PostgreSQL to DB2. From there it's getting the "business logic" ported. That can take two forms. One is to just translate the triggers and procs, the other is to implement the logic with DRI. The whole point of SSD for RDBMS, so far as I'm concerned, is to replace as much code as can be done with data. That means tables, perhaps a slew of auxiliary tables. Joe Celko wrote this up in "Thinking in Sets", so I'll not make a big deal of it here. Well, other than to say that using a One True Lookup Table (mayhaps more than One) implementation isn't out of the question using DB2 on SSD.

While I'm at it, in this preamble, a word or two on why DB2 is better for this sort of application. There is but one decent dead trees DB2 book, Understanding DB2 ...", which has a chapter devoted to each of the major structural components of any database engine: the storage model, the process model, and the memory model. There's a lot of meat in those pages. The on-line doc site doesn't replicate those chapters, alas. Here's the storage section. The short answer is that DB2 allows one to associate a (set of) table with a page size, a tablespace (set of tables/indexes), and buffer memory. So, one can have a group of auxiliary tables supporting data constraints which are pinned to memory. Very useful, that. PostgreSQL, and most other databases, use globally allocated buffering, which leads to buffer flushing when you least want it. PostgreSQL supports only one page size per installation; it's set when you compile PostgreSQL.

This can all work, as the iTuple implementation shows, because the database store is agnostic to its clients. As it happens, stock xTuple client code is a Qt implementation, which is to say C++; not my favorite or oft used language. In theory, at least, xTuple clients could be java or PHP or COBOL (OK, the last was a joke); any language with a client binding to PostgreSQL to call the procs. For these purposes, I'll just be using Aqua to access the data initially. If I get motivated to re-factor an entire client request/screen (don't know yet), then the existing client won't know the difference. That assumes that this can be isolated enough, since the rest of the code will still be looking to a PostgreSQL database. No promises on that part.

Off on the Yellow Brick Road.

07 October 2010

The Sands of Time, Much Less of It

Bloody Hell. Sandforce have announced the specs for their next generation controller(s), the SF-2xxx series. And, our friends at AnandTech have published a short "review" of the spec.

This is getting bloody ridiculous. The SF-2xxx specs out better than the STEC Zeus, by a long way. The Zeus has been considered the creme-de-la-creme of enterprise flash SSD. The SandForce controller, if it works as described, puts STEC in a bind, and the enterprise SSD buyer in the driver's seat.

And then there's this, from a Reg article:
"Barry Whyte, an SVC (SAN Volume Controller) performance expert and master inventor at IBM, thinks the previously standard 15,000rpm 3.5-inch disk drives could vanish from enterprise array's performance tier in 18 months."

Come on folks, BCNF databases are yearning to be free, free I say. We must do this.

06 October 2010

I Hear You Wanna be A RAP Star

The intent of this endeavor has always been to demonstrate that the relational model, when implemented in a BCNF database, is not only the most efficient storage model, but also the most robust. One of the side effects of BCNF databases is that they are ideal for code generation. After all, catalogs are simply plain text (and with none of the noise level of xml) and thus sources for text munging.

Today another code generation framework came by. While it doesn't make any BCNF claims (that I can find, at least) nor does it promote SSD as appropriate to its implementation, both fit precisely.

Here's the web site and over here is a White Paper describing the use of the system. Since it's not only SQL Server but also implemented as a Visual Studio project, and I'm on linux only, I've no way to test it. I've gone through the White Paper, and it does look interesting. I will note that the silliness of naming all tables with a prepended "TB" is a tic we should all have outgrown by now.

There isn't any mention of parsing of either check constraints or stored procedures in the generation of the UI. The check constraint bit shouldn't be any more difficult than the foreign key business, although the stored procedure wouldn't be a walk in the park. With a BCNF catalog, one could define lookup tables as real tables as foreign key targets. An SSD stored database would make that same, same. For that matter, using a database such as DB2, where one can define separate tablespaces/bufferpools, such (I'll assume small) lookups won't be heavily modified and can be assigned to a dedicated bufferpool of sufficient size that none of the rows will be evicted. Done, go home.

05 October 2010

Jingle Bells

Santa's early this year. Well, the notice of his sleigh contents anyway. Here's the next X-25M spec, courtesy of AnandTech. The magic words: 25nm process, and a "power safe write cache", and 600Gig. Woo hoo.

Lead, follow, or get the hell out of the way. The BCNF database is comin' through.

Dr. Codd Was Right

Make America White Again - The Gang of Six, 29 April 2026

About

Shameless Plug

Extended Pieces

Good Stuff

Followers

Blog Archive