13 May 2012

Meet Forrest Gump

How often is one left speechless? At a loss for words? Well, today's NY Times has a story that did so. Back? OK, what would you think of a physician who was surprised that antibiotics or surgery works? Well, that's the level displayed by this knucklehead, described as an actuary and math professor. And I quote: "We started with simple calculations and moved on to more involved ones. To me, the results were astounding: statistical sampling worked." Yeah, right! First-semester sociology undergrads taking their watered-down baby stat course, maybe. Further evidence that quants in financial services may be more the problem than the solution.

So, then we read this: "As far as I knew, no one had proposed this model." There's not enough detail to know what that model is, but the fact is that sampling from relational database engines has been around for a long time. It's not as straightforward as pulling balls from an urn, of course. Rows may, or may not, be stored in some key order. They may be stored in insertion order. If stored on SSD, in particular, they'll be scattered thither and yon on the silicon, which actually makes the process more efficient. Moreover, on what attribute(s) should randomness be enforced?
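For the skeptical, here's a minimal sketch of a simple random sample pulled straight from an engine, using SQLite only because it ships with Python. The `policies` table, its columns, and the `simple_random_sample` helper are all made up for illustration. `ORDER BY RANDOM()` gives every row a random sort key, so the draw doesn't care whether rows sit in key order or insertion order. The price is a full scan and sort, which is why bigger engines offer cheaper mechanisms.

```python
import sqlite3


def simple_random_sample(conn, table, n):
    """Return n rows drawn uniformly at random, whatever the storage order."""
    # The table name is interpolated directly: acceptable in a sketch with
    # trusted input, not something to do with user-supplied names.
    query = f"SELECT * FROM {table} ORDER BY RANDOM() LIMIT ?"
    return conn.execute(query, (n,)).fetchall()


# Hypothetical data: 1,000 policies inserted in key order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policies (id INTEGER PRIMARY KEY, face_amount REAL)")
conn.executemany(
    "INSERT INTO policies (id, face_amount) VALUES (?, ?)",
    [(i, 1000.0 * i) for i in range(1, 1001)],
)

sample = simple_random_sample(conn, "policies", 50)
print(len(sample))
```

As for which attribute(s) to randomize on: here it is the row itself. Anything fancier (by policy type, by issue year) is a stratification decision.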

To make matters worse, he states: "I believed I had a solution to this cumbersome and costly process: create subgroups from the database, sample policies from each, repeat the process several times, then combine the results." If that's not stratified random sampling, prior art up an elephant's butt, I'll eat my hat. And yeah, sampling has been done in RDBMSs for a very long time. Here's a SQL Server 2000 version. And here's a very long thread on sampling from 2005. Moreover, TABLESAMPLE has been part of the SQL standard for about a decade, although not implemented by all engines for that long. The notion that one can patent sampling will cause Snedecor to rotate in his grave. Please!
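To show how little there is to it, here's the textbook version of the quoted recipe, step by step: subgroups, a sample from each, several repeats, results combined. This is plain stratified random sampling as I understand it, not whatever is in the patent application, which the story doesn't detail. The function names, the 10% sampling fraction, and the `product`/`reserve` fields are my own inventions for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_estimate(rows, key, value, frac, rng):
    """Estimate the population total of value(row) from one stratified sample."""
    # Step 1: create subgroups (strata) from the data.
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(value(row))

    total = 0.0
    for values in strata.values():
        # Step 2: sample from each subgroup, without replacement.
        k = max(1, round(frac * len(values)))
        draw = rng.sample(values, k)
        # Scale the sample mean back up to the size of the stratum.
        total += mean(draw) * len(values)
    return total


def repeated_estimate(rows, key, value, frac=0.1, repeats=20, seed=0):
    """Steps 3 and 4: repeat the process several times, then combine by averaging."""
    rng = random.Random(seed)
    return mean(
        stratified_estimate(rows, key, value, frac, rng) for _ in range(repeats)
    )


# Hypothetical book of business: two products with different reserve levels.
data_rng = random.Random(42)
policies = (
    [{"product": "term", "reserve": data_rng.uniform(50, 150)} for _ in range(300)]
    + [{"product": "whole", "reserve": data_rng.uniform(400, 600)} for _ in range(200)]
)

estimate = repeated_estimate(policies, lambda p: p["product"], lambda p: p["reserve"])
actual = sum(p["reserve"] for p in policies)
print(f"estimated total {estimate:.0f}, actual total {actual:.0f}")
```

Stratifying on the attribute that drives the variation (here, product) is what keeps the estimate tight. That insight is in every sampling text ever printed.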

Whether there really is anything unique, and therefore patentable, here is impossible to say based on the text. While there remains a good deal of grey, algorithms aren't/shouldn't be patentable. One man's considered opinion. Given that Apple got patent protection for a rectangle (admittedly, in Germany), my guess is yes, and that's too bad. They may actually lose.

(My taste in allusive puns as titles may have stretched as thin as a spider's thread this time, so: Gump made the remark about a box of chocolates (I've never watched the film), and at that time in the USofA, if one was of the lower-middle class, then that box was most likely a "Whitman's Sampler". Mea Culpa.)

