29 September 2011

Pretty as a Picture

Along with an interest in stats and graphs comes a level of responsibility. Kind of like "guns don't kill people, people kill people." The canonical text is "How to Lie With Statistics", which was first published in 1954. Legend has it that it's never been out of print. Likely so.

It so happens that I've found a couple of blogs/sites, both of which deal with graphing stat data in non-disinterested ways. I'll note once again that a stat/quant/analyst/foobar is supposed to be disinterested. S/he's just an impartial judge of the data, trying to scope out the real relationships in the data, if there are any; there may not be. Data associated with politics is particularly susceptible to bias, but others face the same pressure. Worker stat bees (having been there) are often encouraged to slant the presentation to make the nappie marketing Suits look like geniuses. It's a problem everywhere: all worker bees are expected to behave as attorneys, staunch defenders of whatever the Suits have done.

Watching the response to drug clinical trials is particularly amusing. Rather often, the sponsor will be shocked (shocked, I say) that its new FooBar Resolver didn't blast the p < .05 requirement out of the water. There'll be "unexpected placebo levels" or "unbalanced randomization" or "the FooBar Resolver patients were sicker than placebo". And so on.

Be that as it may, here are a couple of sites worth grazing:
The R Graph Gallery, from Romain François
The Gallery of Data Visualization, from Michael Friendly

27 September 2011

Figures Don't Lie, But Liars Figure

I just found this link, which says it all (well, most all) about lying, stats, and graphs. It's only a bit beyond 5 minutes. Time well spent.

25 September 2011

It Ain't The Meat

Back in the 70's a married lady (but not to me) of my acquaintance had a preternatural affinity for the Maria Muldaur song, "It Ain't The Meat It's The Motion". Nothing to do with me, I'll warrant. It was the 70's, of course, and it means what you think it does. Still true today, but the context relevant to this endeavor is a bit different. Welllll, maybe a whole lot different.

One of the neat aspects of R is the ability to talk to most any other application, and vice-versa. R is, justly, known for its support for graphical display of statistical data. googleVis is an R package which links R to the Google Visualization API, empowering "moving" data in an HTML page. I've not played with it yet, but here's a sample from a blogger who has. Yet another case where the R community builds spectacularly useful widgets for the rest of us to exploit. Who said open source is anti-American communism? For the record, at least little Darl.

A few years back, I was involved with Business Objects, building dashboards. But using BO requires building a shadow schema of the RDBMS it talks to, and it runs as its own application; generally a pain in the butt. With R, and PL/R with Postgres, one can drive the data and statistical analysis applications through the database. With googleVis, one can create animated graphs in the browser. Very cool. And his talk was on my birthday. Damn, I missed it.

The advantage of moving graphics is that this is a way to display higher dimension data; using bubble charts and animation, we get four dimensions: the usual x and y axes, plus the bubble size and the motion axis (classically, time).
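To make the four-dimension point concrete, here's a minimal sketch of the data layout an animated bubble chart needs. It's in plain Python rather than R, purely to show the shape of the data; it is not googleVis code, and the records, the field names, and the build_frames helper are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical records: (label, year, x, y, size). All values are invented.
records = [
    ("A", 2009, 1.0, 10.0, 5),
    ("B", 2009, 2.0, 8.0, 9),
    ("A", 2010, 1.5, 12.0, 6),
    ("B", 2010, 2.5, 7.0, 12),
]


def build_frames(rows):
    """Group rows by the time value; each frame holds that period's bubbles."""
    frames = defaultdict(list)
    for label, year, x, y, size in rows:
        # x and y place the bubble, size scales it, year picks the frame.
        frames[year].append({"label": label, "x": x, "y": y, "size": size})
    # Order the frames by time so playback runs forward.
    return dict(sorted(frames.items()))


frames = build_frames(records)
for year, bubbles in frames.items():
    print(year, bubbles)
```

Each frame is an ordinary three-dimension bubble chart (x, y, size); stepping through the frames in order supplies the fourth.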

There are other plotting packages, beyond the base plot() functions, but I'd be willing to say that googleVis is the least difficult of the bunch. It does mean that it's for browser applications.

19 September 2011

Newest Meme: NoClient Database

What's that line from "Network"? "I'm as mad as hell, and I'm not going to take this anymore." Such is my view of NoSQL nonsense. I'm not quite as mad at client coders who want to rule over the database, but close.

So, it was heartening to read Dunstan's latest post, in which he describes the end result of banning the client from the system. Save for rendering pretty, pretty screens, I gather.

Good on him.