As my dive into stats continues, along with a possible departure from the RDBMS as the site at the end of the Yellow Brick Road, I came across a Ruby library called fechell. My initial thought: "Shouldn't that be fechall, as in Fetch All? Fetch Ell? What does that mean?" Well, D'oh! The normal way to write the name is FECHell. Ah, much more to the point.
I found two posts, by way of R-bloggers, by the person who developed the library. Here's the post where he works through using the data and the library. He references a Part 1 post with the background.
This intrigues me not a little bit. Suppose, just for grins, that you're the campaign manager for a statewide (or larger) candidate. That is, one where monies are allocated to distinct locations. Further, suppose that you have this data in close to real time, and you also have data measuring the "outcome" of spending these monies, say polling data. And let's say that the two maps, monies and outcomes, are congruent.
Could one make predictive decisions about allocating monies? Well, it depends. The naive answer is: abso-freakin-lutely!!!! The real answer: not so much. The naive notion is that money well spent is indicated by winning the election (which is kind of too late for allocation decisions) or some upward movement in polling data. Ah. Let's spend where the spending works. Superficially, that makes a lot of sense.
The only problem: stat studies invariably show little correlation between money and winning. I know, Liberals in particular are worried about the Citizens United effect, where corporations have gobs more loot than anybody else. They'll just buy the elections. And they well might. This would not make me smile. But the studies of the data show that the effectiveness of campaign ads is grounded less in their expense than in their content. Sometimes, maybe often, attack ads work.
Here's an academic attempt to find out.
And yet another.
A quote from the second story (not, so far as I can tell yet, taken from the study itself):
"While we see an influence of the campaign ad in the short-run, in the long run the ad loses its effectiveness. This finding begs the question: how cost effective is it for politicians to spend millions of dollars on campaign ads which have little long-term effect on voter opinion?"
StatMan to the rescue!!! The problem is that it's now August, 2011, and any application being written as I write (assuming that folks have started) needs to be up and running by January. In order to be worth the time and money expended, the application has to have *predictive* value. FECHell data passed through some software is only retrospective. Political ops should know enough about their candidates and opponents to design ads that work. Making a simplistic leap from $$$ to polling/winning is a waste of that time and money. The retrospective data needs to be run through some multivariate hoops (either multiple regression or ANOVA, most likely; PCA and MDS are less applicable here) to identify the attributes, besides money, which move the bar toward higher polling or winning.
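To make the "multivariate hoops" a bit more concrete, here's a minimal sketch of a multiple regression in plain Python. Everything in it is an assumption for illustration: the attribute names (`spend`, `attack_share`, `events`), the made-up "true" effects, and the simulated data itself. Real FEC filings carry money, not ad content or polling, so those columns would have to come from somewhere else. The point is only the shape of the exercise: regress polling movement on money *and* other attributes, then see which coefficients carry the weight.

```python
import random

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Hypothetical data: one row per district (all values simulated, not real).
rng = random.Random(0)
TRUE = [0.5, 0.1, 3.0, 0.2]  # assumed effects: intercept, spend, attack_share, events
features, poll_change = [], []
for _ in range(200):
    spend = rng.uniform(0, 10)        # ad spend, in $100k (assumed unit)
    attack_share = rng.uniform(0, 1)  # fraction of ads that are attack ads
    events = rng.randint(0, 20)       # number of local campaign events
    noise = rng.gauss(0, 0.5)
    features.append((spend, attack_share, events))
    poll_change.append(TRUE[0] + TRUE[1] * spend + TRUE[2] * attack_share
                       + TRUE[3] * events + noise)

coef = fit_ols(features, poll_change)
for name, b in zip(["intercept", "spend", "attack_share", "events"], coef):
    print(f"{name:>12}: {b:+.3f}")
```

In this toy setup the simulated effect of money is deliberately small next to the content and ground-game attributes, so the fitted coefficients tell you to look past $$$. With real data you'd want a proper stats package (R, or something like it) for standard errors and diagnostics; the hand-rolled solver here is just to keep the sketch self-contained.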
The problem with the simplistic model is that the knee-jerk reaction to positive feedback in some campaign is to toss yet more money at that campaign. But that's likely a waste of money. The goal is to use the data to identify those trailing candidates today who'll win tomorrow if they get more $$$ and *spend it on what works*. Pouring money into a winner is a loser. Pouring money down a rat hole is, too. The latter case is more obvious, but the former is just as wasteful.
Economists refer to "opportunity costs": I can spend $1 on toothpaste or candy. I can't have both. In the short run, candy is dandy. In the long run, toothpaste wins. Campaigns don't, generally, last as long as the toothpaste's long run, but you get the point. Money is finite, and should be spent on those activities/goods/services which advance the goal. In the case of FECHell data, the goal is winning elections. Looking retrospectively only at $$$ and winners is just the wrong approach.