One of the points "for further research" as I used to say when I was an academic, in the Triage exercise was using social media to measure outcomes. R has a library, twitteR, (yes, R folks tend to capitalize the letter at every opportunity), which retrieves some data. I was at first disinterested, since I don't have a twitter account. Thankfully, twits can be gotten without being a twitterer. Since Elizabeth Warren's campaign is just over the border, and sort of important in the grand scheme of things, I've been exploring.
Here's the entirety of the R code (as seen in an Rstudio session) needed to return the twits (1,500 is the max, which will prove troublesome when the battle is fully engaged):
The lines starting with > is the R code. The lines starting with [x] are the output. Here we have 9 twits.
Now, what do we do with the text? For that, I'll send you off to this presentation which came up in my R/twitter search (and is the source of what you've seen here), conducted in Boston. Missed it, dang. With slide 11, is the explanation of how one might parse the twits looking for positive/negative response. By the way, even if you're not the least bit interested in such nonsense, visit slide 29.
As I mentioned in Triage and follow-ups, getting the outcomes data is the largest piece of the work. Simply being able to "guarantee" the accuracy of twitter (or any other uncontrolled source) data, given the restriction on returned twits and such, will require some level of data sophistication; which your average Apparatchik likely doesn't care about. The goal, I'll mention again, isn't to emulate Chris Farley's Matt Foley and pump up a candidate no matter what the data say, but to find the candidate out of many most likely to win given some help. Whether Triage would be useful to a single candidate; well, that depends on the inner strength of the candidate.
20 December 2011
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment