Dr. Codd Was Right: A Warren-ted Search

Vaccinated ≠ Not Infectious

$536,800,000 MARF^™(party like it's 1829)
-- New York Slammers, so far/2024 [GA and The Feds still to come]

Covid-19 has killed at least 1,123,836 people (as of 20 March 2023, final update)

Scientists aren't vocal enough about science. There are large groups of people who think scientists are all frauds and who don't believe in science, and they're being cultured by some of our far right-wing politicians, religious leaders and community leaders.
-- Drew Weissman/2022

To date, Microsoft is stating that organizations testing In-Memory OLTP have seen transaction speeds improve by up to 30 times compared to past performance, with the best performance gains achieved when the business logic resides in the database and not in the applications.
-- Jonathan Watts/2015 [my emphasis]

I have had to explain and re-explain and re-explain and re-explain, you know, how relational databases work, what is an eigenvector, what is dimensionality reduction.

-- Christopher Wylie/2018

... but Flash-based storage has such a different performance profile from rotating media, that I suspect that it will end up having a large impact on filesystem design. Right now, most filesystems tend to be designed with the latencies of rotating media in mind.

-- Linus Torvalds/2007

I believe quite strongly that, if you think about the issue at the appropriate level of abstraction, you're inexorably led to the position that databases must be relational.

-- Chris Date/2009

This week's thought

Investigative reporters are an idiosyncratic breed of journalist. Typically fearless, they are often a source of angina to their editors. Mr. Walsh was no exception.
-- Michael S. Rosenwald/2024 [no surprise - I spent a bit of time on Jack Anderson's staff]

Therefore:

In a time of SSD, multi-core/processor, two terabyte memory and Optane App Direct Mode machines, there is no reason not to build from BCNF data. Time to do what Dr. Codd demonstrated. Technology has finally caught up with the maths.

20 December 2011

A Warren-ted Search

One of the points "for further research" (as I used to say when I was an academic) in the Triage exercise was using social media to measure outcomes. R has a library, twitteR (yes, R folks tend to capitalize the letter at every opportunity), which retrieves some data. I was at first uninterested, since I don't have a twitter account. Thankfully, twits can be gotten without being a twitterer. Since Elizabeth Warren's campaign is just over the border, and sort of important in the grand scheme of things, I've been exploring.

Here's the entirety of the R code (as seen in an RStudio session) needed to return the twits (1,500 is the max, which will prove troublesome when the battle is fully engaged):

> library(twitteR)
> library(plyr)  # laply(), used below, comes from plyr
> warrenTweets <- searchTwitter('@elizabethwarren', n = 1500)
> length(warrenTweets)
[1] 9
> warren.Text <- laply(warrenTweets, function(t) t$getText())
> head(warren.Text, 10)
[1] "@elizabethwarren i hope you win agianst sen scott brown. the 99% r with u"
[2] "@elizabethwarren More $$$ coming your way!"
[3] "#HR3505 PAGING: @ElizabethWarren Help us!!!!"
[4] "@elizabethwarren - not to worry, the only job Karl Rove ever got somebody was George W. Bush. and look how that turned out."
[5] "RT @SenatorBuono: What an amazing turnout 4 a superstar. @elizabethwarren"
[6] "HELLO @ElizabethWarren ! PLEASE RUN as a 3rd party or Ind. FOR POTUS2012. Dems just threwSENIORS underthebus for the working tax cut! EXdem"
[7] "@chucktodd We hope 2011 will be remembered for something a LOT closer to home. #ows #OccupyWallStreet @ElizabethWarren #WARREN/PELOSI-2016"
[8] "RT @SenatorBuono: What an amazing turnout 4 a superstar. @elizabethwarren"
[9] "What an amazing turnout 4 a superstar. @elizabethwarren"

The lines starting with > are the R code; the lines starting with [x] are the output. Here we have 9 twits.
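Three of those nine (items 5, 8 and 9) carry the same message, twice as a retweet and once as the original, so there are only seven distinct messages. A minimal sketch of collapsing such repeats before counting anything, in base R; `strip.rt` is a hypothetical helper name of mine, and the pattern assumes retweets are marked with a leading "RT @user: " as they are in the output above:

```r
# Three of the sample twits, copied from the session output above.
sample.text <- c(
  "RT @SenatorBuono: What an amazing turnout 4 a superstar. @elizabethwarren",
  "RT @SenatorBuono: What an amazing turnout 4 a superstar. @elizabethwarren",
  "What an amazing turnout 4 a superstar. @elizabethwarren"
)

# Hypothetical helper: strip a leading "RT @user: " prefix so that a
# retweet compares equal to the message it repeats.
strip.rt <- function(texts) sub("^RT @[A-Za-z0-9_]+: *", "", texts)

# Keep one copy of each distinct message.
unique.text <- unique(strip.rt(sample.text))
print(length(unique.text))
```

Whether a retweet ought to count as a separate voice is a judgment call; this only shows the mechanics.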

Now, what do we do with the text? For that, I'll send you off to this presentation, conducted in Boston, which came up in my R/twitter search (and is the source of what you've seen here). Missed it, dang. Beginning with slide 11 is the explanation of how one might parse the twits looking for positive/negative response. By the way, even if you're not the least bit interested in such nonsense, visit slide 29.
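One common way to look for positive/negative response is to count how many words in each twit appear on a list of positive words, and subtract the count of words on a negative list. The sketch below does that in base R. It is my own illustration, not a copy of the presentation's code: the function name `score.sentiment` and both word lists are assumptions, and the lists are far too small for real use.

```r
# Hypothetical, tiny word lists; a real analysis would load a full lexicon.
pos.words <- c("amazing", "superstar", "win", "hope", "help")
neg.words <- c("worry", "threw", "lost", "bad")

# Score each text as (positive words matched) - (negative words matched).
score.sentiment <- function(texts, pos.words, neg.words) {
  sapply(texts, function(txt) {
    # Lower-case, then turn anything that is not a letter or a space into a space.
    clean <- gsub("[^a-z ]", " ", tolower(txt))
    # Split on runs of spaces and drop empty tokens.
    words <- unlist(strsplit(clean, " +"))
    words <- words[words != ""]
    sum(words %in% pos.words) - sum(words %in% neg.words)
  }, USE.NAMES = FALSE)
}

# Two twits copied from the session output above.
sample.text <- c(
  "What an amazing turnout 4 a superstar. @elizabethwarren",
  "@elizabethwarren - not to worry, the only job Karl Rove ever got somebody was George W. Bush. and look how that turned out."
)
scores <- score.sentiment(sample.text, pos.words, neg.words)
print(scores)
```

The second twit shows the weakness of plain word counting: "not to worry" is meant as reassurance, yet it scores negative because "worry" is on the negative list.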

As I mentioned in Triage and follow-ups, getting the outcomes data is the largest piece of the work. Simply being able to "guarantee" the accuracy of twitter (or any other uncontrolled source) data, given the restriction on returned twits and such, will require some level of data sophistication, which your average Apparatchik likely doesn't care about. The goal, I'll mention again, isn't to emulate Chris Farley's Matt Foley and pump up a candidate no matter what the data say, but to find the candidate, out of many, most likely to win given some help. Whether Triage would be useful to a single candidate, well, that depends on the inner strength of the candidate.
