Dr. Codd Was Right: But Liars Figure [update]


The world is not linear.
-- Dr. McElhone/1974

Power tends to corrupt; absolute power corrupts absolutely.
-- Lord Acton/1887

Officials who use their public positions for private gain threaten the integrity of our most important institutions. Greed makes governments — at every level — less responsive, less efficient and less trustworthy from the perspective of the communities they serve.
-- Justice Ketanji Brown Jackson/2024 [the MAGAnauts get even more aggressive when their perfidy is exposed]

I think we are on the verge of losing vaccines for this country, from this country. And the reason is that Robert F. Kennedy Jr. will hold up a paper, in the next four or five months, that says it's aluminum in vaccines that are causing a whole swath of problems, including autism. I think he is about to destroy vaccines in this country. I do.
-- Dr. Paul Offit/2025 [may the MAGA and MAHA be with you]

There's not a single example of things working out for the appeaser.
-- Nicolle Wallace/2024 [like this? the next extortion is on the way]

I have had to explain and re-explain and re-explain and re-explain, you know, how relational databases work, what is an eigenvector, what is dimensionality reduction.

-- Christopher Wylie/2018

... but Flash-based storage has such a different performance profile from rotating media, that I suspect that it will end up having a large impact on filesystem design. Right now, most filesystems tend to be designed with the latencies of rotating media in mind.

-- Linus Torvalds/2007

I believe quite strongly that, if you think about the issue at the appropriate level of abstraction, you're inexorably led to the position that databases must be relational.

-- Chris Date/2009

This Week's thought

The atmosphere is like a giant sponge. As the air gets warmer, which is what's been happening because of climate change, the sponge can hold a lot more water. And then when there's a storm, the same sponge can squeeze out way more water than it used to.
-- Arsum Pathak/2025 [maybe if physics were a universal requirement in high school, we wouldn't have so many idiots?]

See you next week in a brand new show^{©Heckle and Jeckle}

Therefore:

In a time of SSD, multi-core/processor, two terabyte memory and Optane App Direct Mode (RIP) machines, there is no reason not to build from BCNF data. Time to do what Dr. Codd demonstrated. Technology has finally caught up with the maths.

17 April 2013

But Liars Figure [update]

If you're a policy wonk, or a regular reader of comprehensive newspapers, you're likely aware of the Reinhart & Rogoff controversy. Fact is, I hadn't been paying attention until today's reporting. The deja vu experience is charming; when I was at UMass, there was only Amherst, and the economics department was busy purging anybody who wasn't a Right Wing micro zealot. Boy howdy!

The report is here.

Of the critiques of the critique, this one appeals. And here's Krugman.

Anyway, was it intentional academic malfeasance? None of the reporting I've read says so. I'm not so nice.

The money quote, from Next New Deal:

They find that three main issues stand out. First, Reinhart and Rogoff selectively exclude years of high debt and average growth. Second, they use a debatable method to weight the countries. Third, there also appears to be a coding error that excludes high-debt and average-growth countries. All three bias in favor of their result, and without them you don't get their controversial result.

If you read through the various reports, the weighting scheme used by R&R comes up as a major source of criticism. In a nutshell, each country is reduced to one number for each of four categories, regardless of how long its data series is.
(HAP note that just adding one more category, rather than lumping everything into 90+, makes a difference:

... we add an additional public debt/GDP category, extending by an additional 30 percentage points of public debt/GDP ratio--that is, we add 90-120 percent and greater-than-120 percent categories.

)

What isn't mentioned (in the reports I've viewed) is weighting by economy size. Both R&R and HAP treat all economies as equivalent. They certainly aren't.

HAP (page 8):

But equal weighting by country gives a one-year episode as much weight as nearly two decades in the above 90 percent public debt/GDP range.
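To see why the choice matters, here is a minimal sketch with made-up numbers (these are not R&R or HAP data, and the country names are hypothetical). One country spends nineteen years in the high-debt bucket, another spends a single bad year there. Averaging by country and averaging by country-year give very different answers.

```python
# Hypothetical growth rates (percent) for years spent above 90% debt/GDP.
# The numbers are invented for illustration; they are not R&R or HAP data.
high_debt_growth = {
    "Country A": [2.5] * 19,  # nineteen years in the bucket
    "Country B": [-7.0],      # a single year in the bucket
}


def country_weighted_mean(series_by_country):
    """Average each country's years first, then average the country means."""
    country_means = [sum(v) / len(v) for v in series_by_country.values()]
    return sum(country_means) / len(country_means)


def country_year_weighted_mean(series_by_country):
    """Pool every country-year and give each one equal weight."""
    pooled = [g for v in series_by_country.values() for g in v]
    return sum(pooled) / len(pooled)


equal_by_country = country_weighted_mean(high_debt_growth)
equal_by_year = country_year_weighted_mean(high_debt_growth)

print(f"equal weight per country:      {equal_by_country:.3f}%")
print(f"equal weight per country-year: {equal_by_year:.3f}%")
```

With these invented figures, the single bad year drags the per-country average below zero, while the pooled country-year average stays positive.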

It's hard to argue that categorizing the data, particularly in the way they did, is justified. Well, unless one needs to make a case, much as lawyers will selectively pick facts to satisfy a client. This abject disdain for objectivity is the reason I abandoned economics after grad school.

Excel is for cowards. (Props to Bill Simmons; I think I've got it right this time.)

[update]
I've spent some time with the R&R paper, and here's the justification they give:

The four "buckets" encompassing low, medium-low, medium-high, and high debt levels are based on our interpretation of much of the literature and policy discussion on what is considered low, high etc debt levels. It parallels the World Bank country groupings according to four income groups. Sensitivity analysis involving a different set of debt cutoffs merits exploration as do country-specific debt thresholds along the broad lines discussed in Reinhart, Rogoff, and Savastano (2003).

And to further note: weighting by base GDP (for some agreed upon time point) by country has the felicitous effect of de-emphasizing small economies with small absolute (in the global realm) GDP growths. A $1 gain for $10 (10%) looks a lot better than a $2 gain for $50 (4%). Or, "it's easier to grow fast when you start out small!" Just ask Apple, subject of another post today.
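The arithmetic can be sketched with the two dollar figures from the paragraph above. The economy labels and the choice of base GDP as the weight are assumptions for illustration, not anything R&R or HAP actually computed.

```python
# Hypothetical economies: (base-period GDP, GDP one period later).
# The dollar figures reuse the $1-on-$10 and $2-on-$50 example from the text.
economies = {
    "small": (10.0, 11.0),
    "large": (50.0, 52.0),
}


def growth_rate(base, later):
    """Fractional growth from the base period to the later period."""
    return (later - base) / base


rates = {name: growth_rate(b, l) for name, (b, l) in economies.items()}

# Unweighted: every economy counts the same, whatever its size.
unweighted = sum(rates.values()) / len(rates)

# Weighted by base GDP: each rate counts in proportion to economy size.
total_base = sum(b for b, _ in economies.values())
gdp_weighted = sum(rates[n] * b for n, (b, _) in economies.items()) / total_base

print(f"unweighted mean growth:   {unweighted:.1%}")
print(f"GDP-weighted mean growth: {gdp_weighted:.1%}")
```

The GDP-weighted figure is the same as the growth of the two economies taken together ($60 to $63), so the small economy's fast percentage growth no longer dominates.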

The thing is, what they've done (with, or without, the Blessing of the World Bank) is binned continuous data. It is incorrect to label this as "weighting", because it isn't. Binning, for these data, does not improve the explanatory power of the data. Furthermore, they all (R&R, and the critiques I've seen) allow the outlier question to pass unasked. At 30% and 120% (see the HAP paper, figure 3) there are significant outliers, which (just happen to) push the model in the direction R&R's politics demand. It is well known that linear regression is sensitive to outliers. Further proof. Bad dog!
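The outlier point is easy to demonstrate. The sketch below fits an ordinary least-squares line to invented data (not the R&R or HAP series): growth is flat at every debt level, then one observation at the high-debt end is replaced with a bad year.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical debt/GDP levels (percent) and growth rates (percent).
debt = [20, 40, 60, 80, 100, 120]
growth_flat = [3.0] * 6                      # no relationship at all
growth_outlier = growth_flat[:-1] + [-7.0]   # one bad year at 120% debt

slope_flat = ols_slope(debt, growth_flat)
slope_outlier = ols_slope(debt, growth_outlier)

print(f"slope without outlier: {slope_flat:+.4f}")
print(f"slope with outlier:    {slope_outlier:+.4f}")
```

One point out of six is enough to turn a flat line into a downward-sloping one, which is why the outliers deserve a question before any conclusion gets drawn.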
