29 September 2012

Carnac Wins Again

A few musings ago, I speculated that the folding smartphone was the next big thing. It's just been reported that Apple has patented the basic notion, perhaps without need of an external hinge. Will the iPhone 6 have the form factor of the original Motorola Razr? Reactionary capitalism at its finest.

26 September 2012

Money For Nuthin, The Picture

So, what's the data look like? Among Real Quants, pie charts are viewed as stinky doo, whilst Suits love 'em. Figures. I'm not a big fan of them, not so surprisingly, but these data are simple enough, and the chart stark enough (whilst still being honest) that it can't be passed up.



You'll likely need to expand it to see the labels, although the two that mean something are legible.

Money For Nuthin

With all the moaning about the "fairness" of capital gains taxation, it's about time that this endeavor took a look, don't you think? There's both quant aspects and macro aspects.

On the quant side, the question is: what numbers matter? From the point of view of those who assert that capital gains is totally unearned income (as opposed to those who get itchy with the word 'unearned'), the bull stock market since March 2009 is the exemplar. Here's what happens.
- you buy shares of CNO (Conseco Insurance, though the name has been changed) in March, 2009 at $.26.
- you ride the rebound to, at least, March, 2010.
- you sell the shares, which reached $10.18 last Friday.
- you pay, modulo other shenanigans, 15%.

That's a gain of $9.92/share, or a bit more than 38 times your stake. For which neither you, nor that money, did ABSOLUTELY ANYTHING. One can say, "but my money was at risk. CNO could have stayed down to $.20 or even gone bankrupt." And that's true. You GAMBLED that the person who sold you the stock was stupid to sell so cheap. S/he, on the other hand, gambled that you were a fool to pay so much. NONE of the money went to CNO to run its business. You did nothing more than a horse plunger does. You looked at historical information. You assessed where CNO's (or whatever company's) prospects were going. You looked at the general market. And you read J.K. Galbraith: "Financial genius is a rising market."
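
For those who want to check the arithmetic, here is a minimal sketch using only the figures above. The flat 15% rate is the post's "modulo other shenanigans" simplification, and the variable names are mine.

```python
# Figures from the post; a flat 15% ignores any "other shenanigans".
buy_price = 0.26     # CNO, March 2009
sell_price = 10.18   # CNO, last Friday's price
rate = 0.15          # capital gains rate assumed in the post

gain = sell_price - buy_price    # dollars gained per share
multiple = gain / buy_price      # gain as a multiple of the original stake
tax = gain * rate                # tax owed per share
kept = gain - tax                # what stays in your pocket per share

print(f"gain per share:    ${gain:.2f}")
print(f"multiple of stake: {multiple:.1f}x")
print(f"tax per share:     ${tax:.2f}")
print(f"kept per share:    ${kept:.2f}")
```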

There are other forms of capital gains. After much exploration, here is a table of the various asset categories reported by the IRS for 2007 (the latest). More than half is purely financial transactions. Gains on residences are, wait for it, 1% of the total. Read that again, 1%.

On the macro side, we have to assert either that the USofA is based on a progressive tax system, or not. If it is, then capital gains, especially with the likes of Bain shifting "income" into "capital gains", deserve no special treatment, particularly given how little capital gain is due to physical investment. It's just a financial manipulation.

Is that so difficult to figure out?

23 September 2012

Listen to the Bald Headed Guy

As regular readers must have figured out by now, while I dearly love stats/quant, I've been skeptical of mathy staty economics/business ever since grad school, when I was subjected to a bunch of flunked out math and physics Ph.D.s re-purposed into assistant professors of economics. This was the mid-1970s, ancient history to most alive today. The Great Recession's seeds were sown that long ago.

In the movie "Taras Bulba", Yul Brynner in the title role tells his son that he must go live with the Poles to understand how they think so that the Cossacks can win back what they've lost. I feel much the same about "financial engineering", an oxymoron ranking with "happily married" as the apotheosis of irony. So, in a desultory way, I've been reading David Ruppert's "Statistics and Data Analysis for Financial Engineering". Today was the chapter on copulas (which, if one uses them, must be termed copulation, yes?). Wait for it: we sure got copulated by Wall Street.

Each chapter has a Bibliographic Notes section, which is usually a bunch of references to papers in the professional literature. However, here is a reference to a piece in Wired, by Felix Salmon. Boy Howdy! It was written in the early days of the Great Recession, Spring 2009, just before or after this endeavor went public. I was unaware of it until now. It is eerie, reading it now. I do disagree that the quant invasion happened in the 80s; it was a decade earlier.

To reiterate: my issue with Wall Street quants is that many (most? all?) have little understanding of either macro or micro economics. The micro folks believe that each actor is independent, while the macro folks understand that such a view is naive at best.

One of my favorite aphorisms (attributed to any Good Mother, and presented a few times already): "what would the world be like if everybody behaved like you?". This was said to misbehaving kids.
...the real danger was created not because any given trader adopted it but because every trader did. In financial markets, everybody doing the same thing is the classic recipe for a bubble and inevitable bust.

Conspiracy nuts would, perhaps did at the time, have fun with the fact that the inventor of this particular copula (there are a host of specific ones; copula is a general definition, so there are many variations) is a Chinese quant named Li. Was he sent here to crash the American economy? Only time will tell. As of the date of the piece he had returned to China and was working in banking.

What Salmon doesn't get into is the specifics of why CDOs and CDSs came to be so loved by the Wall Street folks. Here, the answer is Greenspan and Dubya: they had crashed interest rates on Treasuries, leaving all that Chinese money and American pension money and what have you looking for greater "risk free" returns. As I've said more than once here, returns (real world variety) can only come from better production of and sufficient demand for the increased production of goods; there really isn't much point in making physical investment just to produce what you already do (modulo firing most of your employees, see Mother's Advice above). Home mortgages provide no such. Some within the economics profession have talked, for decades, about "psychic utility" and its measure in housing. Here's a piece which skewers it well and truly. If it sounds familiar, just delve into the early musings here and you'll find his arguments and more.

Returning to Salmon's piece.
...because an unlimited number of credit default swaps can be sold against each borrower, the supply of swaps isn't constrained the way the supply of bonds is, so the CDS market managed to grow extremely rapidly.
In other words, while, in my opinion, Wall Street investing is really just gambling twixt buyers and sellers of stocks and bonds, this was very much a step further into wagering.

What Li, and any of the quants who bought his story, relied on was the truth of The Efficient Market Hypothesis. That is, those pricing both CDOs and CDSs were always the Smartest Guys in The Room. The problem here is simple: the financial engineering folks rely wholly and explicitly on some amount of historical data, and almost all of that data is some single time series. Imagine your local weather person saying that the day was dry and sunny, because her forecasting model said so, without ever bothering to look out at the downpour in the parking lot. Such was the simplicity of the error made by the Wall Street quants. They wanted to, vampire squid style, suck ever more moolah from the saver-to-borrower stream, and there was all that Chinese money just itching for some place to sit. It was not in their individual best interest (see Mother's Advice above) to question the wisdom of the plan. One need look, as early as 2002, no further than the (median house price / median income) metric to know that someone was lying. House prices were soaring, but median income was flat. No amount of utility shifting could account for the divergence.
And Li didn't just radically dumb down the difficulty of working out correlations; he decided not to even bother trying to map and calculate all the nearly infinite relationships between the various loans that made up a pool.
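
To see why one dumbed-down correlation number matters so much, here is a toy simulation. It is NOT Li's actual formula or anything a desk priced with. It is a one-factor Gaussian copula, the simplest textbook relative, and every parameter (100 loans, 5% default odds, "disaster" meaning 20 or more defaults) is made up for illustration.

```python
import random
from statistics import NormalDist

def tail_probability(rho, n_loans=100, p_default=0.05,
                     bad_count=20, trials=2000, seed=1):
    """Estimate P(at least bad_count of n_loans default) under a
    one-factor Gaussian copula with a single correlation rho.
    All default values are illustrative assumptions."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(p_default)  # a loan defaults below this
    w_common = rho ** 0.5                     # weight on the shared shock
    w_own = (1.0 - rho) ** 0.5                # weight on the loan's own luck
    bad_trials = 0
    for _trial in range(trials):
        market = rng.gauss(0.0, 1.0)          # shock shared by every loan
        defaults = 0
        for _loan in range(n_loans):
            latent = w_common * market + w_own * rng.gauss(0.0, 1.0)
            if latent < cutoff:
                defaults += 1
        if defaults >= bad_count:
            bad_trials += 1
    return bad_trials / trials

independent = tail_probability(rho=0.0)   # every borrower on his own
correlated = tail_probability(rho=0.5)    # everybody behaving alike
print("P(disaster), independent loans:", independent)
print("P(disaster), correlated loans: ", correlated)
```

Each loan still defaults 5% of the time in both runs; only the togetherness changes. That is Mother's Advice in a dozen lines: the pool looks safe exactly as long as the borrowers don't all behave alike.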

It isn't as if all these mortgage based CDSs and CDOs were built on new and better production. They weren't. All that held them up was the incomes of the home buyers. Nothing else. Said home buyers might derive a whole lot of psychic utility from a McMansion they'd never dreamed they'd ever be living in. Psychic utility doesn't pay the mortgage. Some academic economists, going back at least to when I was in school, have questioned Americans' perverse "investment" in housing. Europe doesn't waste capital that way. We shouldn't, either.

The numbers are staggering:
The CDS and CDO markets grew together, feeding on each other. At the end of 2001, there was $920 billion in credit default swaps outstanding. By the end of 2007, that number had skyrocketed to more than $62 trillion. The CDO market, which stood at $275 billion in 2000, grew to $4.7 trillion by 2006.
God may not play dice with the world, but Wall Street is more than happy to play dice with other people's money. "Abandon hope all ye who enter here."

So there you have it. Just as I've described, but with some contemporary reportage. The results of TARP and the QEs? A soaring stock market in the face of economic stagnation. How can that be? Now that all that Chinese money and American pension money doesn't have AAA rated bonds (and their derivatives) to buy, where else to go? Yup, the squid. It's deja vu all over again.

20 September 2012

Talk Like a Duck

Could the SSD revolution already be over? Last week, Western Digital (actually the Hitachi folks they recently bought) announced the imminent release of helium filled hard drives. This story doesn't mention size, price, or availability. The only numbers: rather than five platters, these drives will have seven. That's a 40% increase, and for less power.

Dum da dum dum. Short stroking a hoard of these might be just as cost effective as equivalent SSD. Don't know yet. But could be. Does it matter? In a sense no, in that the point of SSD, so far as I've been concerned, is that SSD was an obvious finger pointing to high normal form databases as the solution to both byte bloat and application performance. For single disk machines, RAID/10 isn't an option, of course, but the point has been database servers, where a RAID set versus SSD is a feasible trade off.

Like a Ton of Bricks

Somewhere in this canon (or the other one; way too lazy to go looking), I mused that the future of eTailing was retailing. The reasoning is straightforward: the cost of fuel, air and ground, will only rise. We either run out of crude or we substantially reduce how much we use so that we have enough air to breathe. In the near term, it's unambiguous that rail transport is an order of magnitude cheaper than air, and nearly so for highway ground.

The advantage of brick & mortar retail is inventory; it can be moved in bulk, thus by rail and at substantial savings. At the time I mused, I didn't consider the shift back to brick & mortar to be all that swift. I didn't count on the folks at Amazon to have figured it out already. Here's a recent piece on Amazon's brick & mortar effort. Rather than you taking a short drive to Target, Amazon takes an almost short drive to you. Who said size doesn't matter? The losers? UPS and FedEx and the Post Office. They still do the last mile, so to speak, but not the bulk of the delivery.
Amazon's delivery of everyday objects needs to be fast enough and cheap enough to wean customers from their local stores. Yet it also must be economically feasible for the retailer, which is investing so heavily in the warehouses that it is barely profitable.

Some residents want more. "They want to be able to order something and then drive down the street to the warehouse and pick it up," said Rod Butler, the city manager. Here, just like everywhere else, shoppers dream of same-day delivery.
And what makes this better? The stated impetus was having to collect sales tax, but I just don't buy it. The cost of transport is significant. Just for yucks, one could use Amazon's build out as a test of the infamous Traveling Salesman Problem. For this instance, one need analyze the goods sold by region (cluster analysis sounds about right; one need define both the region and its goods' list concurrently), the size of the warehouse needed to supply those goods by rail (again, a region need have sufficient shipments, by value, to justify its boundaries), which would have to account for volume of goods (rail is effective for large movements). The most significant parameter is meeting demand lead time on the list of goods to be kept in the regional warehouses. The goal would be to meet some percentage, say 99.5%, of value shipped out of a warehouse to the region. Since these buildings, and if you read the links you'll find tax breaks abounding, still don't move much, definition of "region" is critical. Low volume/value goods would be shipped out of "central" stores, likely three: each coast and the Midwest. The regional warehouses would be stocked by rail. I'd bet, but haven't got data, that the warehouses are quite near, if not on site at, railheads.
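
Just one slice of that analysis is easy to sketch: given a region's shipped value by item, which goods go in the regional warehouse to hit the 99.5% target, and which get left to the central stores? This is a minimal sketch, not anything Amazon is known to run. The function name and the demand figures are hypothetical, and it ignores the clustering, lead time and rail volume pieces entirely.

```python
def regional_stock_list(value_by_item, target=0.995):
    """Pick the fewest items whose combined shipped value reaches
    the target share of the region's total (largest values first)."""
    total = sum(value_by_item.values())
    chosen, covered = [], 0.0
    ranked = sorted(value_by_item.items(), key=lambda kv: kv[1], reverse=True)
    for item, value in ranked:
        if covered >= target * total:
            break                     # target met; the rest ship from central
        chosen.append(item)
        covered += value
    return chosen, covered / total

# Hypothetical annual shipped value (dollars) for one region.
demand = {
    "diapers": 500_000,
    "batteries": 300_000,
    "paperbacks": 150_000,
    "phone cases": 46_000,
    "harpsichord strings": 3_000,
    "left-handed scissors": 1_000,
}

stocked, share = regional_stock_list(demand)
central = [item for item in demand if item not in stocked]
print("regional warehouse:", stocked)
print("share of value covered:", share)
print("ship from central stores:", central)
```

The hard part, as noted, is that the region and its goods' list have to be defined together; this sketch takes the region as given.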

Surprise, surprise. A quick search turns up this story.
Given the prime location in the Midwest with 4,700 miles of mainline rail track, three international airports and more than 11,000 total highway miles, it's no surprise that Amazon has already invested heavily in placing fulfillment centers there.
If you follow the Delaware link in that story, and then search for railheads in Middletown, Delaware, there's Norfolk Southern. Brick & mortar wins after all.

The Real Work is in the Microcodd

A while back, I noted that Intel is implementing transactional memory in its new processor. AnandTech has a piece explaining (no algebra!) how it works. What's amusing is that the piece adopts an RDBMS metaphor!
The root of the locking problems is that locking is a trade-off. Suppose you have a shared data structure such as a small table or list. If the developer is time constrained, which is often the case, the easiest way to guarantee consistency is to let a certain thread lock the entire data structure (e.g. the table lock of MySQL MyISAM).

According to Intel, using an application that previously used a coarse grained lock (like the older MyISAM storage engines of MySQL) together with a TSX enabled library should improve scaling spectacularly.
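
The trade-off in those quotes can be sketched in a few lines. This is a minimal sketch of coarse versus fine grained locking only; it says nothing about how TSX itself works, and the class and function names are mine. CPython's global interpreter lock also means neither version will show a real speedup here; the point is the shape of the locking, not the timing.

```python
import threading

class CoarseTable:
    """One lock guards the whole table (the MyISAM-style table lock)."""
    def __init__(self, n_rows):
        self.rows = [0] * n_rows
        self.lock = threading.Lock()

    def increment(self, row):
        with self.lock:               # every writer queues here
            self.rows[row] += 1

class FineTable:
    """One lock per row; writers to different rows do not collide."""
    def __init__(self, n_rows):
        self.rows = [0] * n_rows
        self.locks = [threading.Lock() for _ in range(n_rows)]

    def increment(self, row):
        with self.locks[row]:         # only same-row writers queue
            self.rows[row] += 1

def hammer(table, n_threads=4, updates=1000):
    """Each thread updates its own row; returns the final row values."""
    def work(row):
        for _ in range(updates):
            table.increment(row)
    threads = [threading.Thread(target=work, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.rows

coarse_result = hammer(CoarseTable(4))
fine_result = hammer(FineTable(4))
print(coarse_result)
print(fine_result)
```

Both tables end up consistent. The coarse one is less code and fewer chances to get it wrong, which is why the time constrained developer reaches for it; the fine one lets unrelated writers proceed, at the cost of more locks to keep straight.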

I wonder if Dr. Codd is smiling in his grave? When early RDBMS engines were being written, mainstream multi-processors were as rare as hen's teeth (this was about the same time as Connection Machine and such, but much of relational databasing was done on unix/AIX/HP-UX/etc. vanilla minis; well, and a bit on the 370). I doubt many engines were parallelized; they just spun a lot (read the piece).

So, we have transactional CPUs with large memories and fast SSD primary store. Can high normal form databases be around the corner? Enquiring minds want to know. They should be, of course.

19 September 2012

Another Brick in The Wall

We don't need no education...

The Left Wing, and a bit of the Right Wing (sort of), are playing the education tune as the panacea to the "middle class is dead" problem. Alas, it can't work. The flawed assumption is that our (global) economy can absorb millions of folks doing well paid middle class work. The issue is the same as the robot ruminations of late: high value work is high value just because it's highly leveraged. That doesn't necessarily mean guild-like restrictions, but more fundamentally input-output; no matter the price, there just isn't the need for millions of middle class workers. Here's a quote about the iPhone folks:
Apple sent the entire 16-member design team to the award presentation and the entourage followed Ive on stage to receive the award.

So, to be clear, Apple moves 2 million 5s in 24 hours. 16 people created that device. Now, that's LEVERAGE. And it means that education isn't the solution.

It's the Distribution, Stupid!

13 September 2012

Filching the Teacher's Apple

Continuing. How, then, to staunch Apple's crusade for world domination? A company run, until recently, by an avowed Buddhist? Somehow, the two can't really be true. But I digress.

You read it here first. Or, at the least, I've not read it anywhere.

The issue "solved" by the 5 is to put a reasonable form factor back into mobile phones, a conclusion not yet realized by mainstream pundits. They're all gaga about the iTunes and such. You do know that you never *own* any of those songs, yes?

What would provide a useful alternative? Well, a flip phone that's still all screen when opened. I believe it can be done. All that's required is a bit of smarter mechanical engineering (my uncle was an MIT ME, and claimed that all problems in products were solved through the MEs' efforts). There are lots of hinges in the real world that match up to within a ten-thousandth of an inch. Corning would have to figure out Gorilla Glass such that the line between the two plates is a pixel or so. That shouldn't be a big deal. The bigger deal will be the boards. With a two part chassis, some volume is required to accommodate the ribbon connector. With the solid phone, board layout is not so encumbered. But it would work out. The convenience of the self-protecting flip phone is not to be ignored.

Maybe Nokia?

New Day Just Like the Old Day

An earlier post, on the Apple/Samsung tiff, also mentioned my bewilderment at the original iPhone, specifically its size and shape. Five years on and, using the wide screen excuse (16:9 has become the de facto standard), the 5 is pretty much the dimensions of my old flip phones. I think that's what's called reactionary.

11 September 2012

I Demand Transactional Immunity for My Testimony!

Intel's slides from the Haswell announcement are at AnandTech, here. One of the more amusing aspects of Intel's continuing CPU evolution is the increasing congruence with real relational database semantics. One wonders whether Microsoft's integrated file system cum RDBMS (just like the AS/400 of decades ago) might not be dead, just re-emerging as another Wintel effort? See slides 14 and 16. I wonder how the "eventual consistency" peanut gallery will swallow this?

There's also the ARM stuff; how the RISCy business will put Intel in a bind. Once again, Haswell documents that the real processor inside that CPU is a RISC machine. Has been for years. That Intel hasn't split out the hard core is something of a puzzlement. The knee jerk answer is that Intel has lots of secret sauce it doesn't want exposed.

Mail Time, Part 1

In the spirit of PTI, a basso voice from above calls, "Mail Time!" In today's episode, a month or so old article from ComputerWorld, dealing with tests of SSD and related issues.
...when targeted at specific applications, such as ... online relational databases, the costs to achieve the same performance with flash compared with HDDs can be vastly lower...

Now, if these so-called tech reporters knew enough about the RM, they'd press folks on the data reduction advantage of High NF and why they did, or more likely didn't, go that route. But, of course they don't. Just as reporters in any other specialty area, they're still generalists who don't know enough to avoid the BS.

05 September 2012

Carnac, the Magnificent

Do you ever get that deja vu feeling before it happens? So today has gone. Early this AM, I checked my email, and a new discussion popped up from one of the LinkedIn groups I follow (Insurance, blah blah). The OP stated, among other things, that in the data warehouse world 1) the relational database was bad and 2) data storage was too much. This led me to comment, as one might expect.

Here is the text:
A- with multi-processor/core/thread/massive memory/SSD machines cheaper than a Starbucks' latte, stars and snowflakes need no longer be the crutch of DW. High normal form databases, in orders of magnitude fewer bytes than flatfiles (still the typical structure in Insurance, alas), stars or snowflakes, are just far more capable than folks who haven't accepted Codd as their saviour will admit. Get rid of the bloat, and the database just flies.

B- IBM, vendor to the legacy stars, now has SPSS and Texas Memory in-house. SPSS is the SQL friendly stat pack (so is R, but that's another episode), and Texas Memory by far and away the most experienced and capable developer of SSD. PREDICTION: IBM/DB2 will now find Codd and promote high normal form databases, for the simple reason that they now can make mucho dinero with that meme.
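
The byte bloat claim in point A is easy to demonstrate in miniature. What follows is a toy sketch with hypothetical insurance-flavored data and a deliberately crude size measure (characters per field). It shows the direction of the saving, not the orders of magnitude, which depend on how wide and how repetitive the real flatfile is.

```python
def field_bytes(rows):
    """Crude size measure: total characters across all fields."""
    return sum(len(str(field)) for row in rows for field in row)

# Hypothetical flat file: policy number, product description, state name,
# with the descriptive text repeated on every single row.
products = ["Whole Life Participating", "Term Life 20 Year Level"]
states = ["Massachusetts", "Connecticut", "Pennsylvania"]
flat = [(n, products[n % 2], states[n % 3]) for n in range(1000)]

# Normalized: descriptive text stored once, referenced by small keys.
product_table = list(enumerate(products))
state_table = list(enumerate(states))
policy_table = [(n, n % 2, n % 3) for n in range(1000)]

flat_size = field_bytes(flat)
normal_size = field_bytes(product_table + state_table + policy_table)

# The join gives back exactly the flat file, so nothing was lost.
rebuilt = [(n, products[p], states[s]) for n, p, s in policy_table]

print("flat file characters: ", flat_size)
print("normalized characters:", normal_size)
print("join reproduces flat file:", rebuilt == flat)
```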

That bit of wisdom was posted a couple of hours ago. So what shows up in my inbox a couple of minutes ago, but a posting on a Database Developers group:
Next DB2 Tech Talk: Get smart on Realtime Operational Warehousing

Join us September 11 to learn more about the benefits and advantages of realtime operational warehousing as compared to non-realtime warehouses and standard database apps. You will also learn about new features in DB2 10 and InfoSphere Warehouse 10, including query performance enhancements and Continuous Data Ingest that are designed with the realtime operational warehouse in mind.

Now, I have to concede that neither this flyer nor the announcement itself mentions SSD or Texas Memory or SPSS. But for a two hour return on predictive investment, and I'll certainly argue now that SSD and SPSS will soon enough be part of the discussion, that ain't bad.

03 September 2012

It's Alive!!

My beloved Triage appears to be alive and well, but in stealth mode. I've not actually had the pleasure of meeting it. But the Times yesterday kind of let the cat out of the bag. If this isn't a description of Triage driven campaign building, I'd be hard pressed to do better. What's galling is that the Democrats happily ignored the issues in 2010, and have set us on the road to permanent minority rule. Just like a South American junta.

Issenberg, a he by the way, is publishing a book, "The Victory Lab: The Secret Science of Winning Campaigns". Alas, I wasn't a source. No royalties for me.

Over the last decade, almost entirely out of view, campaigns have modernized their techniques in such a way that nearly every member of the political press now lacks the specialized expertise to interpret what's going on. Campaign professionals have developed a new conceptual framework for understanding what moves votes. It's as if restaurant critics remained oblivious to a generation's worth of new chefs' tools and techniques and persisted in describing every dish that came out of the kitchen as either "grilled" or "broiled."

My first serious post college position was in Washington, DC (for the Civil Service Commission, which no longer exists; bet you didn't know that), in a group titled Office of Analytic Methods. I was the economist/econometrician, while one of the other worker bees was ABD in psychometrics. For reasons not yet discussed, I've long viewed any study prefixed psych- with suspicion; a means for the venal to manipulate the naive. Advertising, "Mad Men" style, is the archetype. Eventually, we got "The Selling of the President 1968". Some 40 years on, and the number crunching has gotten ever more convoluted, possibly more sophisticated.

Oh, did I mention that I wandered into my local Barnes & Noble to see what's newish in the data/stat world? Yes, yes I did. And what to my wondering eyes did appear but "R For Dummies". I suppose that R will become the next Excel: any knucklehead will feel empowered to play math stat in the office, just as Excel empowered cube monkeys to self-identify as financial analysts. And we now know what that produced.

Campaigns have borrowed techniques from the social sciences, including behavioral psychology and statistical modeling. They have access to private collections of data and from their analysis of it have been able to reach empirical, if tentative, conclusions about what works and what doesn't.

And to quote my humble self, from Triage:

There is, available to the apparatchiks, both public data (the FEC here in the States) and data developed by their own organization. This latter data is, amorphously, expenditures (the source data that ends up at the FEC; their own they have, but opposition data must wait for FEC and may well not be sufficiently timely) and outcomes; perhaps simple polling results; perhaps some focus group results; perhaps some name recognition surveys. Social network data mining is also big these days (although I've not done enough research to know for sure that this could be a data source for outcomes).

Issenberg throws in the towel:

Breathless, and often fact-free, stories about "data mining" and "microtargeting" soon became plentiful. But few journalists had access to any of the campaigns' data, or even much understanding of the statistical techniques they used. We found ourselves at the mercy of self-promoting consultants who described how they were changing politics by ignoring stodgy old demographics and instead pinpointing voters according to their lifestyles. We played along, guilelessly imputing new mythic powers to microtargeting. In many retellings, data analysis became the reason George W. Bush was re-elected.

There has been, in the wake of Ryan's perversion, hand wringing from some of the press that fact-checking (which effort draws the ire of the Right Wingnuts, not surprisingly) in the face of such lying will take too much effort to police. The message is that Right Wingnuts will send a tsunami of falsehood, much never exposed as forcefully as the assaults. "Swift Boats" 24/7. Welcome to the new Gulag.

Indeed, the telling numbers wouldn't be polls but the individual probability scores that Mr. Obama's targeters developed (and update weekly) to predict how likely each voter in the country is to support him.

As Triage described, high granularity data, external to the campaigns can be used. One of the not so secret secrets in the quant world is that private databases exist, for a fee, to very fine detail. As you bend the mind, so you bend the finger on the voting lever.

But particularly in a polarized race like this one, where fewer than one-tenth of voters are moving between candidates, the most advanced thinking inside a campaign is just as likely to focus on fine-tuning statistical models to refine vote counts and improve techniques for efficiently identifying and mobilizing existing supporters.

So, we find:

...Mr. Romney deployed statistical models to track Iowa supporters and current vote counts for his rivals. It amounted to a largely invisible 21st-century upgrade to the traditional infrastructure of offices, phone banks and staff that most journalists visualized when they tossed around the term "organization."

Back in the 1980s I applied to, and was accepted into, the American University (the one in Washington, DC) Economic Journalism graduate program. For various reasons, I didn't get to go (you know who you are). I still have, more so recently with the advent of R in particular, much regret that I wasn't able to watch and participate in this evolution.