31 October 2016

Sweet 16

Regular readers might remember Bill Gates' comment that nobody would ever need more than 640K. Turns out he's been shown to be right, yet again. Regular readers might also remember my frequent musing that Intel (and others) could have opted to use the fast-growing transistor budget to implement the X86 ISA(s) directly in hardware, but chose instead to go the avenue of a RISC machine fronted by a JIT (sort of, anyway) "decoder".

Well, today we find this story explaining why Bill was basically right. You want more speed? Do the work with less cpu. Who knew? Well, me.
To reduce those energy demands, the researchers demonstrated how they used a conventional Intel chip and turned off half of its circuitry devoted to what engineers call mathematical precision. Then they "reinvested" the savings to improve the quality of the computed result.

Did they run down to merely 16 bits? Could be.
He suggested trading off precision to make dramatic gains in computing efficiency.
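To make the trade concrete, here's a toy C sketch (my own illustration, not the researchers' code) of what narrowing the arithmetic buys and costs: the smaller accumulator drifts, and the bits given up are the bandwidth and energy that get "reinvested".

/* Toy sketch only: standard C float (32-bit) stands in for the "narrow"
   format; a true 16-bit run needs compiler-specific types such as
   _Float16, so it's just noted here. */
#include <stdio.h>

int main(void)
{
    const int n = 10 * 1000 * 1000;
    double sum64 = 0.0;   /* full precision */
    float  sum32 = 0.0f;  /* half the bits per operand */

    /* Summing many small terms: the narrow accumulator drifts once the
       running total dwarfs the addend. */
    for (int i = 0; i < n; i++) {
        sum64 += 0.1;
        sum32 += 0.1f;
    }

    printf("64-bit sum: %.2f\n", sum64);          /* about 1000000.00 */
    printf("32-bit sum: %.2f\n", (double)sum32);  /* visibly off */

    /* Half the bits per operand means half the memory traffic and, with
       SIMD, twice the operands per register; that's the kind of saving
       the article says gets "reinvested" into result quality. */
    return 0;
}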

4 comments:

Roboprog said...

Robert,

The fellows making these chips have my sympathy, what with the sizes of wavelengths and atoms being a hard stop at the moment.

The hardware guys have done a pretty good job with caching tiers, but most software is written in a pretty oblivious manner. Locality? Say what now? Cut n paste brain death forever!

Too many software "inliners", not enough "definers".

Inlining everything worked OK in small 80s programs on 5 or 10 MHz chips with 0 wait state RAM, but not so much on L1/L2/L3 ... cache schemes. More developers need to learn about "subroutines"/reuse, table-driven (arrays/records) methods, "command/interpreter" patterns, etc., rather than constantly sending these poor chips into a (cache) miss, miss, miss frenzy slogging through bloated, repetitive code (for extra smell, through in some Java reflection on SuperCaliFragilisticExpiAlidocious type names in an XML file, blech).
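To put a concrete (and entirely hypothetical) face on it, here's a toy C sketch of the table-driven / command-interpreter shape: one small loop plus a data table, instead of N pasted-and-renamed copies of the same branchy logic.

/* Hypothetical example; the account/opcode names are made up purely
   for illustration. */
#include <stddef.h>
#include <stdio.h>

typedef struct { double balance; } account;
typedef void (*op_fn)(account *, double);

static void deposit  (account *a, double amt) { a->balance += amt; }
static void withdraw (account *a, double amt) { a->balance -= amt; }
static void apply_fee(account *a, double amt) { a->balance -= amt * 1.5; }

/* One compact dispatch table, friendly to the instruction cache. */
static const struct { const char *name; op_fn fn; } ops[] = {
    { "deposit",  deposit   },
    { "withdraw", withdraw  },
    { "fee",      apply_fee },
};

int main(void)
{
    account acct = { 100.0 };

    /* A tiny "script" of (opcode, amount) records driven through the
       table; the loop below is the only code that runs, no matter how
       many commands the table grows to hold. */
    const struct { int op; double amt; } script[] = {
        { 0, 50.0 }, { 1, 20.0 }, { 2, 5.0 },
    };

    for (size_t i = 0; i < sizeof script / sizeof script[0]; i++)
        ops[script[i].op].fn(&acct, script[i].amt);

    printf("final balance: %.2f\n", acct.balance);  /* 122.50 */
    return 0;
}

Whether the "opcodes" come from an array, a config file, or a tiny DSL, the point is the same: one loop, one table, small footprint.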

There is some irony that you suggested that the hardware itself move away from a "microcode" type architecture to directly defining CISC, but that layer isn't going to switch out on every app.

Roboprog said...

s/through in some Java/throw in some Java/

Oops. Stupid morning fingers.

Robert Young said...

java? spent some years working on an "RDBMS" with a bunch of such folks, the leads of which were smalltalk zealots. the notion of OO was simply data object, action object. IOW, fortran. bah. R folks are about the same; their objects are just tagged structs. Allen Holub did some writing on OO as data+method back in '99 and '00. even he later caved in and stopped complaining.

as for the RISC machine being the "real" engine in Intel cpus, I don't expect Intel to reverse direction, just that the referenced approach amounts to removing some of the clutter of complexity that has resulted. using the transistor budget to implement the ISA directly would make the whole thing faster, I expect; IBM demonstrated clearly with the mainframe that hardware was always faster than software/firmware.

taking Intel's approach allows them to integrate more function on the die (I've been looking at die-shots for years, and the "core" footprint continues to diminish as time has gone on; there's not much new in ALU building, I expect), thus building something of a monopoly. whether ARM ever makes their life difficult is up in the air. some were speculating that the recent Mac announcements would include an ARM-driven machine. didn't happen.

how much the true core changes with each tock seems to be minimal. but it does make the implementation of new X86 instructions somewhat easier, maybe. OTOH, I've always wondered whether that's actually how it's turned out. one can imagine that new instructions demand changes to the RISC ISA or layout or other stuff. they end up having to do "twice" the work.

Roboprog said...

I guess my beef was not so much a language thing as that the typical "industrial" programmer just pastes stuff in over and over and changes the names. So much for L1 (and L2?) cache.

The alternative would be to learn some, any, "code reduction" techniques. Fanning into reused subroutines would be a start :-)

(whether we call them functions, abstract base methods, performs, ...)

ORM is another stinker of a memory sink.

Anyway, there's a laundry list of things to bring program size down, but don't expect to see many of them from your typical app grinder. Sadness.