Tuesday, March 25, 2008

This isn't new, but if you are a programmer and haven't read this PDF written by Ulrich Drepper from Red Hat, then you must spend the time to read and fully understand it.
It's an amazingly good summary of almost all (arguably too much) of the information about memory and CPU systems and how they interact to make every program run much slower than it needs to. It goes into the detail of how RAM is actually constructed and walks through all the memory systems in use by common hardware today, with tests and benchmarks to back up all his claims and to show the dramatic difference that good code can make.

The only problem I had with his paper, and maybe this isn't a problem with the paper as much as an issue with programming in general, is that it doesn't cover the non-hardware-level optimizations that people can do. Most of the optimizations that one can do at the level of memory are just not possible in most higher-level languages (Java, C#, Python, Perl, ECMAScript, ActionScript, etc.). Some are, like minimizing the memory footprint of your data structures. But how else are all those programmers who don't work in C, C++ or assembly supposed to make better use of memory?
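
To make the footprint point concrete, here is a rough sketch in Python, assuming CPython and its standard `array` module. The names `N`, `as_list`, `as_array` and `list_footprint` are mine, made up for illustration, and the byte counts are approximate and will differ between interpreter versions and platforms. It compares a plain list of integers, which stores pointers to separately allocated objects, with a typed array that packs the raw values into one contiguous buffer.

```python
import sys
from array import array

N = 100_000  # hypothetical element count, chosen only for illustration

# A list holds pointers to int objects that live elsewhere on the heap.
as_list = list(range(N))

# array("i") stores the C ints themselves in one contiguous buffer
# (typically 4 bytes each; the exact size is platform dependent).
as_array = array("i", range(N))


def list_footprint(values):
    """Approximate bytes used by a list plus the int objects it points to."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


list_bytes = list_footprint(as_list)
array_bytes = sys.getsizeof(as_array)  # includes the packed buffer

print(f"list : {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
```

On a typical 64-bit CPython build the packed array should come out several times smaller than the list, which also means more elements fit in each cache line. That is about as close to "controlling memory" as the language lets you get without dropping into C.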

It's the curse of being abstracted so far away from the hardware: there is no way to optimize at this level. It's not like Java can prefetch memory into the vector units or stream directly to/from main memory without using the cache. I guess you could, but then you'd be calling into a C program, which is outside the normal bounds of a Java program since technically you're no longer in Java.

Not that I'm saying that programmers need a way to do every single trick in the book from every language, but having no control over your memory is almost as big a curse as having to control the layout of every byte.