I pride myself on thinking outside of the Klein Bottle.
-rbarry
This is an issue that has been making my brain itch for the better part of two
decades now. If you don't have a pretty good idea of what virtual memory is
and how it works, you'll want to skip this post and move on to something
lighter.
Virtual memory is a tool that we, as programmers, use every minute of every
day. Just about every line of code you write has the potential to trigger this
dark invocation of grand arcana. You write:
a++;
...and your hardware may take off into an intricate cascade of operations whose
sole purpose is to find out what the value of a is. Depending upon your
architecture, you could be looking at loading 3 pages from long-term storage.
If your laptop is anything like the hunk of crap upon which I am authoring this
hunk of crap, then your disk is spinning at 5,400RPM... and that one line of
code could be lining you up for a delay of around 16 milliseconds. Do the math
on this one: 5400RPM = 90RPS, and on average you wait half a revolution for
your sector to come around, so that's a 1/180 second average delay per page
(ignoring seek time, which only makes things worse). Three such accesses come
to 3/180 of a second, or 1/60 second. That loads the second- and third-level
page tables (64-bit architectures) and the page that contains a itself. If you
plopped down a pretty penny for your drive, you might have a 10,000 or
15,000RPM drive, and you'll only see a 6-9 millisecond delay. Can I get a
sarcastic "Woo Hoo" from the audience? That's a quarter of your time slice
gone in one postincrement.
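If you'd rather not do that arithmetic in your head, here is a minimal sketch
of it in C. The function names are mine, and it assumes the only cost is
average rotational latency (half a revolution per page); seek and transfer
times are ignored, so real numbers will be worse.

```c
#include <assert.h>

/* Average rotational latency in milliseconds: on average the platter has
   to turn half a revolution before the wanted sector reaches the head.
   Assumption: seek time and transfer time are ignored entirely. */
double avg_rotational_latency_ms(double rpm)
{
    assert(rpm > 0.0);
    double revolutions_per_second = rpm / 60.0;
    double ms_per_revolution = 1000.0 / revolutions_per_second;
    return ms_per_revolution / 2.0;
}

/* Delay for a chain of page loads that all have to come from disk,
   e.g. two levels of page table plus the data page itself. */
double fault_chain_delay_ms(double rpm, int pages_from_disk)
{
    return pages_from_disk * avg_rotational_latency_ms(rpm);
}
```

Calling fault_chain_delay_ms(5400.0, 3) gives roughly 16.7 milliseconds, while
10,000 and 15,000RPM drives come out at 9 and 6 milliseconds respectively.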
(As a side note, I love my SSD, but that's a story for later.)
Granted, once you have loaded this page, everything in it is in main memory and
will be quicker to access via L1, L2, etc., caches....
But.
The loading of this page (and the requisite page tables, which I'll ignore from
now on) is an implied product of the reference to the data it contains. This
load takes place when I need the data and not a cycle before. My machine will
attempt to be helpful while this sort of thing is going on by taking my process
off the CPU and scheduling something else, but I spend a lot of my life waiting
for my machine to finish the one thing that I am asking it to do right now.
If any of these tasks are inherently data-intensive, I find myself wondering
how much of that time is being spent waiting on page faults - and wondering if
I should replace every disk I own with an SSD and damn the expense.
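One cheap way to stop wondering is to ask the operating system. What follows
is a rough sketch, not a proper profiling setup: it uses the POSIX getrusage()
call to read the count of major page faults (the ones that needed actual I/O)
charged to the current process. The helper name is made up for illustration,
and how faithfully ru_majflt is maintained varies between systems.

```c
#include <sys/resource.h>

/* Returns the number of major page faults (faults that required I/O)
   charged to this process so far, or -1 if getrusage() fails.
   Hypothetical helper; only getrusage() and ru_majflt are real API. */
long major_faults_so_far(void)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    return usage.ru_majflt;
}
```

Sample it before and after the data-intensive part of a program and subtract;
the difference is the number of times that code had to wait on the disk.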
My regular readers (both of them) are well aware that I am prone to bitching
at random - and about random topics - but I do save my truly passionate rants
for those issues which seem to have blindingly obvious solutions. This is one
of them.
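As a painfully simple example, let's consider the program,

    #include <stdio.h>

    int main(void)
    {
        int i, j, r, g, b, brightest = 0, this_brightness;
        FILE *in = fopen("source.ppm", "rb");

        if (in == NULL) {
            perror("source.ppm");
            return 1;
        }
        /* Skip the header, assumed to sit on a single line. */
        while ((r = fgetc(in)) != '\n' && r != EOF)
            ;
        /* Scan 2048x2048 RGB pixels, 3 bytes each. */
        for (i = 0; i < 2048; i++)
            for (j = 0; j < 2048; j++) {
                r = fgetc(in);
                g = fgetc(in);
                b = fgetc(in);
                this_brightness = r + g + b;
                if (this_brightness > brightest)
                    brightest = this_brightness;
            }
        fclose(in);
        printf("%d\n", brightest);
        return 0;
    }
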
Okay, admittedly, it does next to nothing, but searching data is a common
enough operation that it will work for our discussion.
The file in question contains just over 12M of data (2048 x 2048 pixels at 3
bytes each, plus the header). Sticking with our 64-bit architecture, and
assuming 8K pages, that means that we'll experience 12M/8K or 1536 page faults.
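The page count is simple enough to sanity-check in code. This is just a sketch
of the division above; the function name is mine, and the 8K page size is an
assumption carried over from the text (plenty of 64-bit systems use 4K pages,
which would double the count).

```c
#include <assert.h>

/* Number of pages needed to hold `bytes` bytes, rounding up so that a
   partly used final page still counts. One fault per page is assumed,
   i.e. nothing is already resident and nothing is read ahead. */
unsigned long pages_touched(unsigned long bytes, unsigned long page_size)
{
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;
}
```

For the pixel data alone, pages_touched(2048UL * 2048UL * 3UL, 8192UL) comes
to 1536; with 4096-byte pages it would be 3072.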
I'M PUTTING THIS ENTRY ON PAUSE SO I CAN GET CONCRETE ANSWERS FROM OSX AND
DTRACE