This is true, but when the files are read-only, just reading directly from them with fread()/read()/etc. works pretty well. You do have to pay for a system call and a copy from the OS buffer cache into your user-space buffer, but OTOH when the page isn't in the buffer cache, the cost of reading the required data from storage is more predictable than the cost of faulting in each of the 4 KiB pages you're reading.
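
To make the "just read it" approach concrete, here's a rough sketch in C of pulling a byte range out of a file with pread() instead of mapping it. The helper name read_range is made up for illustration, and it assumes a POSIX system; it's one way to do it, not the only one.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from any library): read up to `len` bytes
 * starting at `offset` in `fd` into `buf`. Each pread() is one system
 * call plus one copy from the OS cache into the caller's buffer.
 * Returns the number of bytes read (less than `len` only at EOF),
 * or -1 on error with errno set by pread(). */
static ssize_t read_range(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);

    size_t done = 0;
    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real I/O error */
        }
        if (n == 0)
            break;          /* end of file */
        done += (size_t)n;  /* short read: keep going */
    }
    return (ssize_t)done;
}
```

You'd open the file once with open(path, O_RDONLY) and then call read_range(fd, buf, len, offset) for each chunk you need. Since pread() takes an explicit offset and doesn't move the file position, the same descriptor can be shared between threads.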