Linux Fu: Mapping Files

If you use C or C++, you have probably learned how to open a file and read data from it. Usually, we read a character or a line at a time. At least, it seems that way. The reality is there are usually quite a number of buffers between you and the hard drive, so your request for a character might trigger a read for 2,048 characters and then your subsequent calls return from the buffer. There may even be layers of buffers feeding buffers.

A modern computer can do much better than reading with old calls like fgetc. Given that your program has a huge virtual address space and that your computer has a perfectly good memory management unit, you can ask the operating system to simply map the file into your memory space. Then you can treat it like any other array of characters and let the OS do the rest.

The operating system doesn’t necessarily read the entire file in at one time; it just reserves address space for you. Any time you hit a page that isn’t in memory, the operating system grabs it for you invisibly. Pages that you don’t use very often may be discarded and reloaded later. Behind the scenes, the OS does a lot so you can work on very large files with no real effort. The call that does it all is mmap.

Of course, there is always a catch. If you have a truly large file, you might have to do some work to map it partially and then map it again. Also, creating or extending files is a bit more work using mapping. Still, memory mapping is easy to do in most common cases and well worth learning about.

Decisions

The first thing you have to decide is whether you want to only read the file or both read and write it. If you only need read access, you can ask for a private mapping. That means you’ll get the file as it exists, and any changes you make go into your own private copy of the affected pages rather than into the file. Typically, you won’t change files that you open like this anyway unless you create a new file to write changes to yourself.

However, if you want to write to the file just like you write to memory you’ll need a shared mapping. This can be used to share data with other processes, but it also makes sure the file gets updates as you make them — well, sort of. We’ll talk about msync a bit later.
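
To make the difference concrete, here is a minimal sketch that is not part of the example programs. The helper name poke_first_byte is made up for illustration. It changes the first byte through a mapping and then reads the file back with pread to see whether the change reached it.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations such as pread */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the file with map_flags (MAP_PRIVATE or
   MAP_SHARED), overwrite its first byte in memory, unmap, then read the
   first byte back from the file itself.
   Returns that byte, or -1 on any error (including an empty file). */
int poke_first_byte(const char *path, int map_flags, char newval)
{
    int fd = open(path, O_RDWR);
    if (fd == -1) return -1;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) { close(fd); return -1; }

    size_t len = (size_t)st.st_size;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, map_flags, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }

    p[0] = newval;                               /* write through the mapping */
    if (map_flags & MAP_SHARED) msync(p, len, MS_SYNC);
    munmap(p, len);

    char c;
    ssize_t n = pread(fd, &c, 1, 0);             /* what does the file hold now? */
    close(fd);
    return n == 1 ? (unsigned char)c : -1;
}
```

With MAP_PRIVATE the function should hand back the file's original first byte, because the write only touched a private copy of the page. With MAP_SHARED it should hand back newval.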

Reading is Fundamental

You can find example code online for a simple word count program similar to wc (mmwc). Instead of using standard I/O calls (stdwd), it uses open to open the file for reading and then maps it into the program’s address space. We need to know the file size: a job for stat. Here’s part of the code:

int fd = open(filename, O_RDONLY); // open file
struct stat finfo;
char *b;
if ( fd == -1 )
  {
  perror(filename);
  return 2;
  }
if ( fstat( fd, &finfo ) == -1 ) // learn size of file
  {
  perror(filename);
  return 3;
  }
b=mmap( NULL, finfo.st_size, PROT_READ, MAP_PRIVATE, fd, 0 ); // map to memory
if ( b == MAP_FAILED )
  {
  perror("mmap");
  return 4;
  }

The arguments to mmap are simple. The first is an address. You almost never need to specify the address unless you are doing something exotic. If you do, there are many rules about how to set the address that vary based on platform. If you pass NULL, mmap will pick a spot for you. The next argument is the length, followed by the protection flags. In this case, PROT_READ tells the system we only care to read the file. After that come the mapping flags, where we ask for a private mapping, then the file descriptor and an offset from the start of the file.

Once this call succeeds, the b variable has a pointer to the entire file in memory. Printing it out would be as easy as:


size_t len = finfo.st_size; // number of bytes that were mapped
while (len--) putchar(*b++);
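
For a fuller picture, the sketch below strings the whole life cycle together: open, fstat, mmap, use, and munmap. It is not the code from the example programs, and the name dump_file is made up for illustration. Two details are easy to miss: mmap refuses a length of zero, so an empty file needs its own case, and you can close the descriptor as soon as the mapping exists.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy a whole file to out_fd through a read-only
   mapping. Returns the number of bytes written, or -1 on error. */
long long dump_file(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  /* mmap rejects length 0 */

    size_t len = (size_t)st.st_size;
    char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                        /* the mapping stays valid after close */
    if (p == MAP_FAILED) return -1;

    size_t done = 0;
    while (done < len) {              /* write() may accept less than asked */
        ssize_t n = write(out_fd, p + done, len - done);
        if (n <= 0) { munmap(p, len); return -1; }
        done += (size_t)n;
    }
    munmap(p, len);
    return (long long)done;
}
```

Calling dump_file(filename, STDOUT_FILENO) would behave like a very small cat.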

The following code implements a simple word counting engine. Note the function do_work never uses any call that would relate to the input file. It simply processes data in memory:


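while (len--) // process each character
  {
  unsigned char c = *b++; // isspace needs an unsigned char value
  if ( c == '\n' ) l++; // count lines
  if ( isspace(c) )
    {
    if (state==1) // a word just ended
      {
      w++;
      state=0;
      }
    }
  else state=1; // inside a word
  }
if (state==1) w++; // count a final word that has no trailing whitespace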
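
Because the counting loop only ever sees memory, you can exercise it without any file at all. The sketch below repackages the same idea as a function that takes a buffer and a length. The struct counts type and the name count_buffer are invented for this illustration and do not come from the example program.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type for the sketch. */
struct counts { unsigned long lines, words, chars; };

/* Count lines, words, and characters in len bytes starting at b.
   Works the same on a mapped file or on an ordinary array. */
struct counts count_buffer(const char *b, size_t len)
{
    struct counts c = { 0, 0, len };
    int in_word = 0;

    while (len--) {
        unsigned char ch = (unsigned char)*b++;
        if (ch == '\n') c.lines++;
        if (isspace(ch)) {
            if (in_word) { c.words++; in_word = 0; }  /* a word just ended */
        } else {
            in_word = 1;
        }
    }
    if (in_word) c.words++;   /* last word had no trailing whitespace */
    return c;
}
```

You could call it as count_buffer(b, finfo.st_size) on the mapping, or as count_buffer("a b", 3) on a string literal in a quick test.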

Writing and the msync Consideration

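Writing is a little trickier because the mapping is shared and the file must already be at least as large as the region you map. Storing to mapped memory cannot make the file grow. You might, for example, use lseek to set a file position and just write something at the end so the file is already the right size before you map it.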
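
Here is one way the sizing step might look. Instead of the lseek and write trick, this sketch swaps in ftruncate, which sets the file length directly. The helper name write_via_map is hypothetical, and the code is only an illustration, not something taken from the example programs.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) path, size it to hold msg,
   and fill it through a shared mapping. Returns 0 on success, -1 on error. */
int write_via_map(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    if (len == 0) return -1;                 /* cannot map zero bytes */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;

    /* Give the file its final size before mapping it. */
    if (ftruncate(fd, (off_t)len) == -1) { close(fd); return -1; }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED) return -1;

    memcpy(p, msg, len);                     /* "write" the file */
    int rc = msync(p, len, MS_SYNC);         /* push the pages out */
    if (munmap(p, len) == -1) rc = -1;
    return rc;
}
```

The order matters here: size the file first, then map it. Touching a mapped page that lies entirely past the end of the file raises SIGBUS.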

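Before Linux 2.6.19, you had to call msync to make sure the file wrote to disk. Since then, the kernel tracks dirty pages in a shared mapping and writes them back on its own, so your changes reach the file eventually without any help. You still want msync with MS_SYNC when you need to know the data has actually been written at a particular point, and if you think your code may run on older kernels, it is wise to use msync when writing to a mapped file anyway.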
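
One wrinkle with msync is that the address you pass has to sit on a page boundary. If you only changed a few bytes in the middle of a mapping, you need to round down first. The sketch below shows one way to do that. The name sync_range is hypothetical, and it assumes base is the address mmap returned.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: synchronously flush bytes [off, off + len) of a
   shared mapping that starts at base (as returned by mmap).
   Returns 0 on success, -1 on error. */
int sync_range(char *base, size_t off, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;

    /* msync wants a page-aligned address, so back up to the page start
       and lengthen the range by the same amount. */
    size_t start = off - (off % (size_t)page);
    return msync(base + start, len + (off - start), MS_SYNC);
}
```

For a whole-file flush, plain msync(b, finfo.st_size, MS_SYNC) does the job, since the address mmap hands back is already aligned.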

In the example program mmup.c, we sidestep the size problems by working on the file in place. I also put in a call to msync for good measure. The file has to be opened for both reading and writing this time, and the mmap line is similar to the previous version:


b = mmap( NULL, finfo.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0 );

Look at how simple the conversion code is:


int do_work( char *b, unsigned int len )
  {
  while (len--)
    {
    *b=toupper((unsigned char)*b); // toupper needs an unsigned char value
    b++;
    }
  return 0;
  }

Isn’t that easy?
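
If you want to see all the pieces in one place, the sketch below wraps the open, map, convert, sync, and unmap steps into a single function. It is written for this discussion and is not the actual mmup.c. The name upcase_file is made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: convert a file to upper case in place through a
   shared mapping. Returns 0 on success, -1 on error. */
int upcase_file(const char *path)
{
    int fd = open(path, O_RDWR);   /* PROT_WRITE with MAP_SHARED needs O_RDWR */
    if (fd == -1) return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }   /* nothing to convert */

    size_t len = (size_t)st.st_size;
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (b == MAP_FAILED) return -1;

    for (size_t i = 0; i < len; i++)
        b[i] = (char)toupper((unsigned char)b[i]);

    int rc = msync(b, len, MS_SYNC);   /* for good measure */
    if (munmap(b, len) == -1) rc = -1;
    return rc;
}
```

The file size never changes here, which is why this case is so painless.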

Compilers and libraries are pretty good these days, so I’ll leave performance measurements as an exercise for the interested. A lot will depend on how good your disk I/O system is, too. In theory, the memory mapped I/O should be faster than a program that is really doing disk I/O. However, libraries may be doing buffering and even mapping behind your back anyway, so the performance difference could be very slight. But the simplification in the code is a big plus regardless of the performance.

Perfect?

This seems great, but you should be aware of some potential issues. We are used to thinking that if we read some data from disk and nothing goes wrong, we can forget about it. But mapping a file isn’t the same as reading it. Suppose you map a file on a network drive and maybe read some pages out of it. Then the network goes down. A new page read will cause a fault because the underlying file is now gone. You can catch the SIGBUS signal that indicates this, but then what will you do?
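
One possible answer, sketched below, is to guard the code that touches the mapping with sigsetjmp and jump out of the SIGBUS handler with siglongjmp, so a failed page read turns into an ordinary error return. This is only an illustration under some assumptions: the names checked_sum and on_sigbus are invented, the code is not thread-safe, and it only makes sense when the guarded region does nothing but plain memory access.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf bus_jump;          /* where to resume after a SIGBUS */

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_jump, 1);         /* abandon the faulting access */
}

/* Hypothetical helper: add up len bytes at p. Returns 0 and stores the
   sum in *out, or returns -1 if touching the mapping raised SIGBUS. */
int checked_sum(const volatile unsigned char *p, size_t len, unsigned long *out)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) == -1) return -1;

    int rc = 0;
    if (sigsetjmp(bus_jump, 1) == 0) {       /* 1: also save the signal mask */
        unsigned long sum = 0;
        for (size_t i = 0; i < len; i++) sum += p[i];
        *out = sum;
    } else {
        rc = -1;                             /* we got here via the handler */
    }
    sigaction(SIGBUS, &old, NULL);           /* restore the previous handler */
    return rc;
}
```

You can provoke the same fault without a flaky network by mapping a file and then truncating it to zero length before reading the mapping. Whether recovering like this is wise in a real program is a separate question. Often the honest answer is to report the error and exit.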

Of course, if you still support 32-bit operating systems, you may find you quickly run out of address space if you process large files. True, you can make a smaller window with an offset in the file, but it is more work.
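
The windowed approach might look something like the sketch below. The offset passed to mmap has to be a multiple of the page size, so the window size is checked against sysconf(_SC_PAGESIZE). The function name count_byte_windowed is hypothetical, and counting a single byte value was chosen because it needs no state carried from one window to the next.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count bytes equal to target in an open file,
   mapping at most window bytes at a time. window must be a nonzero
   multiple of the page size. Returns the count, or -1 on error. */
long long count_byte_windowed(int fd, char target, size_t window)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || window == 0 || window % (size_t)page != 0) return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) return -1;

    long long hits = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        off_t left = st.st_size - off;
        size_t len = left < (off_t)window ? (size_t)left : window;

        /* off is always a multiple of window, hence of the page size */
        const char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) return -1;

        for (size_t i = 0; i < len; i++)
            if (p[i] == target) hits++;

        munmap((void *)p, len);      /* release this window before the next */
    }
    return hits;
}
```

A word counter done this way would also have to remember whether the previous window ended in the middle of a word, which is part of the extra work mentioned above.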

On the other hand, mmap makes many file handling programs simpler and easier to write. It is worth having in your arsenal of Linux programming tricks.

7 thoughts on “Linux Fu: Mapping Files”

  1. The correct title for this article is “how to create security issues in your code without even trying”.

    If NFS is broken then your process will just hang forever waiting for it. If it’s a hard NFS mount you will need to reboot if you want to kill the process. If it’s a soft mount you can use ^C or kill to kill the process.

    If the file is big you will be putting a strain on the memory allocator if you mmap the whole file at once. If the file is small then there is no benefit from this optimization and you might as well use fprintf instead of inserting stuff into the buffer.

    It’s really easy to write bad code in these situations that will overrun the buffer. Again if you don’t believe me, go look at CVE reports and see that most of them are “incorrect code overruns allocated buffer” or some such. This should make it clear to you that even the best professional programmers cannot handle the simple task of managing buffer writes. For best results, stick to streams and fprintf and avoid “clever” code like this.

  2. mmap can be the best and the worst…
    and IT will choose which, unless you put so many checks that it’s unreadable.

    mmap is great for the linux framebuffer.

    OTOH I’ve seen programs badly blow computers because they wanted to mmap() a huuuuge file…

  3. I had to quickly check the date to see if it was 1st April…

    Without performance timings, and looking at use cases, it’s invariably a stupid idea to do what this article is describing. Yes, in some cases, it’s a good idea – but in most it isn’t.

    i.e. in one of the most common use cases – sequentially reading a whole file with light data processing – it’s not only simpler but faster to just do sequential reads (well, it has been on all the systems I’ve tested..)

    1. I totally agree. If you have a need for speed it may be a necessary evil. Perhaps an audio editor or a video editor or something of that ilk, but for the bread and butter dealing with files you are just asking for problems and not really doing much if any better than using regular time tested file manipulation calls.
