C99′s VLAs are evil

April 7, 2009
Tags: ,

While, at first glance, the ability to define an array, whose length is determined at runtime from a variable, seems like a good idea. Hey, it’s very clean and easy to understand, and all of the normal array operations work[1]; there is one major flaw—there is no way to deal with errors.

Variable Length Arrays, in C99 are defined just like any other array, except that the length doesn’t need to be a compile-time constant:


void foo(int size) {
    int array[size];
    for(int i=0; i<size; ++i) {
        array[i] = i;
    }
    ...
}

Calling the above function with any reasonable value will do the obvious thing, the array will be sized appropriately, will contain sequential numbers. sizeof(array) will even return size*sizeof(int). For the common case, everything works, and everybody is happy. But what happens when a size is too large to fit on the stack? Who knows.[2]

In the most common implementation of VLAs the compiler literally decrements the stack pointer by size and goes on it’s merry way, without regard to whether or not the stack pointer now points to an invalid address (or worse, a valid address somewhere inside of your program’s heap). If you’re lucky, and the stack pointer points to an invalid address, your program will likely crash. On the other hand, if you’re not so lucky, the seemingly innocuous function above will start trashing things in your program’s memory space; you may not catch it, you may have a strange, untraceable crash hours later, etc.. Needless to say, this can be quite difficult to debug.

The solution is to simply not use this feature of C99. If you’ve got a higher-level abstraction (such as std::vector in C++, NSData or NSArray in Cocoa) use that. If you really want to stick with pure, standard C, use malloc()/free()[3]:


void foo(int size) {
    int *array = malloc(size * sizeof *array);
    
    if(!array) {
        //Error out here
        return;
    }
    
    for(int i=0; i<size; ++i) {
        array[i] = i;
    }
    ...
    
    free(array);
}

  1. Even sizeof works, though that required a distasteful exception to the normal sizeof rules. Before VLAs, sizeof was always evaluated at compile time, and the resulting value was always a compile-time constant. When applied to VLAs, this obviously cannot be. []
  2. I’m not joking. The official term is “undefined behavior”. In such a case, the C standard doesn’t specify anything about what happens. []
  3. Don’t be tempted to use alloca(), it is not a standard C function, and has exactly the same issue that VLAs do []

12 Responses to “C99′s VLAs are evil”

  1. Thanks for sharing this. I’ve used VLAs in the past without even thinking about the implementation details – figured it was one of those ‘it just works’ situations.

    You mention that this is a problem for the most common implementation of VLAs… do you know which compilers that does/doesn’t include?

  2. I’d be very surprised to find a compiler that didn’t implement them this way. In order to do anything sensible, the compiler would have to:
    1) Know how much stack space was available, so it could determine whether or not the allocation would run off the end thereof.
    2) Have some way of failing gracefully, which would likely require some language-level exception mechanism, which C doesn’t have.

  3. You know, this really isn’t a new problem; it applies to compile-time-constant-length stack arrays as well. How do you know there’s room on the stack to allocate, say, int[1024]? You don’t.

    You *assume* the runtime has allocated enough room for the stack, and that your function isn’t being called at a point when the stack has grown to the point where your array doesn’t fit.

    The solution is the same in both cases: Use the stack responsibly. Don’t use it to store enormous arrays, and only use VLAs when the length is being passed or calculated from trusted data. Also, assert() is your friend.

  4. I agree with SNF. This is not a new problem at all. One should not allocate large work arrays in functions in this way. They should allocate them dynamically. If one does not want to worry about calling free() for every malloc() they can use alloca().

    C99 VLA does have a benefit for multi dimensional arrays. For example:

    void func(int n, double data[][n]) {

    }

    This cannot be done in C unless the number of columns is static. This can have huge benefits in scientific programming when pointer arrays cannot be used effectively.

  5. The case of using the VLA in the function parameter is a different issue. It won’t lead to the same kind of stack corruption, as you aren’t allocating that array on the stack, you’re merely receiving a pointer to it. In that case, you’re free to validate the value of (n) before using the array; you don’t have that option for arrays with automatic storage.

    The way C handles arrays and pointers in general is simultaneously one of its greatest strengths and one of its greatest weaknesses. VLA’s serve to obfuscate the issue, with very little benefit, and make it even easier to crash your program.

  6. Sure, that’s right, passing the VLA to the function will not lead to a stack overflow.

    Here is another example of allocating a VLA with malloc that is sometimes useful

    void func(int m, int n) {

    int i, j;
    double (*data)[n];

    data = malloc(m * n * sizeof(double));

    for (i = 0; i < m; ++i) {
    for (j = 0; j < n; ++j) {
    data[i][j] = 0.;
    }
    }

    /* do more stuff with data here */

    free((void *) data);
    }

  7. I disagree about the VLA having very little benefit. For numerical programming with a lot of matrices VLA’s can result in much cleaner code. The lack of VLA’s in pre C99 is one reason C was never widely used for numerical programming.

  8. All normal compilers do stack allocations correctly, not by simply decrementing the stack pointer. For example (gcc (GCC) 3.4.2 (mingw-special)):
    .00401730: 51 push ecx
    .00401731: 89E1 mov ecx,esp
    .00401733: 83C108 add ecx,008 ;”◘”
    .00401736: 3D00100000 cmp eax,000001000 ;” ► ”
    .0040173B: 7210 jb .00040174D —–↓ (1)
    .0040173D: 81E900100000 sub ecx,000001000 ;” ► ”
    .00401743: 830900 or d,[ecx],000 ;” ”
    .00401746: 2D00100000 sub eax,000001000 ;” ► ”
    .0040174B: EBE9 jmps .000401736 —–↑ (2)
    .0040174D: 29C1 sub ecx,eax
    .0040174F: 830900 or d,[ecx],000 ;” ”
    .00401752: 89E0 mov eax,esp
    .00401754: 89CC mov esp,ecx
    .00401756: 8B08 mov ecx,[eax]
    .00401758: 8B4004 mov eax,[eax][04]
    .0040175B: FFE0 jmp eax

  9. How about to do so:

    void foo(int size) {
    int asize = size;
    if (asize > 1024)
    asize = 1;
    int buffer[asize];
    int *array = buffer;
    if (asize != size)
    array = malloc(size * sizeof *array);

    if(!array) {
    //Error out here
    return;
    }

    for(int i=0; i array[i] = i;
    }

    if (array != buffer)
    free(array);
    }

  10. And the code can be wrapped in macro to simplify the works :)

    #define GET_BUFFER(size) ….
    #define FREE_BUFFER(…)

  11. Compilers like gcc is happy even if you pass -1 to a variable-length array. In this case, sizeof(array) is 0xFFFFFFFC.

  12. [...] III, a Cocoa developer and fairly frequent contributor to the cocoa-dev mailing list, points out a major flaw in C99’s variable length arrays (VLAs) on his blog today. While it is nice to be able to allocate arrays on the stack, I’m with Clark on this one: use [...]

Leave a Reply