C99’s VLAs are evil

At first glance, the ability to define an array whose length is determined at runtime from a variable seems like a good idea. Hey, it’s very clean and easy to understand, and all of the normal array operations work ((Even sizeof works, though that required a distasteful exception to the normal sizeof rules. Before VLAs, sizeof was always evaluated at compile time, and the resulting value was always a compile-time constant. When applied to a VLA, that obviously cannot be the case.)). But there is one major flaw: there is no way to deal with errors.

Variable Length Arrays in C99 are defined just like any other array, except that the length doesn’t need to be a compile-time constant:


void foo(int size) {
    int array[size];
    for(int i=0; i<size; ++i) {
        array[i] = i;
    }
    ...
}

Calling the above function with any reasonable value will do the obvious thing: the array will be sized appropriately and will contain sequential numbers. sizeof(array) will even evaluate to size*sizeof(int). For the common case, everything works, and everybody is happy. But what happens when size is too large for the array to fit on the stack? Who knows. ((I’m not joking. The official term is “undefined behavior”. In such a case, the C standard doesn’t specify anything about what happens.))
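To make the sizeof point concrete, here is a minimal sketch. The function name vla_bytes is made up for illustration, and it assumes a compiler that supports VLAs and a count that is small and positive.

```c
#include <stddef.h>

/* Hypothetical helper: reports the size in bytes of a VLA of `count` ints.
   Assumes count is small and positive; anything else is undefined behavior. */
size_t vla_bytes(int count) {
    int array[count];
    /* For a VLA, sizeof is evaluated at run time, not at compile time. */
    return sizeof array;
}
```

Under those assumptions, vla_bytes(10) gives the same value as 10 * sizeof(int).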

In the most common implementation of VLAs, the compiler literally decrements the stack pointer by the array’s size in bytes and goes on its merry way, without regard to whether or not the stack pointer now points to an invalid address (or worse, a valid address somewhere inside of your program’s heap). If you’re lucky, and the stack pointer points to an invalid address, your program will likely crash. On the other hand, if you’re not so lucky, the seemingly innocuous function above will start trashing things in your program’s memory space; you may not catch it, or you may get a strange, untraceable crash hours later. Needless to say, this can be quite difficult to debug.

The solution is simply not to use this feature of C99. If you’ve got a higher-level abstraction (such as std::vector in C++, or NSData or NSArray in Cocoa), use that. If you really want to stick with pure, standard C, use malloc()/free() ((Don’t be tempted to use alloca(); it is not a standard C function, and it has exactly the same issue that VLAs do.)):


void foo(int size) {
    int *array = malloc(size * sizeof *array);
    
    if(!array) {
        //Error out here
        return;
    }
    
    for(int i=0; i<size; ++i) {
        array[i] = i;
    }
    ...
    
    free(array);
}
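The version above still trusts size to be positive and small enough that size * sizeof *array does not wrap around. A slightly more defensive sketch might look like the following; the name foo_checked and the 0 / -1 return convention are assumptions added for illustration, not part of the original function.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variant of foo(): returns 0 on success, -1 on failure. */
int foo_checked(int size) {
    /* Reject non-positive sizes and byte counts that would overflow size_t. */
    if (size <= 0 || (size_t)size > SIZE_MAX / sizeof(int)) {
        return -1;
    }

    int *array = malloc((size_t)size * sizeof *array);
    if (!array) {
        return -1;   /* allocation failed; the caller decides what to do */
    }

    for (int i = 0; i < size; ++i) {
        array[i] = i;
    }

    free(array);
    return 0;
}
```

The point is only that every failure path is visible to the caller, which is exactly what a VLA cannot offer.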

12 Comments

Whitney Young, April 8th, 2009 at 14:20

Thanks for sharing this. I’ve used VLAs in the past without even thinking about the implementation details – figured it was one of those ‘it just works’ situations.

You mention that this is a problem for the most common implementation of VLAs… do you know which compilers that does/doesn’t include?

Clark Cox, April 9th, 2009 at 08:34

I’d be very surprised to find a compiler that didn’t implement them this way. In order to do anything sensible, the compiler would have to:
1) Know how much stack space was available, so it could determine whether or not the allocation would run off the end thereof.
2) Have some way of failing gracefully, which would likely require some language-level exception mechanism, which C doesn’t have.

SNF, May 7th, 2009 at 07:48

You know, this really isn’t a new problem; it applies to compile-time-constant-length stack arrays as well. How do you know there’s room on the stack to allocate, say, int[1024]? You don’t.

You *assume* the runtime has allocated enough room for the stack, and that your function isn’t being called at a point when the stack has grown to the point where your array doesn’t fit.

The solution is the same in both cases: Use the stack responsibly. Don’t use it to store enormous arrays, and only use VLAs when the length is being passed or calculated from trusted data. Also, assert() is your friend.
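A minimal sketch of that advice, assuming a cap of 1024 elements (MAX_STACK_INTS and sum_first are names invented here, and the cap is an arbitrary choice). Note that assert() is compiled out when NDEBUG is defined, so this only protects debug builds.

```c
#include <assert.h>

#define MAX_STACK_INTS 1024   /* hypothetical cap; pick one that suits your stack */

/* Fills a VLA with 0..size-1 and returns the sum of its elements. */
int sum_first(int size) {
    /* Check the length before the VLA is declared, not after. */
    assert(size > 0 && size <= MAX_STACK_INTS);

    int array[size];
    int total = 0;
    for (int i = 0; i < size; ++i) {
        array[i] = i;
        total += array[i];
    }
    return total;
}
```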

GRM, July 4th, 2009 at 14:36

I agree with SNF. This is not a new problem at all. One should not allocate large work arrays in functions in this way. They should allocate them dynamically. If one does not want to worry about calling free() for every malloc() they can use alloca().

C99 VLAs do have a benefit for multidimensional arrays. For example:

void func(int n, double data[][n]) {

}

This cannot be done in pre-C99 C unless the number of columns is a compile-time constant. This can have huge benefits in scientific programming when pointer arrays cannot be used effectively.
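As a rough illustration of that parameter style, here is a sketch that sums an m-by-n matrix; the function name matrix_sum and the extra row-count parameter m are assumptions added for the example.

```c
/* Sums an m-by-n matrix. `data` is a pointer to rows of n doubles,
   so data[i][j] indexes correctly for any n chosen at run time. */
double matrix_sum(int m, int n, double data[][n]) {
    double total = 0.0;
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            total += data[i][j];
        }
    }
    return total;
}
```

A caller with double a[2][3] would simply write matrix_sum(2, 3, a). No array is allocated inside the function; only the pointer arithmetic depends on n.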

Clark Cox, July 4th, 2009 at 15:30

The case of using the VLA in the function parameter is a different issue. It won’t lead to the same kind of stack corruption, as you aren’t allocating that array on the stack, you’re merely receiving a pointer to it. In that case, you’re free to validate the value of (n) before using the array; you don’t have that option for arrays with automatic storage.

The way C handles arrays and pointers in general is simultaneously one of its greatest strengths and one of its greatest weaknesses. VLAs serve to obfuscate the issue, with very little benefit, and make it even easier to crash your program.

MRG, July 8th, 2009 at 22:24

Sure, that’s right, passing the VLA to the function will not lead to a stack overflow.

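Here is another example, using malloc to allocate storage for a pointer to a VLA, that is sometimes useful:

void func(int m, int n) {
    int i, j;
    double (*data)[n];   /* pointer to rows of n doubles */

    /* sizeof *data is n * sizeof(double), computed at run time */
    data = malloc(m * sizeof *data);
    if (!data) {
        return;   /* allocation failed */
    }

    for (i = 0; i < m; ++i) {
        for (j = 0; j < n; ++j) {
            data[i][j] = 0.;
        }
    }

    /* do more stuff with data here */

    free(data);
}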

MRG, July 8th, 2009 at 22:24

I disagree about VLAs having very little benefit. For numerical programming with a lot of matrices, VLAs can result in much cleaner code. The lack of VLAs in pre-C99 C is one reason C was never widely used for numerical programming.

NoN@me, October 10th, 2009 at 07:27

All normal compilers do stack allocations correctly, not by simply decrementing the stack pointer. For example (gcc (GCC) 3.4.2 (mingw-special)):
.00401730: 51 push ecx
.00401731: 89E1 mov ecx,esp
.00401733: 83C108 add ecx,008
.00401736: 3D00100000 cmp eax,000001000
.0040173B: 7210 jb .00040174D (1)
.0040173D: 81E900100000 sub ecx,000001000
.00401743: 830900 or d,[ecx],000
.00401746: 2D00100000 sub eax,000001000
.0040174B: EBE9 jmps .000401736 (2)
.0040174D: 29C1 sub ecx,eax
.0040174F: 830900 or d,[ecx],000
.00401752: 89E0 mov eax,esp
.00401754: 89CC mov esp,ecx
.00401756: 8B08 mov ecx,[eax]
.00401758: 8B4004 mov eax,[eax][04]
.0040175B: FFE0 jmp eax

cyberscorpio, February 25th, 2010 at 18:07

How about doing it like this:

void foo(int size) {
    int asize = size;
    if (asize > 1024)
        asize = 1;
    int buffer[asize];
    int *array = buffer;
    if (asize != size)
        array = malloc(size * sizeof *array);

    if(!array) {
        //Error out here
        return;
    }

    for(int i=0; i<size; ++i) {
        array[i] = i;
    }

    if (array != buffer)
        free(array);
}
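A variation on the same idea swaps the VLA for a fixed-size stack buffer, so that no VLA is involved at all. This is a substitution made for illustration, not the code proposed above; foo_hybrid, STACK_INTS and the 0 / -1 return convention are all assumed names and choices.

```c
#include <stdint.h>
#include <stdlib.h>

enum { STACK_INTS = 1024 };   /* hypothetical threshold, mirroring the 1024 above */

/* Returns 0 on success, -1 on a bad size or a failed allocation. */
int foo_hybrid(int size) {
    int buffer[STACK_INTS];   /* fixed size, known at compile time */
    int *array = buffer;

    if (size <= 0) {
        return -1;
    }
    if (size > STACK_INTS) {
        /* Too big for the stack buffer: fall back to the heap. */
        if ((size_t)size > SIZE_MAX / sizeof *array) {
            return -1;
        }
        array = malloc((size_t)size * sizeof *array);
        if (!array) {
            return -1;
        }
    }

    for (int i = 0; i < size; ++i) {
        array[i] = i;
    }

    if (array != buffer) {
        free(array);
    }
    return 0;
}
```

The trade-off is that the small case always reserves the full buffer on the stack, in exchange for a stack footprint that is known at compile time.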

cyberscorpio, February 25th, 2010 at 18:11

And the code can be wrapped in a macro to simplify things :)

#define GET_BUFFER(size) ….
#define FREE_BUFFER(…)

Rohar, August 3rd, 2010 at 19:49

Compilers like gcc are happy even if you pass -1 as the length of a variable-length array. In this case, sizeof(array) is 0xFFFFFFFC.

[...] III, a Cocoa developer and fairly frequent contributor to the cocoa-dev mailing list, points out a major flaw in C99’s variable length arrays (VLAs) on his blog today. While it is nice to be able to allocate arrays on the stack, I’m with Clark on this one: use [...]
