#include <stdio.h>
#include <stdlib.h>

enum type {
    T0 = 0,
    T1,
    T2,
    T3,
    T4,
    T5,
    TMAX = T5
};

struct typed_item {
    enum type t;
    void *data;
};

void test1(void)
{
    struct typed_item *list[TMAX];
    char *success_string = "Success";
    int i = 0;

    for (i = T0; i <= TMAX; i++) {
        list[i] = malloc(sizeof(struct typed_item));
        list[i]->t = i;
    }

    /* do tests on list */

    printf("%s\n", success_string);

    /* cleanup */
}

int main(void)
{
    test1();
}
Here's the console output:
sdann32# gcc stack.c && ./a.out
sdann32#
I expected "Success" but instead got a non-printable character and a newline. Can you spot the bug?
- struct typed_item *list[TMAX];
+ struct typed_item *list[TMAX+1];
This is a classic off-by-one error causing stack corruption. TMAX equals T5, which is 5, so list has five slots (indices 0 through 4), while the for loop runs i from 0 through 5. The loop therefore writes one entry past the end of the array and modifies the next variable on the stack. This was straightforward to track down in gdb. What was tricky is that I couldn't reproduce such a simple bug consistently. When I ran this test on a different machine, everything worked fine.
My company is in the process of converting from a 32-bit userspace to a 64-bit userspace. Because of the way our feature branches are managed in the source tree, I often switch between development on a 32-bit build and a 64-bit build. This bug only reproduced on the 32-bit platform, as seen above. The 64-bit system succeeded:
sdann64# gcc stack.c && ./a.out
Success
sdann64#
Since my initial belief was that this issue was caused by stack corruption, I dropped into gdb on the 64-bit machine to examine the test1() stack variables. To make changes in the stack variables easier to see, I initialized the array with a known value of 2.
sdann64# gcc stack.c -g3 -O0
sdann64# gdb ./a.out
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb) b test1
Breakpoint 1 at 0x4005d9: file stack.c, line 21.
(gdb) run
Breakpoint 1, test1 () at stack.c:21
21 struct typed_item *list[TMAX] = { (void*)2, (void*)2, (void*)2, (void*)2, (void*)2 };
(gdb) n
22 char *success_string = "Success";
(gdb) n
23 int i = 0;
(gdb) n
25 for (i = T0; i <= TMAX; i++) {
(gdb) p $rbp
$1 = (void *) 0x7fffffffe870
(gdb) p $rsp
$2 = (void *) 0x7fffffffe820
(gdb) x /10xg $rsp
0x7fffffffe820: 0x0000000800902040 0x0000000000000002
0x7fffffffe830: 0x0000000000000002 0x0000000000000002
0x7fffffffe840: 0x0000000000000002 0x0000000000000000
0x7fffffffe850: 0x000000000040065d 0x00000000006fb08c
0x7fffffffe860: 0x0000000000000001 0x0000000000000001
Once past the local variable initialization instructions, I check the frame base pointer ($rbp) and the stack pointer ($rsp) and see that there's a 0x50 difference between them, or 80 bytes. Listing this portion of the stack gives the local variables for the function.
(gdb) x /10xg $rsp
0x7fffffffe820: 0x0000000000000002 0x0000000000000002
0x7fffffffe830: 0x0000000000000002 0x0000000000000002
0x7fffffffe840: 0x0000000000000002 0x0000000000000000
0x7fffffffe850: 0x000000000040065d 0x00000000006fb08c
0x7fffffffe860: 0x0000000000000001 0x0000000000000001
In the color-annotated dump above, the red section holds saved registers and temporary variables. The three local variables are highlighted as follows:
struct typed_item *list[TMAX];
char *success_string = "Success";
int i = 0;
Just as I suspected, when I ask the compiler for 80 bytes of stack storage space, the allocation is padded to 96 bytes. These extra 16 bytes are what allow my accidental off-by-one overwrite to have no functional effect on the program. I've never run into this issue on 32-bit platforms, so I'm now mindful that 64-bit stack padding can mask bugs that appear on other architectures.
(All testing done with Intel x64 processor, FreeBSD 7.3, and gcc 4.2.)