Saturday, March 28, 2015

32-bit to 64-bit Conversion Woes with strtoul()

Proper error checking when using the strtol() family of functions is notoriously difficult.  See this Stack Overflow thread for an explanation.

Focusing only on strtoul() here, the correct error checking in most cases is:

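#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

unsigned long
mystrtoul(const char *str, bool *success)
{
    unsigned long val = 0;
    char *eptr = NULL;

    errno = 0;    /* strtoul() never clears errno, so reset it first */
    *success = true;

    val = strtoul(str, &eptr, 0);
    if (eptr == str || *eptr != '\0' ||             /* nothing parsed, or trailing junk */
        (val == ULONG_MAX && errno == ERANGE) ||    /* out of range for unsigned long */
        (val == 0 && errno == EINVAL))              /* conversion rejected */
        *success = false;

    return val;
}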

That is:
  • IF your end pointer points to the beginning of your string, you've got a problem
  • IF your end pointer doesn't point to the end of your string, you've got a problem
  • IF your output is ULONG_MAX, the max end of the range (a potentially valid value), AND errno is ERANGE, you've got a problem
  • IF your output is 0 (a potentially valid value) AND errno is EINVAL, you've got a problem
  • ELSE congratulations, you've converted a string to an integer
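
To make the checklist above concrete, here is a minimal sketch that reports which of the checks fired instead of collapsing them into a single bool.  The enum and the name classify_strtoul() are hypothetical, invented for this illustration; only strtoul(), errno, ERANGE, EINVAL, and ULONG_MAX come from the C library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical result codes, one per bullet in the list above. */
enum parse_result {
    PARSE_OK,
    PARSE_NO_DIGITS,     /* end pointer still at the start of the string */
    PARSE_TRAILING_JUNK, /* end pointer stopped before the terminating NUL */
    PARSE_OUT_OF_RANGE,  /* ULONG_MAX returned together with ERANGE */
    PARSE_INVALID        /* 0 returned together with EINVAL */
};

/* Hypothetical helper: same checks as mystrtoul(), reported individually. */
enum parse_result
classify_strtoul(const char *str, unsigned long *out)
{
    char *eptr = NULL;

    errno = 0;    /* strtoul() only ever sets errno, it never clears it */
    *out = strtoul(str, &eptr, 0);

    if (eptr == str)
        return PARSE_NO_DIGITS;
    if (*eptr != '\0')
        return PARSE_TRAILING_JUNK;
    if (*out == ULONG_MAX && errno == ERANGE)
        return PARSE_OUT_OF_RANGE;
    if (*out == 0 && errno == EINVAL)
        return PARSE_INVALID;
    return PARSE_OK;
}
```

With this sketch, an empty string should come back as PARSE_NO_DIGITS, "12abc" as PARSE_TRAILING_JUNK, and a number too large for any unsigned long as PARSE_OUT_OF_RANGE.
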
Understandably, a lot of people get this error checking wrong when using this function. I've reviewed source code I work on and found mistakes.  But even if you got all of this right, you may still have gotten it wrong.

If you originally wrote this code for a 32-bit system, with the intention of converting and storing a 32-bit number, then what you wrote is correct.  If you then recompile this correct code on a 64-bit system, it becomes incorrect.

That's right: with no modifications (and no compiler warnings), your carefully reviewed integer conversion function goes from right to (dangerously) wrong.

For example, pass "4294967296" as input and assign the result to an unsigned int:

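#include <stdbool.h>
#include <stdio.h>

int
main(void)
{
    bool success = false;
    unsigned int val = 0;

    /* The unsigned long result is implicitly narrowed to unsigned int here. */
    val = mystrtoul("4294967296", &success);
    printf("%s %u\n", success ? "success" : "failure", val);

    return 0;
}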

Result:
  • On 32-bit Linux:  failure 4294967295 (strtoul() returns ULONG_MAX and sets errno to ERANGE)
  • On 64-bit Linux:  success 0
The 32-bit build catches the integer overflow and errors out.  In the 64-bit build, 4294967296 fits in an unsigned long, so strtoul() succeeds, and the assignment to a 32-bit unsigned int then silently rolls back around to 0.  So your output is wrong and the usual possible errors and security holes with integer overflows apply:
  • a size of 4294967296 becomes 0, malloc() allocates 0 bytes, and a buffer overflow is created
  • a uid_t of 4294967296 becomes 0 and a user is root
And strings passed in to strtoul() are very often user-controlled input.  That's usually why they need converting in the first place: from an on-the-wire, CLI, or on-disk string format to an in-memory integer.
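
The wraparound described above is plain modulo-2^32 truncation.  As a small illustration, the sketch below uses the fixed-width types from <stdint.h> so that it behaves the same on any platform; narrow_to_u32() is a hypothetical name standing in for the implicit unsigned long to unsigned int assignment on an LP64 system.

```c
#include <stdint.h>

/* Hypothetical helper: mimics assigning a 64-bit unsigned long
   to a 32-bit unsigned int on an LP64 system. */
uint32_t
narrow_to_u32(uint64_t wide)
{
    return (uint32_t)wide;    /* keeps only the low 32 bits: wide mod 2^32 */
}
```

Because 4294967296 is exactly 2^32, it narrows to 0, and 4294967297 narrows to 1.  The conversion is well defined for unsigned types, which is why the compiler is not required to say anything about it.
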

This isn't strictly a strtoul() problem.  This is an LP64 data model problem.  Any code that assumes a long is 32 bits needs to be checked when moving from a 32-bit to a 64-bit runtime.  This is just a particularly tricky example of the trade-offs made by LP64.
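
One way to make that check explicit is to stop relying on the width of long and compare the result against the width you actually intend to store.  The following is a minimal sketch, not a definitive fix: mystrtou32() is a hypothetical name, and it assumes the destination is meant to be exactly 32 bits (uint32_t from <stdint.h>).

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse str into a 32-bit value,
   whatever the size of unsigned long happens to be. */
bool
mystrtou32(const char *str, uint32_t *out)
{
    unsigned long val;
    char *eptr = NULL;

    errno = 0;
    val = strtoul(str, &eptr, 0);
    if (eptr == str || *eptr != '\0' ||
        (val == ULONG_MAX && errno == ERANGE) ||
        (val == 0 && errno == EINVAL))
        return false;

    /* The extra check: on LP64 this rejects values that strtoul()
       accepted but that do not fit in 32 bits.  Where unsigned long
       is itself 32 bits, this can never be true. */
    if (val > UINT32_MAX)
        return false;

    *out = (uint32_t)val;
    return true;
}
```

With this sketch, "4294967296" is rejected on both 32-bit and 64-bit builds: by the ERANGE check on the former and by the explicit range check on the latter.  Some compilers may warn that the comparison is always false on 32-bit targets; that warning is harmless here.
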