This document describes a 64-bit error that can occur in C code when malloc is called without including the stdlib.h header file. Without the header, the compiler assumes that malloc returns an int rather than a 64-bit pointer, so the stored pointer can be corrupted once the returned address no longer fits in a 32-bit signed int (addresses of 2 GB and above). The error is demonstrated with code that allocates and uses three arrays of 1 GB each. Removing the header inclusion makes the program crash after launch because of invalid pointer values.
A nice 64-bit error in C
Author: Andrey Karpov
Date: 19.11.2009
In C, you may call functions without declaring them first. Note that I am talking about C,
not C++. Of course, this ability is very dangerous. Let us have a look at an interesting example
of a 64-bit error related to it. Below is correct code that allocates and uses three arrays of 1 GB each:
#include <stdlib.h>
void test()
{
  const size_t Gbyte = 1024 * 1024 * 1024;
  size_t i;
  char *Pointers[3];
  // Allocate
  for (i = 0; i != 3; ++i)
    Pointers[i] = (char *)malloc(Gbyte);
  // Use
  for (i = 0; i != 3; ++i)
    Pointers[i][0] = 1;
  // Free
  for (i = 0; i != 3; ++i)
    free(Pointers[i]);
}
This code correctly allocates memory, writes 1 into the first item of each array and frees the allocated
memory. The code is absolutely correct on a 64-bit system.
Now delete or comment out the line "#include <stdlib.h>". The code still compiles, but the program crashes
after launch. Because the header file "stdlib.h" is no longer included, the C compiler assumes that the malloc
function returns int. The first two allocations are most likely to be successful. On the third call, the malloc
function will return an address that lies outside the first 2 GB. As the compiler
considers the function's result to have type int, it interprets the result incorrectly and saves an incorrect
pointer value in the Pointers array.
To make it clearer, let us consider the assembler code generated by the Visual C++ compiler for the 64-bit
Debug version. First, look at the correct code generated when the malloc function is declared (i.e. the file
"stdlib.h" is included):
Pointers[i] = (char *)malloc(Gbyte);
  mov   rcx,qword ptr [Gbyte]
  call  qword ptr [__imp_malloc (14000A518h)]
  mov   rcx,qword ptr [i]
  mov   qword ptr Pointers[rcx*8],rax
Now consider the incorrect code generated when the malloc function is not declared:
Pointers[i] = (char *)malloc(Gbyte);
  mov   rcx,qword ptr [Gbyte]
  call  malloc (1400011A6h)
  cdqe
  mov   rcx,qword ptr [i]
  mov   qword ptr Pointers[rcx*8],rax
Note the CDQE instruction (Convert Doubleword to Quadword). The compiler assumed that the result is
held in the eax register and sign-extended it to a 64-bit value before writing it into the Pointers array. As a result, the
high-order bits of the rax register are lost. Even if the address of the allocated memory is inside the range of
the first 4 GB, we still get an incorrect result when the high-order bit of the eax register equals 1. For
example, the address 0x81000000 turns into 0xFFFFFFFF81000000.
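The effect described above can be reproduced with plain integer arithmetic, without allocating any memory. The following is a minimal sketch, not the compiler's actual code: the helper names as_seen_through_int and survives are hypothetical, and the sign extension is written out by hand so that it mirrors what CDQE does to the low 32 bits of rax.

```c
#include <stdint.h>

/* Hypothetical helper: models what a 64-bit address looks like after the
   caller treats malloc's result as a 32-bit int and sign-extends it. */
uint64_t as_seen_through_int(uint64_t real_address)
{
    /* Only the low 32 bits (the eax part of rax) are taken into account. */
    uint32_t low = (uint32_t)(real_address & 0xFFFFFFFFu);

    /* Emulate CDQE: copy bit 31 into bits 32..63. */
    uint64_t extended = low;
    if (low & 0x80000000u)
        extended |= 0xFFFFFFFF00000000ull;
    return extended;
}

/* Hypothetical helper: returns 1 if the address comes through unchanged. */
int survives(uint64_t real_address)
{
    return as_seen_through_int(real_address) == real_address;
}
```

Under these assumptions, as_seen_through_int(0x81000000) yields 0xFFFFFFFF81000000, matching the example in the text, while an address above 4 GB such as 0x140000000 loses its high-order bits and comes back as 0x40000000. Only addresses below 0x80000000 pass through unchanged, which is why the first two 1 GB allocations usually work and the third one does not.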
Fortunately, errors of this type are easy to detect. For example, the Visual C++ compiler generates two
warnings informing about a potential problem:
warning C4013: 'malloc' undefined; assuming extern returning int
warning C4312: 'type cast' : conversion from 'int' to 'char *' of greater size
The PVS-Studio 3.40 analyzer also generates the warning "error V201: Explicit type conversion. Type casting
to memsize.".
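Since these diagnostics are only warnings, they are easy to overlook in a large build log. One possible safeguard, shown here as a sketch rather than a recommendation from the article, is to promote the "undeclared function" warning to an error at the top of a source file. The pragmas below exist in Visual C++ and in GCC/Clang respectively; the function touch_block is a hypothetical helper added only to give the file something to compile.

```c
/* Promote "function used without a declaration" from a warning to an error.
   The pragma must appear before the first call it is meant to guard. */
#if defined(_MSC_VER)
#  pragma warning(error : 4013)
#elif defined(__GNUC__) || defined(__clang__)
#  pragma GCC diagnostic error "-Wimplicit-function-declaration"
#endif

#include <stdlib.h>  /* declares malloc and free; removing it now breaks the build */

/* Hypothetical helper: allocates a block, writes to its first byte, frees it.
   Returns 0 on success and -1 if the size is zero or the allocation fails. */
int touch_block(size_t size)
{
    char *p;

    if (size == 0)
        return -1;

    p = malloc(size);   /* no cast needed in C once malloc is declared */
    if (p == NULL)
        return -1;

    p[0] = 1;
    free(p);
    return 0;
}
```

With the include line commented out, a compiler that honors the pragma should refuse to build the file instead of silently truncating the pointer. The same promotion can usually be applied project-wide through compiler options instead of per-file pragmas.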