[5] Memory API
Understanding Memory Allocation and Deallocation in C Programs
Types of Memory
Stack Memory
The compiler manages allocations and deallocations implicitly, which is why stack memory is also called automatic memory.
Declaring memory on the stack in C is simple. For example, if you need space for an integer called x in a function func(), you can do it like this:
void func() {
    int x; // declares an integer on the stack
    ...
}
The compiler handles the rest, allocating space on the stack when you call func() and freeing it when you return. If you need information to last beyond the function call, don't leave it on the stack.
Heap Memory
Allocations and deallocations are handled explicitly by the programmer, which requires careful management to avoid bugs. For example, here is how you might allocate an integer on the heap:
void func() { int *x = (int *) malloc(sizeof(int)); ... }
Here are a few notes about this small code snippet. First, you might notice that both stack and heap allocation happen on this line: the compiler makes room for a pointer to an integer when it sees your declaration int *x. Then, when the program calls malloc(), it asks for space for an integer on the heap. The function returns the address of this integer (or NULL if it fails), which is then stored on the stack for the program to use.
Since heap memory is managed manually and used in different ways, it can be trickier for both users and systems. So, that's what we'll be focusing on.
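The difference in lifetime can be sketched with a small function. This is only an illustration: make_counter() is a hypothetical name, not something from the C library. The local variable disappears when the function returns, while the heap integer stays valid until the caller frees it.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated int that outlives the call.
 * The caller owns the result and must free() it. Returns NULL on failure. */
int *make_counter(int start) {
    int on_stack = start;                /* automatic: gone when we return */
    int *on_heap = malloc(sizeof(int));  /* heap: lives until free() */
    if (on_heap == NULL) {
        return NULL;
    }
    *on_heap = on_stack;                 /* copy the value into long-lived storage */
    return on_heap;                      /* returning &on_stack would be a bug */
}
```

A caller would use the returned pointer as long as needed and then pass it to free().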
The malloc() Call
The malloc() call is quite simple: you pass it a size asking for some room on the heap, and it either succeeds and gives you back a pointer to the newly-allocated space, or fails and returns NULL.
The manual page shows what you need to do to use malloc; type man malloc at the command line and you will see:
#include <stdlib.h>
...
void *malloc(size_t size);
...
You only need to include the header file stdlib.h to use malloc. The C library, which all C programs link with by default, already contains the code for malloc(). Including the header helps the compiler check if you are calling malloc() correctly, such as passing the right number and type of arguments.
The single parameter malloc() takes is of type size_t, which tells how many bytes you need. However, most programmers don't type a number directly (like 10); it's not a good practice. Instead, they use routines and macros. For example, to allocate space for a double-precision floating point value, you do this:
double *d = (double *) malloc(sizeof(double));
This call to malloc() uses the sizeof() operator to ask for the right amount of space. In C, sizeof() is generally treated as a compile-time operator, meaning the size is figured out when you compile the code. For this example the size is typically 8 bytes, the usual size of a double. This number is then passed to malloc(). Because sizeof() is evaluated at compile time, it is an operator, not a function call that happens at run time.
You can also use the name of a variable with sizeof(), not just a type. But be careful, as it might not always give you the results you want. For example, consider the following code snippet:
int *x = malloc(10 * sizeof(int));
printf("%zu\n", sizeof(x));
In the first line, we've reserved space for an array of 10 integers, which is correct. However, when we use sizeof() in the next line, it returns 4 (on 32-bit machines) or 8 (on 64-bit machines). This is because sizeof() is checking the size of the pointer to an integer, not the allocated memory. However, sometimes sizeof() works as expected:
int x[10];
printf("%zu\n", sizeof(x));
In this case, there is enough static information for the compiler to know that 40 bytes have been allocated (10 integers, assuming a 4-byte int).
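The two cases can be compared side by side. The function names below are hypothetical and exist only for this sketch; the point is that sizeof() reports the size of whatever type its operand has, which is a pointer in one case and a whole array in the other.

```c
#include <stdlib.h>

/* Hypothetical helper: what sizeof sees when x is a pointer to heap memory. */
size_t size_seen_through_pointer(void) {
    int *x = malloc(10 * sizeof(int));
    size_t n = sizeof(x);   /* size of the pointer, not of the 10 ints */
    free(x);                /* free(NULL) is harmless if malloc failed */
    return n;
}

/* Hypothetical helper: what sizeof sees when x is a true array. */
size_t size_of_array(void) {
    int x[10];
    return sizeof(x);       /* size of the whole array: 10 * sizeof(int) */
}
```

Printing either result with printf() calls for the %zu conversion, since sizeof() yields a size_t.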
Another place where you must be cautious is with strings. When allocating space for a string, always use this approach: malloc(strlen(s) + 1). This method gets the length of the string using strlen() and adds 1 to make room for the end-of-string character. Using sizeof() in this context can cause serious issues, because applied to a char pointer it yields the size of the pointer rather than the length of the string.
malloc() returns a void pointer, giving back an address and letting the programmer decide what to do with it. The programmer then uses a cast; in our example, the return type of malloc() is cast to a pointer to a double. The cast isn't needed for the code to work correctly, because C converts a void pointer to another object pointer type implicitly.
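For comparison, here is a sketch of the same allocation written without the cast and with a check for failure. The function alloc_double() is hypothetical; writing sizeof *d instead of sizeof(double) is a common idiom, not something the text above requires.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates one double on the heap and stores value in it.
 * Returns NULL if the allocation fails; the caller frees the result. */
double *alloc_double(double value) {
    double *d = malloc(sizeof *d);  /* no cast; sizeof *d follows the type of d */
    if (d == NULL) {
        return NULL;
    }
    *d = value;
    return d;
}
```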
The free() Call
Allocating memory is the easy part; knowing when and how to free it is the hard part. To free heap memory that is no longer needed, programmers just call free():
int *x = malloc(10 * sizeof(int));
...
free(x);
The routine takes one argument, a pointer returned by malloc(). Thus, you might notice, the size of the allocated region is not passed in by the user, and must be tracked by the memory-allocation library itself.
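A complete allocate, use, and free cycle might look like the sketch below. The function sum_of_squares() is a made-up example; note that free() receives only the pointer, with no size argument.

```c
#include <stdlib.h>

/* Hypothetical example: returns 1^2 + ... + n^2 using a temporary heap array,
 * or -1 if n is not positive or the allocation fails. */
long sum_of_squares(int n) {
    if (n <= 0) {
        return -1;
    }
    long *squares = malloc((size_t)n * sizeof *squares);
    if (squares == NULL) {
        return -1;
    }
    long total = 0;
    for (int i = 0; i < n; i++) {
        squares[i] = (long)(i + 1) * (i + 1);
        total += squares[i];
    }
    free(squares);  /* only the pointer is passed; the allocator tracks the size */
    return total;
}
```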
The Hidden Dangers of malloc() and free(): Key Errors to Avoid
Managing memory with malloc() and free() is a frequent source of errors in C. A program that compiles without warnings, and even one that appears to run correctly, can still mishandle memory and hide subtle bugs. The mistakes below are among the most common, and they show up regularly in undergraduate operating systems courses.
In fact, memory management has proven to be such a challenge that many modern languages offer automatic memory management via garbage collection. These languages still require memory allocation (often with new), but they automatically free unused memory. A garbage collector identifies and releases memory that is no longer referenced by the program, thus preventing many of the issues outlined below.
Forgetting to Allocate Memory
Many functions expect memory to be allocated before they are called. For instance, strcpy(dst, src) copies a string from a source pointer to a destination pointer. If you forget to allocate memory for the destination pointer, your program will likely crash with a segmentation fault, as in the following example:
char *src = "hello";
char *dst; // Unallocated memory.
strcpy(dst, src); // Undefined behavior; typically a segfault and crash
To fix this, you need to allocate enough memory for dst before using it:
char *src = "hello";
char *dst = (char *) malloc(strlen(src) + 1);
strcpy(dst, src); // Works correctly
Alternatively, you can use strdup(), which handles both allocation and copying for you.
Not Allocating Enough Memory
Another common error is allocating insufficient memory, often leading to a buffer overflow. Consider this incorrect code:
char *src = "hello";
char *dst = (char *) malloc(strlen(src)); // Too small!
strcpy(dst, src); // Might seem to work, but unsafe
In this case, the allocated memory is too small by one byte (since strlen() doesn't account for the null terminator). Although this might appear to run without problems, writing past the allocated memory can corrupt other parts of your program or cause security vulnerabilities. The correct approach is:
char *dst = (char *) malloc(strlen(src) + 1); // Enough space for null terminator
Forgetting to Initialize Allocated Memory
Even if you correctly allocate memory, forgetting to initialize it can lead to uninitialized reads. This means your program may access arbitrary values stored in the allocated memory, potentially causing unpredictable behavior. To avoid this, always initialize the memory before use:
int *data = (int *) malloc(sizeof(int) * size);
for (int i = 0; i < size; i++) {
    data[i] = 0; // Initialize the array
}
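When zero is the initial value you want, the standard library's calloc() is another option, since it returns memory that has already been zero-filled. The wrapper make_zeroed_array() below is a hypothetical name used only for this sketch.

```c
#include <stdlib.h>

/* Hypothetical helper: returns an array of count ints, all set to zero,
 * or NULL if the allocation fails. The caller frees the result. */
int *make_zeroed_array(size_t count) {
    return calloc(count, sizeof(int));  /* calloc zero-fills what it returns */
}
```

For any initial value other than zero, an explicit loop like the one above is still needed.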
Forgetting to Free Memory (Memory Leaks)
A memory leak occurs when allocated memory is not freed, causing your program to consume more memory over time. In short-lived programs, this might not cause immediate issues, but in long-running applications or systems, memory leaks can lead to significant problems, eventually requiring a system restart.
Even in garbage-collected languages, memory leaks can occur if references to unused memory persist. Thus, it’s important to develop good habits early on by freeing memory when it’s no longer needed:
free(data); // Make sure to free allocated memory
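One habit that helps is to give a function a single exit path that releases everything it allocated. The sketch below is an assumption-laden illustration (process_two_buffers() is a hypothetical name, and the work it does is a placeholder); it relies on the fact that free(NULL) does nothing, so both frees are safe whether or not the allocations succeeded.

```c
#include <stdlib.h>

/* Hypothetical example: uses two temporary buffers of n ints each.
 * Returns 0 on success, -1 if either allocation fails. No path leaks memory. */
int process_two_buffers(size_t n) {
    int status = -1;
    int *first = malloc(n * sizeof *first);
    int *second = malloc(n * sizeof *second);
    if (first != NULL && second != NULL) {
        for (size_t i = 0; i < n; i++) {
            first[i] = (int)i;          /* placeholder work */
            second[i] = 2 * first[i];
        }
        status = 0;
    }
    free(second);  /* free(NULL) is a no-op, so this is safe on every path */
    free(first);
    return status;
}
```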
Freeing Memory Before Use is Finished (Dangling Pointers)
Freeing memory before you're done with it can cause crashes or overwrite valid memory. This error creates a dangling pointer: the pointer still holds the address of the freed memory, but using it after deallocation is unsafe:
free(ptr);
ptr[0] = 1; // Dangling pointer access! Can cause undefined behavior.
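To reduce the risk of using a dangling pointer, it's good practice to set the pointer to NULL after freeing it, so that an accidental later use is more likely to fail immediately than to silently touch freed memory:
free(ptr);
ptr = NULL;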
Freeing Memory More Than Once (Double Free)
Double freeing memory, where a program calls free() on the same pointer multiple times, is another dangerous mistake. The result is undefined behavior, often leading to crashes or corruption in the memory allocation library:
free(ptr);
free(ptr); // Double free: avoid this!
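One defensive pattern, sketched here under the assumption that the code owns a single copy of the pointer, is to free through a small helper that also resets the pointer. The name free_and_clear() is hypothetical. Because free(NULL) is defined to do nothing, a repeated call becomes harmless.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and resets the caller's pointer to NULL. */
void free_and_clear(int **pp) {
    if (pp != NULL) {
        free(*pp);   /* free(NULL) is defined to do nothing */
        *pp = NULL;  /* a later free_and_clear() on this pointer is harmless */
    }
}
```

This does not protect other copies of the same address held elsewhere in the program; those remain dangling after the free.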
Calling free() on Invalid Pointers
Lastly, passing an invalid pointer to free() can cause severe issues. free() should only be called with pointers returned by malloc(), calloc(), or realloc(). Passing any other pointer, including stack-allocated memory or an arbitrary address, can lead to unpredictable results:
int x;
free(&x); // Invalid free! Only heap-allocated memory can be freed.
By understanding and avoiding these common memory management mistakes, you can develop safer, more reliable C programs. Always remember to allocate enough memory, initialize it properly, and free it correctly to avoid common pitfalls like segmentation faults, memory leaks, and security vulnerabilities.