
What is the fastest way to allocate memory?

Updated: May 19, 2020

malloc in C, new in C++.


Not VirtualAlloc.


Dynamic memory allocation is when an executing program requests that the operating system give it a block of main memory. The program then uses this memory for some purpose. Usually the purpose is to add a node to a data structure.


In object-oriented languages, dynamic memory allocation is used to get the memory for a new object.


The memory comes from above the static part of the data segment. Programs may request memory and may also return previously dynamically allocated memory. Memory may be returned whenever it is no longer needed. Memory can be returned in any order without any relation to the order in which it was allocated. The heap may develop "holes" where previously allocated memory has been returned between blocks of memory still in use.

A new dynamic request for memory might return a range of addresses out of one of the holes. But it might not use up all the hole, so further dynamic requests might be satisfied out of the original hole.


If too many small holes develop, memory is wasted because the total memory used by the holes may be large, but the holes cannot be used to satisfy dynamic requests. This situation is called memory fragmentation. Keeping track of allocated and deallocated memory is complicated. A modern operating system does all this.


Comparing Memory Allocation Methods

The following is a brief comparison of the various memory allocation methods:

  • CoTaskMemAlloc

  • GlobalAlloc

  • HeapAlloc

  • LocalAlloc

  • malloc

  • new

  • VirtualAlloc


Although the GlobalAlloc, LocalAlloc, and HeapAlloc functions ultimately allocate memory from the same heap, each provides a slightly different set of functionality. For example, HeapAlloc can be instructed to raise an exception if memory could not be allocated, a capability not available with LocalAlloc. LocalAlloc supports allocation of handles which permit the underlying memory to be moved by a reallocation without changing the handle value, a capability not available with HeapAlloc.


Starting with 32-bit Windows, GlobalAlloc and LocalAlloc are implemented as wrapper functions that call HeapAlloc using a handle to the process's default heap. Therefore, GlobalAlloc and LocalAlloc have greater overhead than HeapAlloc.


Because the different heap allocators provide distinctive functionality by using different mechanisms, you must free memory with the correct function. For example, memory allocated with HeapAlloc must be freed with HeapFree and not LocalFree or GlobalFree. Memory allocated with GlobalAlloc or LocalAlloc must be queried, validated, and released with the corresponding global or local function.
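As a rough sketch of that pairing rule (not part of the original comparison; the 256-byte sizes and variable names are arbitrary), a Windows C++ program might look like this:

#include <windows.h>

int main()
{
    // HeapAlloc from the process's default heap -> pair with HeapFree.
    void* pHeap = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 256);
    if (pHeap != NULL)
        HeapFree(GetProcessHeap(), 0, pHeap);

    // GlobalAlloc with a moveable handle -> lock to get a pointer, pair with GlobalFree.
    HGLOBAL hGlobal = GlobalAlloc(GMEM_MOVEABLE, 256);
    if (hGlobal != NULL)
    {
        void* pGlobal = GlobalLock(hGlobal);   // pointer to the underlying memory
        if (pGlobal != NULL)
            GlobalUnlock(hGlobal);
        GlobalFree(hGlobal);
    }

    // LocalAlloc -> pair with LocalFree.
    HLOCAL hLocal = LocalAlloc(LMEM_FIXED, 256);
    if (hLocal != NULL)
        LocalFree(hLocal);

    return 0;
}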


The VirtualAlloc function allows you to specify additional options for memory allocation. However, its allocations use a page granularity, so using VirtualAlloc can result in higher memory usage.
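A minimal sketch of that page granularity, assuming a standard Windows build (the 10-byte request is deliberately tiny to show that a whole page is still consumed):

#include <windows.h>

int main()
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);            // si.dwPageSize is typically 4096 bytes

    // Ask for 10 bytes; VirtualAlloc still reserves and commits whole pages.
    void* p = VirtualAlloc(NULL, 10, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (p != NULL)
        VirtualFree(p, 0, MEM_RELEASE);   // release the entire region

    return 0;
}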


The malloc function has the disadvantage of being run-time dependent.


The new operator has the disadvantage of being compiler dependent and language dependent.


The CoTaskMemAlloc function has the advantage of working well in C, C++, and Visual Basic. It is also the only way to share memory in a COM-based application, since MIDL uses CoTaskMemAlloc and CoTaskMemFree to marshal memory.
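A minimal CoTaskMemAlloc sketch (the 256-byte size is an arbitrary example); the allocation must always be released with CoTaskMemFree:

#include <objbase.h>

int main()
{
    void* p = CoTaskMemAlloc(256);   // COM task memory allocator
    if (p != NULL)
        CoTaskMemFree(p);            // freed with CoTaskMemFree, not with free or delete
    return 0;
}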


Stack vs Heap


Stack is managed by the compiler.

Heap is managed by the programmer.


The stack is used for static memory allocation and the heap for dynamic memory allocation; both are stored in the computer's RAM. Variables allocated on the stack are stored directly in memory and access to this memory is very fast, and their allocation is dealt with when the program is compiled.


When a function or a method calls another function, which in turn calls another function, and so on, the execution of all those functions remains suspended until the very last function returns its value. The stack is always reserved in LIFO order: the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.


Variables allocated on the heap have their memory allocated at run time, and accessing this memory is a bit slower, but the heap size is limited only by the size of virtual memory. Elements of the heap have no dependencies on each other and can always be accessed randomly at any time. You can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time.


You can use the stack if you know exactly how much data you need to allocate at compile time and it is not too big. You can use the heap if you don't know exactly how much data you will need at runtime or if you need to allocate a lot of data.
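The following C++ sketch (the names and sizes are illustrative, not from the original text) shows both choices side by side:

#include <iostream>

void example(int n)
{
    int onStack[16];              // size fixed at compile time, freed automatically
                                  // when the function returns
    int* onHeap = new int[n];     // size chosen at run time, lives until deleted

    onStack[0] = 1;
    onHeap[0] = 2;
    std::cout << onStack[0] + onHeap[0] << '\n';

    delete[] onHeap;              // heap memory must be freed by the programmer
}

int main()
{
    example(100);
    return 0;
}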

In a multi-threaded situation each thread will have its own completely independent stack, but the threads will share the heap. The stack is thread-specific and the heap is application-specific. The stack is important to consider in exception handling and thread execution.


Dynamic Memory Allocation

Just before a program starts running, the loader copies machine code from the executable file into the text segment of memory. It also copies data from the executable file into the data segment of memory. Source code declares a fixed amount of memory (in the .data section for assembly language). But often as a program runs it requests more memory for data. The operating system finds a block of available memory and allocates it to the program. This is dynamic memory allocation.

Chapter Topics:

  • Dynamic Memory Allocation

  • Records and Structures

  • Subroutine Calls with Structures

  • C program versions of the above

Dynamic memory allocation is used in assembly language and high-level languages for building data structures. In object-oriented languages it is used to get the memory used to construct objects.

QUESTION 1 (Review): How many addresses does MIPS main storage have?


Review of Memory


There are 2^32 (about four billion) bytes available in main memory, each one with its own address. Most of these bytes are virtual. They are implemented by the virtual memory system and do not directly correspond to bytes in RAM chips. But to executing programs it looks as if all the bytes are present.

Usually, with a multitasking system, the processor is shared among many executing programs. But the operating system makes it appear to each executing program as if it is the only one running on the system. Also, the operating system makes it look to each program as if 2^31 bytes are there for it alone. (The other half of memory is for the operating system.)


(Figure: a program running by itself in two billion bytes of memory, with the operating system running in the other two billion bytes.)

The picture shows a program running in the bottom half of memory and the operating system in the top half. The operating system divides the memory available to a program into sections. This is not required by hardware, but makes managing computer operations easier. The Text Segment holds the machine language of the program. The Data Segment holds the memory that has been allocated to the program just as it starts. The Stack Segment is for the run-time stack.

Between the data segment and the stack segment there is unallocated memory that is used for growing the stack (shown as the downward-pointing arrow at the top) or for creating dynamic data structures (shown as the upward-pointing arrow at the bottom). Both of these activities occur as the program is running.

The portion of memory above the data segment that has been allocated to data structures is called the heap.


Unit of address resolution

Most modern computers are byte-addressable. Each address identifies a single byte (eight bits) of storage. Data larger than a single byte may be stored in a sequence of consecutive addresses. There exist word-addressable computers, where the minimal addressable storage unit is exactly the processor's word. For example, the Data General Nova minicomputer, and the Texas Instruments TMS9900 and National Semiconductor IMP-16 microcomputers used 16-bit words, and there were many 36-bit mainframe computers (e.g., PDP-10) which used 18-bit word addressing, not byte addressing, giving an address space of 2^18 36-bit words, approximately 1 megabyte of storage. The efficiency of addressing of memory depends on the bit size of the bus used for addresses – the more bits used, the more addresses are available to the computer. For example, an 8-bit-byte-addressable machine with a 20-bit address bus (e.g. Intel 8086) can address 2^20 (1,048,576) memory locations, or one MiB of memory, while a 32-bit bus (e.g. Intel 80386) addresses 2^32 (4,294,967,296) locations, or a 4 GiB address space. In contrast, a 36-bit word-addressable machine with an 18-bit address bus addresses only 2^18 (262,144) 36-bit locations (9,437,184 bits), equivalent to 1,179,648 8-bit bytes, or 1152 KB, or 1.125 MiB – slightly more than the 8086.

Some older computers (decimal computers) were decimal-digit-addressable. For example, each address in the IBM 1620's magnetic-core memory identified a single six-bit binary-coded decimal digit, consisting of a parity bit, flag bit, and four numerical bits. The 1620 used 5-digit decimal addresses, so in theory the highest possible address was 99,999. In practice, the CPU supported 20,000 memory locations, and up to two optional external memory units could be added, each supporting 20,000 addresses, for a total of 60,000 (00000–59999).


Does the Stack Segment also have memory holes in it?


No. Recall that a stack has a "last-in, first-out" organization. Unneeded memory is always returned from the top of the stack.


SPIM Dynamic Memory

A stack is an easily managed structure. Only a few memory addresses are needed to keep track of it. (Some of these addresses are in the stack pointer and in the frame pointer registers.) As a program executes, the stack grows and shrinks as subroutines are called and exited. The heap is more like a bookshelf. Books are constantly being taken off the shelf from various locations, leaving gaps, and then later returned, filling the gaps.

Here is how a SPIM program requests a block of memory from SPIM's heap. The program uses service code 9 to get more space:

li $a0,xxx       # $a0 contains the number of bytes you need.
                 # This must be a multiple of four.
li $v0,9         # code 9 == allocate memory
syscall          # call the service.
                 # $v0 <-- the address of the first byte
                 #         of the dynamically allocated block

You don't know in advance what range of addresses you will get back for the allocate memory request. The SPIM service returns the first address of a contiguous block with as many bytes as you requested in $a0. (This is similar to a call to malloc() in C.)


Example: Python's List Class Uses Dynamic Arrays and Dynamic Memory Allocation


Dynamic arrays in Python, called lists, don't need to know how large the array will be beforehand.


Arrays are fixed in size in C++.


In C, C++, or Java, you have to specify how large an array is when you create it. For example, in Java:

// declares an array of integers.
int[] arr;

// allocating memory for 5 integers.
arr = new int[5];
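
By contrast, the C++ standard library's std::vector grows at run time much like a Python list does; here is a minimal sketch (not part of the original example):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> arr;            // starts empty; no size declared up front
    for (int i = 0; i < 5; ++i)
        arr.push_back(i * 10);       // the vector reallocates its heap storage as needed

    std::cout << "size: " << arr.size() << '\n';   // prints 5
    return 0;
}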

A larger example follows below:


Dynamic memory

In the programs seen in previous chapters, all memory needs were determined before program execution by defining the variables needed. But there may be cases where the memory needs of a program can only be determined during runtime. For example, when the memory needed depends on user input. In these cases, programs need to dynamically allocate memory, for which the C++ language integrates the operators new and delete.


Operators new and new[]

Dynamic memory is allocated using operator new. new is followed by a data type specifier and, if a sequence of more than one element is required, the number of these within brackets []. It returns a pointer to the beginning of the new block of memory allocated. Its syntax is:


pointer = new type

pointer = new type [number_of_elements]


The first expression is used to allocate memory to contain one single element of type type. The second one is used to allocate a block (an array) of elements of type type, where number_of_elements is an integer value representing the amount of these. For example:

int * foo;
foo = new int [5];



In this case, the system dynamically allocates space for five elements of type int and returns a pointer to the first element of the sequence, which is assigned to foo (a pointer). Therefore, foo now points to a valid block of memory with space for five elements of type int.


Here, foo is a pointer, and thus, the first element pointed to by foo can be accessed either with the expression foo[0] or the expression *foo (both are equivalent). The second element can be accessed either with foo[1] or *(foo+1), and so on...


There is a substantial difference between declaring a normal array and allocating dynamic memory for a block of memory using new. The most important difference is that the size of a regular array needs to be a constant expression, and thus its size has to be determined at the moment of designing the program, before it is run, whereas the dynamic memory allocation performed by new allows memory to be assigned during runtime using any variable value as the size.
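
A minimal sketch of that difference (the variable n, read from the user, is only an illustration of a run-time value and is not part of the original tutorial):

#include <iostream>
using namespace std;

int main ()
{
    int fixed[10];                // size must be a constant expression

    int n;
    cin >> n;
    int * dynamic = new int[n];   // size can be any value known only at run time

    fixed[0] = 0;
    dynamic[0] = 0;
    delete[] dynamic;
    return 0;
}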


The dynamic memory requested by our program is allocated by the system from the memory heap. However, computer memory is a limited resource, and it can be exhausted. Therefore, there are no guarantees that all requests to allocate memory using operator new are going to be granted by the system.


C++ provides two standard mechanisms to check if the allocation was successful:


One is by handling exceptions. Using this method, an exception of type bad_alloc is thrown when the allocation fails. Exceptions are a powerful C++ feature explained later in these tutorials. But for now, you should know that if this exception is thrown and it is not handled by a specific handler, the program execution is terminated.


This exception method is the method used by default by new, and is the one used in a declaration like:


foo = new int [5]; // if allocation fails, an exception is thrown
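
Exceptions are covered later in these tutorials, but as a rough sketch, such a failure could be caught like this (the handler shown is only an illustration, not part of the original tutorial):

#include <iostream>
#include <new>
using namespace std;

int main ()
{
    try
    {
        int * foo = new int [5];   // throws bad_alloc if the allocation fails
        delete[] foo;
    }
    catch (const bad_alloc& e)
    {
        cout << "allocation failed: " << e.what() << '\n';
    }
    return 0;
}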



The other method is known as nothrow, and what happens when it is used is that when a memory allocation fails, instead of throwing a bad_alloc exception or terminating the program, the pointer returned by new is a null pointer, and the program continues its execution normally.


This method can be specified by using a special object called nothrow, declared in header <new>, as argument for new:


foo = new (nothrow) int [5];



In this case, if the allocation of this block of memory fails, the failure can be detected by checking if foo is a null pointer:

int * foo;
foo = new (nothrow) int [5];
if (foo == nullptr) {
  // error assigning memory. Take measures.
}



This nothrow method is likely to produce less efficient code than exceptions, since it implies explicitly checking the pointer value returned after each and every allocation. Therefore, the exception mechanism is generally preferred, at least for critical allocations. Still, most of the coming examples will use the nothrow mechanism due to its simplicity.


Operators delete and delete[]

In most cases, memory allocated dynamically is only needed during specific periods of time within a program; once it is no longer needed, it can be freed so that the memory becomes available again for other requests of dynamic memory. This is the purpose of operator delete, whose syntax is:

delete pointer;

delete[] pointer;


The first statement releases the memory of a single element allocated using new, and the second one releases the memory allocated for arrays of elements using new and a size in brackets ([]).


The value passed as argument to delete shall be either a pointer to a memory block previously allocated with new, or a null pointer (in the case of a null pointer, delete produces no effect).


// rememb-o-matic
#include <iostream>
#include <new>
using namespace std;

int main ()
{
  int i,n;
  int * p;
  cout << "How many numbers would you like to type? ";
  cin >> i;
  p= new (nothrow) int[i];
  if (p == nullptr)
    cout << "Error: memory could not be allocated";
  else
  {
    for (n=0; n<i; n++)
    {
      cout << "Enter number: ";
      cin >> p[n];
    }
    cout << "You have entered: ";
    for (n=0; n<i; n++)
      cout << p[n] << ", ";
    delete[] p;
  }
  return 0;
}

Output:

How many numbers would you like to type? 5
Enter number: 75
Enter number: 436
Enter number: 1067
Enter number: 8
Enter number: 32
You have entered: 75, 436, 1067, 8, 32,

Notice how the value within brackets in the new statement is a variable value entered by the user (i), not a constant expression:


p= new (nothrow) int[i];


There always exists the possibility that the user introduces a value for i so big that the system cannot allocate enough memory for it. For example, when I tried to give a value of 1 billion to the "How many numbers" question, my system could not allocate that much memory for the program, and I got the text message we prepared for this case (Error: memory could not be allocated).


It is considered good practice for programs to always be able to handle failures to allocate memory, either by checking the pointer value (if nothrow) or by catching the proper exception.


Dynamic memory in C

C++ integrates the operators new and delete for allocating dynamic memory. But these were not available in the C language; instead, it used a library solution, with the functions malloc, calloc, realloc and free, defined in the header <cstdlib> (known as <stdlib.h> in C). The functions are also available in C++ and can also be used to allocate and deallocate dynamic memory.


Note, though, that the memory blocks allocated by these functions are not necessarily compatible with those returned by new, so they should not be mixed; each one should be handled with its own set of functions or operators.
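
As a rough sketch (not part of the original tutorial) of the C-style equivalent of the earlier new[]/delete[] example, with an arbitrary size of 5:

#include <cstdio>
#include <cstdlib>

int main ()
{
    int n = 5;                                   // illustrative size
    int * p = (int *) malloc(n * sizeof(int));   // returns a null pointer on failure
    if (p == NULL)
    {
        printf("Error: memory could not be allocated\n");
        return 1;
    }
    for (int i = 0; i < n; i++)
        p[i] = i;
    free(p);                                     // malloc pairs with free, never with delete
    return 0;
}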

