Stack & Heap Memory
Memory in a typical modern application can be stored in 2 areas, the stack and the heap.
The Stack
A stack is a data structure that adds and removes elements in a last in first out (LIFO) manner. To add an element to the stack is to push one on and to remove is to pop. A common anology is a spring-loaded stack of plates in a cafeteria - clean plates can be put on top of the stack, pushing down those there. By removing a plate from the top of the stack, the plates remaining pop up.
When we talk about ‘the stack’ in memory, we refer to a region of memory that follows this data structure and stores all temporary variables. Every time a function declares a variable, it gets pushed to the stack and whenever the function completes, all of the variables within that function get freed.
The great thing about the stack is that the memory is managed for you so we do not have to allocate memory, nor free it, manually.
Another reason why the stack is great is that it is really fast due to the fact the allocated memory is contiguous. Whenever variables are pushed and freed, a pointer is also positioned so the processor knows where the next allocation should take place.
However the issue with the stack is that you must know the size of the data at compile time, which of course is not always known, especially with collection data types.
Another issue with the stack is that memory allocation is done on initialisation and is usually a very small space allocated (typically a single megabyte). This is because it must be contiguous to operate correctly and it may be difficult to find such a block of memory. As there is very limited space, the problem is that you can quickly run into a stack overflow error, being that the stack memory size has exceeded the allocated size. A common example of this is a recursive function that recurses too many times.
Finally, in multi-threaded applications, each thread has it’s own stack and cannot access any other thread, even the calling thread’s stack.
The Heap
Unlike the stack, heap memory is dynamic memory. It is intended for memory of size that is unknown at compilation due to external factors like I/O events.
Heap memory requires more management because it is not freed automatically. Requiring us to keep track of when it is no longer needed and free it ourselves with functions like C’s aptly named free
function. If we are not careful in how we manage memory it can be quite easy to introduce a memory leak - a state in a program where it fails to free memory we no longer need.
As managing this can can be challenging and tedious, many high level languages introduce a garbage collector such as Javascript, Go and C#. Garbage collection is a way for a program to keep track of heap allocations and freeing automatically so the developer does not have to. The garbage collector will occasionally run it’s algorithm (typically mark and sweep) to detect and free unreferenced memory. However without the developer controlling when this is run, it can create performance issues and is why Discord recently transitioned a part of their stack from Go to Rust. Note that it is still possible to introduce a memory leak in a GC-language, although alot harder to do.
Not only is heap memory harder to manage, but it is also slower because unlike the stack, the memory is not contigous. This means that the processor needs to do more work to find or allocate memory.
The heap is the only option when you want multiple threads to access the same data, it will require knowledge of concurrent programming like data races and dead locking to be used successfully.