This article assumes the reader has a basic idea of bits and bytes.
Absolute beginners may also want to have a look at the prelude first:
Basic Programming Concepts for Beginners
Before going deep into pointers, let's discuss what pointers are and why they are necessary.
What are pointers?
All languages have variables, whether you explicitly declare them or not.
In C (and C++), variables have to be explicitly declared along with their datatypes (character, integer etc).
For example, a declaration of the form int X = 10 means X is an integer type variable holding the value 10. An int takes 4 bytes in memory on most current platforms and 2 bytes on some older or embedded compilers; the figure below draws it as 2 bytes to keep things small.
Figure - a
Here 1007 and 1008 are used to denote the addresses of the two consecutive bytes of memory used by integer X to hold the value 10.
A pointer is simply a variable which is able to hold such an address (1007, say) and which can be used to access the value stored at the address it holds.
(Mind you, real addresses are rarely such nice numbers.)
So in simple words -
a pointer is a variable which points to memory locations/addresses of other variables.
For example, if X is an integer variable, then a pointer to X keeps the address of (or points to) the first byte of X (1007 in the case of Figure - a).
Why Pointers?
For many types of applications, working with ordinary variables is enough, but when speed is a major factor and you need to get the best performance out of the hardware, you go for pointers.
Pointers give low level languages like C/C++ extraordinary power: manipulating data in place, applying effects to images at pace, doing quick manipulation of matrices, and accessing OS allocated memory, video buffers etc.
Pointers can be used to walk through consecutive memory locations quickly, checking and manipulating data along the way. They can be used for passing data to other procedures/functions without copying it. For image/video processing, rendering and applying effects, this kind of direct memory access is hard to beat. Compared to array indexing, hand-written pointer code can perform better in some situations, although modern optimizing compilers often generate the same machine code for both. So, in low level programming (such as C and assembly), knowledge of memory and pointers is a must.
But before getting a good grasp of pointers, it helps to have a basic picture of how memory is laid out.
Stack, Heap and Pointers
Memory is conceptually divided into two segments : Stack and Heap.
Below is an image of the conceptual representation of memory and variables
Stack is used for variables and arrays whose size we know beforehand. It's very easy to declare variables in Stack. For example, just write int n inside a function and space for one integer (typically 4 bytes, or 2 on some older or embedded compilers) is reserved immediately.
Declare int array[100] and a consecutive block of memory big enough to hold 100 integers is reserved immediately.
Variables are stacked over one another as we keep declaring them in Stack.
The case of Heap is different. In Heap, memory is handed out at run time: we request a chunk of memory for our use, and the allocator (part of the C library, which in turn gets its memory from the operating system) decides where inside the heap to create the block.
In C we usually use malloc for this; C++ offers the new keyword, which does it more elegantly.
Usually, heap allocation is needed when the amount of memory depends on user input or on circumstances that cannot be foreseen before running the app.
If we know beforehand that there are 500 seats in an aeroplane,
then we can declare int seats[500] in the Stack to hold seat related data.
Seat related data can be as simple as 1 or 0 to denote whether the seat is booked or not.
However, if we don't know how many passengers will board the plane on a certain day, and we are dependent upon an operator to input that info before the flight, then we use malloc to reserve adequate space in the Heap to hold passenger data.
In the case of variables in Stack, such as "int x", to access the value contained in x, we simply use x (ex. y = x + 1).
In order to know the memory address of x, we use &x. &x gives the address of the first of the consecutive bytes in memory reserved for the integer.
Simply writing &x just gives us an address; to work with that address we need to store it somewhere, and hence comes the need for pointers.
A pointer is a variable big enough to hold a memory address.
To differentiate between normal variables and pointers we use a * during declaration, for example
char* p or int* p.
You could write it as char *p or int *p, but I personally prefer char* p and int* p, which visibly associates the pointer with the datatype.
As discussed earlier, we need pointers, along with pointer arithmetic, to move over memory addresses quickly and access data as needed.
If p is a pointer pointing to a memory address, *p is used to access the value inside that memory address. This is a part that confuses young programmers because they mix it up with the declaration of pointer p.
While declaring we use a *, and then again while accessing the value inside an address we use a * as well.
I would recommend conceptualising it in the following manner:
1) A pointer is declared with *; so once we write int* p or int *p for the first time, a memory space is created which is big enough to hold a memory address (usually 4 or 8 bytes depending on the hardware: 4 on 32-bit systems, 8 on 64-bit systems).
2) After the declaration, as long as we are assigning a memory address to the pointer or doing pointer arithmetic with it, we simply use the pointer variable name (ex. p = q + 1).
3) A normal variable like int x has two properties: one is its value, which we access by the variable name; the other is its memory location, which we access by &x.
To store the address of the variable (which is &x) we need a pointer, which we have already declared as int* p;
So we write p = &x which means:
store the address of x (i.e. &x) in the pointer p, which has been declared earlier.
4) Later on, if we need to know the value stored at the memory address pointed to by p, we use *p.
p contains a memory address; *p accesses the value inside that memory address.
On a different note,
when you see a statement like int* p = &x, we are actually merging two statements in one, namely
i) int* p;
ii) p = &x;
In the image above (Figure - b) you can see two cases (Case 1 and Case 2).
An interesting thing is that a pointer doesn't actually need a char/integer datatype to be associated with it; you could just write void* p = (void*) &x;
We specify the datatype while declaring in order to make the compiler understand how the pointer should act when you try to operate on it or with it.
In Case 1, declaring a character pointer (i.e. char* p) means pointer operations are based on bytes, so if you write something like p++, p would point to the next byte. If you write char* q = p + 1, q points to the memory address one byte after p.
For Case 2, with int* p = &x and p++, p skips the size of an integer (two bytes in the figure, four on most current platforms). In the case of int* q = p + 1, q likewise skips one integer and points to the address immediately after the integer x.
Therefore with pointers, we can move around in memory with an abundance of freedom, but do remember that with great freedom comes great responsibility.
Now what about Arrays and their pointers?
An array is a block of consecutive bytes of memory.
The concept of a pointer to an array is nearly the same as for a normal variable like int x, but a bit of explanation might help:
When we are dealing with a pointer to an array, the technique is to point the pointer at the head of the array (i.e. the address of the first item of the array).
If we declare an array in Stack with int myArray[3], the array's first element is myArray[0], and the address of myArray[0] can be retrieved by the same technique applied for a normal variable.
Remember we assigned the address of int x to pointer p as:
p = &x; therefore for myArray[0], we do it the same way
p = &myArray[0];
For convenience and readability, C lets us write &myArray[0] simply as myArray.
So instead of p = &myArray[0], we can write p = myArray;
Some beginners think myArray is also a pointer variable since we can assign it to a pointer directly. It is not; in an expression like this, myArray is just a convenient way to write &myArray[0].
You can't do myArray++ or myArray = Array1 (where Array1 is some other array variable).
Just as &x + 1 gives the next address after x, myArray + 1 gives the next address after myArray[0]. (So myArray + 1 is the address of myArray[1].)
At best we can say:
if we have an array called myArray declared in Stack with int myArray[100], then myArray denotes the memory address of the first element (i.e. &myArray[0]) as well as being the name of a stack variable, myArray, occupying a chunk of space in memory.
Hopefully, the following image (Figure - c) clarifies the concept behind arrays a bit, and its Case 1 and Case 2 make sense when compared to Cases 1 and 2 of Figure - b:
Now that we have covered a bit of Stack, let's turn our attention to Heap.
Memory in Heap is allocated through calls like malloc and new (new is available in C++ only). malloc is a C library function rather than a system call; it asks the operating system for more memory when it needs to.
Since the location of the block is chosen for us at run time, the only way to get access to that memory is through pointers.
So a function like malloc takes the number of consecutive bytes to allocate and returns the address of the allocated memory, which we save in a pointer for later access.
Example:
int numOfPassengers = 10;
int* p = malloc(numOfPassengers * sizeof(int));
p[0] = 1;   /* or *p = 1; */
p[1] = 1;   /* or *(p+1) = 1; */
Here, since we are planning to allocate an array of integers, we first use sizeof to get the size of an int (usually 4 bytes, or 2 on some compilers). Then, multiplying sizeof(int) by the intended number of integers (e.g. 10), we get the necessary allocation size for 10 integers.
(In C++ things look a bit cleaner: int* p = new int[10];)
When we are done working with that memory and the values it contains, we have to release it with a call to free (ex. free(p)); otherwise the program leaks memory, and a program that keeps leaking may eventually be shut down by the system.
Below is the diagram to further visualise the Heap allocation from points a to d.
Step a represents the need to allocate a chunk laid out consecutively in memory, which is basically an array (the expression A[dynamically 10] is just a reflection of our thought process that we need a dynamic array).
Step b tells us we need a pointer to refer to that array.
We haven't done anything to allocate memory yet; steps a and b are not part of the C code, they just explain the thought process before taking action.
Step c is the actual C code which does what we discussed in steps a and b, i.e. malloc(10*sizeof(int)) does the work of declaring int A[10] dynamically, and the pointer int* a points to that array.
Step d is about freeing the memory after the work is done.
I hope this helps to clear up much of the confusion faced by beginners who plan to start a thrilling career with C/C++, going deep into complex systems.
signing off,
Mukit
P.S.: Task for the beginner - try to analyse what's going on in the cover image (ok, int **ptr should really be named int **doubleptr).
Hint: a pointer is itself a variable, so another pointer can point to it.