Introduction
So far, each variable stores a single value. But what if you want to store 10 grades, 100 measurements, or the letters of a text? Declaring int grade1, grade2, grade3, ... would be unmanageable.
An array is a series of variables of the same type, stored side by side in memory, accessible by a number (the index). Instead of 10 separate variables, you declare a single array of 10 slots.
Declaring an array
| 1 | int tab[5]; |
This creates 5 int values side by side in memory. Each slot is accessible by its index.
In memory
tab: +------+------+------+------+------+
| ? | ? | ? | ? | ? |
+------+------+------+------+------+
index: [0] [1] [2] [3] [4]The 5 slots are contiguous -- no gaps between them. Each slot is sizeof(int) = 4 bytes. The entire array takes up 5 x 4 = 20 bytes.
The size must be constant
The array size must be known at compile time -- not a variable:
| 1 | int tab[5]; // OK -- 5 is a constant |
| 2 | int tab[size]; // FORBIDDEN at 42 (VLA) |
For dynamic sizes, you'll need malloc (course on malloc & free).
Without initialization
| 1 | int tab[5]; |
| 2 | // The contents of tab[0], tab[1], etc. are GARBAGE |
Just like simple variables, an uninitialized array contains random values.
Accessing elements
Reading and writing
| 1 | int tab[5]; |
| 2 | |
| 3 | tab[0] = 10; // Write to the first slot |
| 4 | tab[1] = 20; |
| 5 | tab[4] = 50; // The last slot (index 4, not 5) |
| 6 | |
| 7 | int val; |
| 8 | |
| 9 | val = tab[0]; // Read the first slot -> val = 10 |
Iterating with a loop
This is the most common pattern in C:
| 1 | int tab[5]; |
| 2 | int i; |
| 3 | |
| 4 | i = 0; |
| 5 | while (i < 5) |
| 6 | { |
| 7 | tab[i] = i * 10; // tab = {0, 10, 20, 30, 40} |
| 8 | i++; |
| 9 | } |
To read:
| 1 | i = 0; |
| 2 | while (i < 5) |
| 3 | { |
| 4 | // do something with tab[i] |
| 5 | i++; |
| 6 | } |
Arrays and pointers -- The relationship
The array name is an address
In C, the name of an array is automatically converted to a pointer to its first element:
| 1 | int tab[5]; |
| 2 | int *p; |
| 3 | |
| 4 | p = tab; // p points to tab[0] |
tab[i] is shorthand
The expression tab[i] is actually *(tab + i) -- "take the address of tab, advance by i elements, and read the value". The compiler does this calculation automatically.
That's why indexes start at 0: tab[0] = *(tab + 0) = the value at the array's starting address, without advancing.
Passing an array to a function
When you pass an array to a function, you pass a pointer. The function can modify the original array:
| 1 | void ft_fill(int *tab, int size, int value) |
| 2 | { |
| 3 | int i; |
| 4 | |
| 5 | i = 0; |
| 6 | while (i < size) |
| 7 | { |
| 8 | tab[i] = value; |
| 9 | i++; |
| 10 | } |
| 11 | } |
| 12 | |
| 13 | int main(void) |
| 14 | { |
| 15 | int grades[5]; |
| 16 | |
| 17 | ft_fill(grades, 5, 0); |
| 18 | // grades = {0, 0, 0, 0, 0} |
| 19 | return (0); |
| 20 | } |
Two important points:
Common array operations
Finding the maximum
The principle: take the first element as the provisional maximum, then compare each subsequent element. If an element is larger, it becomes the new maximum.
Reversing an array
The principle: swap symmetric elements -- the first with the last, the second with the second-to-last, etc. Stop at the middle.
Before: [10, 20, 30, 40, 50]
swap(10, 50) -> [50, 20, 30, 40, 10]
swap(20, 40) -> [50, 40, 30, 20, 10]
middle reached -> done
After: [50, 40, 30, 20, 10]Sorting an array
The simplest sort is selection sort: for each position, find the smallest element in the rest of the array and swap it with the current position. Two nested loops.
[50, 30, 10, 40, 20]
Position 0: min in whole array = 10 -> swap with 50 -> [10, 30, 50, 40, 20]
Position 1: min in the rest = 20 -> swap with 30 -> [10, 20, 50, 40, 30]
Position 2: min in the rest = 30 -> swap with 50 -> [10, 20, 30, 40, 50]
Position 3: min in the rest = 40 -> no swap -> [10, 20, 30, 40, 50]Array limitations
No bounds checking
C never checks that your index is within bounds:
| 1 | int tab[5]; |
| 2 | |
| 3 | tab[5] = 99; // OUT OF BOUNDS -- the 6th element doesn't exist |
| 4 | tab[100] = 99; // OUT OF BOUNDS -- way too far |
| 5 | tab[-1] = 99; // OUT OF BOUNDS -- before the start |
The program doesn't necessarily crash -- that's worse. It might:
- Silently overwrite other variables
- Crash later in an unexpected place
- Appear to work (the most dangerous bug)
This is undefined behavior (the program can do anything). It's your responsibility to stay within bounds.
No copy by assignment
| 1 | int a[5]; |
| 2 | int b[5]; |
| 3 | |
| 4 | b = a; // Compilation ERROR |
To copy an array, you must copy element by element with a loop.
No returning a local array
| 1 | int *bad(void) |
| 2 | { |
| 3 | int tab[5]; |
| 4 | |
| 5 | return (tab); // DANGER -- tab is destroyed when the function ends |
| 6 | } |
The local array lives in the function's memory. When it ends, that memory is freed. The returned pointer points to nothing. To return an array, you'll need malloc (course on malloc & free).
Quiz
Test your knowledge — select the correct answer for each question
int tab[10]?tab (the array name) evaluate to when used in an expression?tab[10] in an int tab[5] array?int *tab and int tab[] in function parameters -- what's the difference?int b[5]; b = a; compile?int tab[5] array takes up how many bytes in memory?