March, 1998 Learning to C

Part 8: Arrays, Characters, and Strings

Often in C programming you need a long list of variables of the same type to store data values. The obvious way to do this would be to create a separate variable for each value you need to store. While this would work, it would be very inefficient and would require you to think of a different name for each variable. An easier way is to store the values, all of the same type, in an array. An array is simply a list of variables of the same type that share one identifier and are told apart by a unique index. You can create an array of any type, including types you define yourself, like structs, which will be discussed in a future article.

Suppose you wanted to hold 10 integers in a list. You would define your array like this:

int my_array[10];

The number used is 10 because you want an array of 10 integers. However, arrays in C start at index 0, so the elements are numbered 0 through 9, for a total of 10 elements. To access a variable stored in the array, you use the identifier and the element index like this:

my_array[0] = 1;
my_array[9] = 10;

If you try to assign a value to my_array[10], the program will attempt to write to memory past the end of your array, and you could overwrite another variable or part of the program, or crash your computer. The C programming language does not check for out-of-bounds requests and will do exactly what you ask, so be careful to ensure your program never requests an element out of bounds.

Just like any other variable, always be sure to initialize the values in your array before you use them, or they could contain garbage data. It is easy to step through array elements for use or storage with a for loop, using the loop counter as the array element index. Here is an example:

/* Array.c v1.1 (07/10/98) */
#include <stdio.h>

int main(int argc, char *argv[])
{
   int my_array[5], count;

   for(count=0;count <= 4;count++)
   {
      my_array[count] = count*2;
   }
   for(count=0;count <= 4; count++)
   {
      printf("Element %d is : %d\n", count, my_array[count]);
   }
   return 0;
}

This program declares an array of integers named my_array and a counter variable, then assigns to each element of the array the element index times 2. Then the program uses a second loop to display the contents of the array. The program's output would be:

Element 0 is : 0
Element 1 is : 2
Element 2 is : 4
Element 3 is : 6
Element 4 is : 8

In the printf() statement, the first %d placeholder is replaced with the element index, and the second %d placeholder is replaced with the value stored in that element of the array.

An array can be multi-dimensional. This means that instead of a list of variables, labeled from 0 through the end of the array, it can be a two-dimensional grid, a three-dimensional cube, or as many dimensions as you choose to give it. Anything more than three dimensions is usually more than necessary, but if you have a use for it and can easily maintain it, it is possible. To define a multi-dimensional array, you simply add an additional subscript to the array identifier for each extra dimension.

int my_2d_array[9][3];
int my_3d_array[9][18][2];

The first example declares a two-dimensional array, 9 elements by 3 elements. Remember, the first element in an array in C is always element 0, so the last valid index in each dimension is the declared size minus 1. The second example is a three-dimensional array. The dimensions do not need to be equal, but they must be positive integers. It is usually easiest to think of arrays of multiple dimensions as a list (one dimension), a grid (two dimensions) or a cube (three dimensions). However, computer memory is always linear, and the actual values are not stored in multiple dimensions but in a line.

For my_2d_array, the elements are allocated in memory like this:

[0][0]  [0][1]  [0][2]  [1][0]  [1][1]  [1][2]  [2][0]  [2][1]  [2][2] ...

For most people it is much easier to picture a two dimensional array as a grid, so this is how arrays are usually shown. The memory allocation information is only required when you are using pointer arithmetic and that will be described in detail in a future article.

The identifier of the array, without any subscript, acts in most expressions as a memory pointer to the first element in the array. This is handy when addressing elements in sequence. It comes up again in the discussion of character arrays below and will be covered in more depth in the article on pointer arithmetic.

One of the most common types of arrays in C is an array of characters. There is no character string variable type in C like there is in BASIC and other high-level languages, so a string is stored as an array of characters, delimited, or terminated, with a null character, '\0', at the end. Here is an example:

/* Simplestr.c v1.0 (07/10/98) */
#include <stdio.h>

int main(int argc, char *argv[])
{
   char my_string[256];

   my_string[0]='t';
   my_string[1]='e';
   my_string[2]='s';
   my_string[3]='t';
   my_string[4]='\0';
   printf("%s\n",my_string);
   return 0;
}

This program declares an array of 256 characters, indexed 0 through 255. The program then assigns the character constants t, e, s and t to the first four elements of the array, then adds the string delimiter, the null character, \0, after them. Then the printf() function is used to display the string by using the pointer to the first element of the string (array). The %s placeholder tells printf() that it is looking for a string pointer. As mentioned earlier, the array identifier without subscripts is a memory pointer to the first element of the array, and the complete string runs from that first element up to the element that contains the null character. The output of this program is simply the word "test".

A character constant actually refers to the numeric code of the character, which on most systems is its ASCII, or American Standard Code for Information Interchange, value. For example, 'A' is equivalent to the integer 65. If you changed the above program's printf() statement to read:

printf("%d",my_string[0]);

The program would display the number 116, which is the ASCII equivalent of the character 't'. Character constants differ from string constants in the delimiters used and in the fact that strings are followed by a null character. A string constant is delimited by double quotes, while a character constant is delimited by single quotes.

There are much easier ways to put strings into character arrays than the method used in the last example, but you cannot assign a string to an array directly with the = operator. Since there are so many different functions that deal with the manipulation of strings, this will be covered in a separate article.

Arrays are very handy whenever you need a list or multi-dimensional grid of variables that are all of the same type. When using strings and string variables you will get a lot of practice using character arrays. It can't be emphasized enough: practice is the best way to fully understand the information given here. Just remember that an array is only a numbered list of like-typed variables.