------------------------------------------------ "Understanding Linked Lists" ------------------------------------------------ C/O :: arp of DynamicHell Development Team ------------------------------------------------ http://dynamichell.org | irc.dynamichell.org ------------------------------------------------ This tutorial is aimed at users with a basic understanding of C and, more importantly, pointers. This technique is portable and can be used on any operating system with a standard C library and compiler. There are many data structures available to a programmer. Perhaps the most common being the linked-list. A linked-list is exactly what it's description suggests: a collection of lists (data structures) all linked together. Linked-lists are linked data structures, joined together by pointers to the next member of a list (or if a double linked-list) by pointers to the previous and the next node in the list. Linked-lists are reasonably fast to traverse as all a process must do is access the pointer to the next node until the required node, or the end is found (to add another node). This is much faster than accessing an array of structures, incrementing the index, checking the structure, and repeating the process. Furthermore, linked-lists allow the programmer to forget the number of nodes in the data structure, unlike an array which is statically allocated. The following diagram shows a visual representation of a linked-list: Node : === Pointer : --- ZERO---===---===---===---===---===---===---===---NULL The root node is usually allocated but unused (possibly zeroed). The end of the list is indicated by a NULL pointer. The NULL pointer is important as it indicates to any program traversing the list its end; the program knows where the list ends and can act appropriately. Implementing a Linked-list (C) ============================== The traditional structure of a node (the instrument for holding our data) is as follows: struct node { char *data; struct node *next; /* Pointer to next node */ }; Of course we need a means of allocating our nodes for our dynamic linked-list. A simple function such as the following will suffice: struct node *node_alloc(void) { struct node *tmp = calloc(1, sizeof(struct node)); if(tmp == NULL) exit(EXIT_FAILURE); return tmp; } We also need to be able to add data to our newly created structs. A function which calls the above node_alloc and adds data to the list follows: void node_add_data(struct node *rootnode, char *data) { struct node *tmp; for(tmp = rootnode; tmp->next != NULL; tmp = tmp->next); /* Traverse the list until node->next is NULL which we will add to. When adding we do not traverse until node == NULL, otherwise we are at the end of the list and unable to connect to a previous node. */ if(tmp->next == NULL) { /* Sanity Check */ tmp->next = node_alloc(); /* Connect to the list */ tmp = tmp->next; tmp->data = strdup(data); tmp->next = NULL; /* Create the new end */ } else exit(EXIT_FAILURE); } With functions for creating nodes and adding data (nodes) to the list, we are now ready to write a function suitable for reading the data from the list. void display_list(struct node *rootnode) { struct node *tmp; for(tmp = rootnode; tmp != NULL; tmp = tmp->next) { if(tmp->data != 0) { /* For our root node, explanation later */ printf("Data is %s\n", tmp->data); } } } We are almost finished. The only thing left to do is to clean up once we have finished using our linked list. The following function is a typical example of how to free a linked list and its data. void free_list(struct node *rootnode) { struct node *tmp = rootnode; struct node *fwd; while(tmp != NULL) { fwd = tmp->next; free(tmp->data); free(tmp); tmp = fwd; } } Notice that throughout all of these examples we never actually modify rootnode directly. If we did this then we would have no pointer to our root node, and thus, our linked-list would break. Final Code ========== #include #include #include struct node { char *data; struct node *next; /* Pointer to next node */ }; void display_list(struct node *rootnode); struct node *node_alloc(void); void node_add_data(struct node *rootnode, char *data); void free_list(struct node *rootnode); int main(void) { struct node *rootnode = node_alloc(); /* Root node, which remains zeroed, and is not used to store data.` ` */ node_add_data(rootnode, "This is our data"); node_add_data(rootnode, "This is another data"); display_list(rootnode); free_list(rootnode); return EXIT_SUCCESS; } struct node *node_alloc(void) { struct node *tmp = calloc(1, sizeof(struct node)); if(tmp == NULL) { fprintf(stderr, "calloc()\n"); exit(EXIT_FAILURE); } return tmp; } void node_add_data(struct node *rootnode, char *data) { struct node *tmp; for(tmp = rootnode; tmp->next != NULL; tmp = tmp->next); /* Traverse the list until node->next is NULL which we will add to. When adding we do not traverse until tmp == NULL, otherwise we are at the end of the list and unable to connect to a previous node. */ if(tmp->next == NULL) { /* Sanity Check */ tmp->next = node_alloc(); /* Connect to the list */ tmp = tmp->next; tmp->data = strdup(data); tmp->next = NULL; /* Create the new end */ } else { fprintf(stderr, "Oh shit!\n"); exit(EXIT_FAILURE); } } void display_list(struct node *rootnode) { struct node *tmp; for(tmp = rootnode; tmp != NULL; tmp = tmp->next) { if(tmp->data != 0) { /* For our root node, explanation later */ printf("Data is %s\n", tmp->data); } } } void free_list(struct node *rootnode) { struct node *tmp; struct node *fwd; while(tmp != NULL) { fwd = tmp->next; free(tmp->data); free(tmp); tmp = fwd; } } Summary ======= Linked lists are relatively fast data structures. They are easy to implement and are used throughout the software industry. Examples include such high profile projects as the Linux Kernel. The root node of the list is never directly accessed, only pointers to it, so not to damage the data structure. The end of the list is indicated by a NULL pointer, and the programmer must remember to set it. Rather than traversing to the end of the list when adding, the programmer must move to the last node, so that the next node is still connected to the list. Moving to the NULL pointer would change the address of the pointer which the previous last node pointer to; breaking the list. However, when wanting to read all the nodes in the list it is perfectly acceptable to traverse the whole list, down to the NULL pointer. When printing data from the linked list you must remember to ignore data from the root node. An earlier example which tested whether the data was zero is an acceptable method. Copryright (C) 2006. Alastair Poole. Verbatim copying and distribution of this entire article are permitted worldwide, without royalty, in any medium, provided this notice, and the copyright notice, are preserved.