The Linked list reference article from the English Wikipedia on 24-Jul-2004
(provided by Fixed Reference: snapshots of Wikipedia from wikipedia.org)

Linked list

For people who check facts
In computer science, a linked list is one of the simplest data structures used in computer programming. It consists of a sequence of nodes, each containing arbitrary data fields and one or two references ("links") pointing to the next and/or previous nodes. Linked lists permit insertion and removal of nodes at any point in the list in constant time, but do not allow random access.

Table of contents
1 Variants
2 Applications of linked lists
3 Tradeoffs
4 Linked List Operations
5 Language Support
6 Generic Linked Lists
7 Linked Lists On The Web

Variants

Singly-linked list

The simplest kind of linked list is a singly linked list, which has one link per node. This link points to the next node in the list, or to a null value if it is the last node.

Image:Singly_linked_list.png
A singly linked list containing three integer values

To specify a singly-linked list (e.g. when handling it over to a procedure) it suffices to specify the address of its first element.

Doubly-linked list

A more sophisticated kind of linked list is a doubly linked list. Each node has two links, one to the previous node and one to the next.

Image:doublylinkedlist.png
An example of a doubly linked list.

Xor-linking is an elegant alternative to doubly-linked lists, where the two pointers fields are replaced by a single field.

Circular list

In a circularly linked list, the first and last nodes are linked together. This can be done for both singly and doubly linked lists. To traverse a circular linked list, you begin at any node and follow the list in either direction until you return to the original node.

Singly linked lists are often handled by giving the address of the last element — because that element also provides quick access to the first one, whereas the converse is not true.

Sentinel nodes

Linked lists sometimes have a special dummy or sentinel node at the beginning and/or at the end of the list, which is not used to store data. Its purpose is to simplify or speed up some operations, by ensuring that every data node always has a previous and/or next node, and that every list (even one that contains no data elements) always has a "first" and "last" node.

Applications of linked lists

Linked lists are used as a building block for many other data structures, such as stacks, queues and their variations.

The "data" field of a node can be another linked list. By this device, one can emulate arbitrary linked data structures with lists — a principle which is fundamental to the Lisp programming language, where linked lists are the primary (if not the only) data structuring mechanism.

Sometimes, linked lists are used to implement associative arrays, and are in this context called association lists. There is very little good to be said about this use of linked lists; they are easily outperformed by other data structures such as self-balancing binary search trees even on small data sets (see the discussion in associative array). However, sometimes a linked list is dynamically created out of a subset of nodes in such a tree, and used to more efficiently traverse that set.

Tradeoffs

As for most choices in computer programming and design, no method is well suited to all circumstances. A linked list data structure might work well in one case, and cause problems in another. This is a list of some of the common tradeoffs involving linked list structures. In general, if you have a dynamic collection, where elements are frequently being added and deleted, and the location of new elements added to the list is significant, then benefits of a linked list increase.

Linked lists vs. arrays

Linked lists have several advantages over arrays. For one thing, a linked list allows a new element to be inserted or deleted in any position, with a constant number of operations (changing a couple of pointers). Linked lists can also save memory in applications that need one or more lists of unpredictable size, since the total space used will be proportional to the number of elements actually present in all lists (as opposed to the number of lists times the maximum size of each list).

Further memory savings can be achieved, in certain cases, by sharing the same tail elements among two or more lists. In this way, one can add new elements to the front of the list while keeping a reference to both the new and the old versions — which provides the simplest example of a persistent data structure.

On the other hand, compared to an array, the chief disadvantage of a singly linked list is that it only allows sequential access to elements, and only in one direction (from front to back). In particular, when removing or inserting an element in the middle of the list, one needs to know the address of the preceding element.

Another disadvantage of linked lists is the extra storage needed for the address fields, which often makes them impractical for lists of small data items such as characters or boolean values. Even more significant is the time cost of allocating memory separately for each new element (and, in some systems, of locating and recycling elements that cannot be reached; see garbage collection). There are a number of techniques to reduce this overhead, such as using a free list or memory pool, storing the data itself inside the nodes, and CDR coding.

Double-linking vs. single-linking

Double-linked lists require more space per node (unless one uses xor-linking), and their elementary operations are more expensive; but they are often easier to manipulate because they allow sequential access to the list in both directions. In particular, one can insert or delete a node, in a constant number of operations, given only that node's address. On the other hand, they do not allow tail-sharing, and cannot be used as persistent data structures.

Circular lists

Circular linked lists are most useful for describing naturally circular structures, and have the advantage of regular structure and being able to traverse the list starting at any point. They also allow quick access to the first and last records through a single pointer (the address of the last element). Their main disadvantage is the complexity of iteration, which has subtle special cases.

Sentinel nodes

Sentinel nodes may simplify certain list operations, by ensuring that the next and/or previous nodes exist for every element. However sentinel nodes use up extra space (especially in applications that use many short lists), and they may complicate other operations.

Linked List Operations

When manipulating linked lists in-place, care must be taken to not use values that you have invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.

Singly-linked lists

Our node data structure will have two fields:

We also keep a variable head which always points to the first node in the list, or is null for an empty list. Traversal of a singly-linked list is easy:
node := head
while node not null
   
   node := node.next
The following code inserts a node after an existing node in a singly linked list. Inserting a node before an existing one cannot be done; instead, you have to locate it while keeping track of the previous node.
insertAfter(node, newNode)
   newNode.next := node.next
   node.next    := newNode
Inserting at the beginning of the list requires a separate function. This requires updating head.
insertBeginning(newNode)
   newNode.next := head
   head         := newNode
Similarly, we have functions for removing the node after a given node, and for removing a node from the beginning of the list. To find and remove a particular node, one must again keep track of the previous element.
removeAfter(node)
   nodeBeingRemoved := node.next
   node.next := node.next.next
   destroy nodeBeingRemoved

removeBeginning(node)
   nodeBeingRemoved := head
   head := head.next
   destroy nodeBeingRemoved
Notice that removeBeginning() sets head to null when removing the last node in the list.

Doubly-linked lists

With doubly-linked lists there are even more pointers to update, but also less information is needed, since we can use backwards pointers to observe preceding elements in the list. This enables new operations, and eliminates special-case functions. We will add a prev field to our nodes, pointing to the previous element, and a tail variable which always points to the last node in the list. Both head and tail are null for an empty list.

Iterating through a doubly linked list can be done in either direction. In fact, direction can change many times, if desired.

Forwards

node := head
while node not null
   
   node := node.next
Backwards
node := tail
while node not null
   
   node := node.prev

These symmetric functions add a node either after or before a given node:

insertAfter(node,newNode)
   newNode.prev := node
   newNode.next := node.next
   if node.next is null
       tail := newNode
   else
       node.next.prev := newNode
   node.next    := newNode

insertBefore(node,newNode)
   newNode.prev := node.prev
   newNode.next := node
   if node.prev is null
       head := newNode
   else
       node.prev.next := newNode
   node.prev    := newNode
We also need a function to insert a node at the beginning of a possibly-empty list:
insertBeginning(newNode)
   if head is null
       head := newNode
       tail := newNode
       newNode.prev := null
       newNode.next := null
   else
       insertBefore(head, newNode)
A symmetric function inserts at the end:
insertEnd(newNode)
   if tail is null
       insertBeginning(newNode)
   else
       insertAfter(tail, newNode)
Removing a node is easier, only requiring care with head and tail:
remove(node)
   if node.prev is null
       head = node.next
   else
       node.prev.next = node.next
   if node.next is null
       tail := node.prev
   else
       node.next.prev = node.prev
   destroy node
One subtle consequence of this procedure is that deleting the last element of a list sets both head and tail to null.

A curious way of implementing doubly linked lists that uses half as much space for pointers is the xor linked list.

Circularly-linked lists

Circularly-linked lists can be either singly or doubly linked. For both, their circular versions, which link all the nodes into a continuous circle without using null, benefit from the ability to traverse the full list beginning at any given node. This often allows us to avoid storing head and tail, although if the list may be empty we need a special representation for the empty list, such as a head variable which points to some node in the list or is null if it's empty; we use such a head here.

This representation significantly simplifies adding and removing nodes into a non-empty list, but empty lists are a special case. Assuming that someNode is some node in the list, this code iterates through that list starting with someNode:

Forwards

node := someNode
do
   
   node := node.next
while node not someNode

Backwards

node := someNode
do
   
   node := node.prev
while node not someNode

Notice the postponing of the test to the end of the loop. This is important for the case where the list contains only the single node someNode.

This simple function inserts a node into a doubly-linked circularly linked list after a given element:

insertAfter(node, newNode)
   newNode.next := node.next
   newNode.prev := node
   node.next.prev := newNode
   node.next      := newNode

Inserting an element in a possibly empty list requires a special function:

insertBeginning(node)
   if head is null
       node.prev := node
       node.next := node
   else
       insertAfter(head.prev, node)
   head := node

Finally, removing a node must deal with the case where the list empties:

remove(node)
   if node.next is node
       head := null
   else
       node.next.prev := node.prev
       node.prev.next := node.next
   destroy node

Language Support

Many programming languages, such as Lisp and Scheme have singly linked lists built in. In other languages it is simple to create the data structure using references. For example, here is a complete example in C which adds the numbers 1 through n using a linked list:

struct node {
    int value;
    struct node *next; /* next node in list */
};

int add_up_to(int n) {
    struct node first_node; /* first item in the linked list */
    struct node *p = &first_node;
    int i, sum;
    p->next = NULL;

    /* Add the numbers 1 through n to the linked list */
    for(i=1; i <= n; i++) {
        struct node *new_node =
            (struct node *)malloc(sizeof(struct node));

        p->value = i;

        new_node->next = NULL;

        p->next = new_node;
        p = new_node;
    }

    /* Iterate through the list, summing up the values */
    sum = 0;
    for(p = &first_node; p != NULL; p = p->next) {
        sum = sum + p->value;
    }

    return sum;
}

Warning: This is just an example. This is not the best way to add the numbers 1 through n, and the memory allocated for the list nodes is not being freed, resulting in a memory leak. Take caution.

Generic Linked Lists

The above examples of linked lists use the more common method of embedding the "next" pointer within the data structure. This is the simplest way to implement a linked list, especially if only one linked list will be implemented within an application.

One drawback with this approach is that the routines to insert, delete, and search the linked list must be written for each linked list. Since these operations are very similar, this results in duplicate effort if more than one linked list are implemented, or a linked list has more than one link (e.g., linked lists to sort by name and ID fields.)

If extensive use of linked lists are anticipated, then there may be an advantage to implementing a generic linked list package that can be used to create linked lists of possibly different items. This approach is similar to how some LISP implementations handle linked lists.

Within LISP, each node of a linked list contains two pointers:

This approach takes more effort to implement, but is a more powerful linked list, since one implementation can be used to handle several lists. Each list could handle a different type of element.

Here is an example of a generic linked list data structure:

typedef struct node {
 void *car; /* pointer to data associated with node */
 struct node *cdr; /* pointer to the next node */
} Node;

Since car is a void pointer, any type of data can be linked, although the programmer must cast the datum into the proper data type.

Suppose we have two different structures we want to store in linked lists:

struct foo {
 char *name;
 char *addr;
 char *phone;
}

struct bar {
 int id;
 char *name;
}

We can set up two linked lists for these structures:

Node *listFoo = NULL;
Node *listBar = NULL;

We can also create generic add and delete routines. These routines assume that all parameters are valid so as not to clutter the example with error handling code. Production code would test for and handle invalid arguments.

void insert_node(Node *prev, void *item)
{ /* add item to list after prev node */
  Node *n; /* will point to new node to be added */
  n = (Node *)malloc(sizeof(struct node)); /* new node */
  n->car = item; /* point to item */
  n->cdr = prev->next; /* point to rest of list */
  prev->next = n; /* add this node to the list */
}

void *delete_node(Node *prev)
{ /* delete item following prev node, return deleted item */
  void *ret; /* will point to item from deleted node */
  Node *t; /* temp node pointer for cell to be deleted */
  t = prev->next; /* node that will be deleted */
  ret = t->car; /* this is the item to be removed from list */
  prev->next = t->next; /* point around node to be deleted */
  free(t); /* free deleted node */
  return ret;  /* return item that was removed */
}

To insert a foo structure f and bar structure b to their respective lists, we would write:

struct foo *f; /* the foo structure we want to insert */
struct bar *b; /* the bar structure we want to insert */
Node *foo_node; /* node used to search the foo list */
Node *bar_node; /* node used to search the bar list */
[statements]
insert_node(foo_node, f);
insert_node(bar_node, b);

To delete element after the specified nodes, we could write the following code. Remember that the delete routine returns the data associated with the deleted node. This is useful so that the caller can free up any allocated memory if needed.

f = (struct foo *)delete_node(foo_node);
b = (struct bar *)delete_node(bar_node);

There are a number of advantages of this approach over the more traditional implementation. In addition to less code, it is also possible to have more than one linked list containing the same data elements, since including an element within the linked list doesn't involve the actual data element. For example, if you want to link the bar structure by id and name fields using the traditional approach, you would need two next pointers within the structure (e.g., next_id and next_name). If using the generic linked lists mentioned here, the programmer need only write compare functions for each list.

The major drawback of this approach is that it takes some additional effort to handle the data elements. For example, to examine a data element, the caller must provide a function to access the item. This function will be used by the generic routine to compare the current node with the item being searched for.

Here is a generic search function that returns the node pointing to the specified item. It assumes the compare function returns values similar to strcmp:

Node *find_item(Node *list, void *item, int (*cmp_function)(void *item1, void *item2))
{
 Node *t; /* temp node for walking through the list */
 for (t=list; t!=NULL; t=t->next)
 { /* loop through list */
  if (find_function(t->car, item)==0) return t; /* if found, return it */
 }
 return NULL; /* not found */
}

The following compare functions could be used for the foo and bar structures, assuming we are searching on the name and id fields:

int cmp_foo(void *item1, void *item2)
{
 struct foo *f1, *f2;
 f1=(struct foo *)item1; /* first structure */
 f2=(struct foo *)item2; /* second structure */
 return strcmp(f1->name, f2->name);
}

int cmp_bar(void *item1, void *item2)
{
 struct bar *b1, *b2;
 b1 = (struct bar *)item1; /* first structure */
 b2 = (struct bar *)item2; /* second structure */
 return b1->id - b2->id;
}

The following code could be used to find items in the respective linked lists:

struct foo *f; /* foo structure to find */
struct bar *b; /* bar structure to find */
char *foo_name; /* will be set to foo name we are searching for */
int bar_id; /* will be set to id field we are looking for */
[statements]
f = (struct foo *)find_item(listFoo, foo_name, cmp_foo); 
b = (struct bar *)find_item(listBar, bar_id, cmp_bar);


A free generic linked list package called LinkList written by Bill Pringle  
is available here.

Variations on Generic Linked Lists

A number additional structures can be created to assist when working with generic linked list. In many cases, care must be taken to update these auxiliary structures when the linked list changes.

Double Linked List

To change the above generic linked list structure from a single to double linked list, we must add a backward pointer, and slightly modify some of the routines that deal with the lists.

typedef struct node {
 void *car; /* pointer to data associated with node */
 struct node *cdr; /* pointer to the next node */
 struct node *prev; /* backwards pointer to previous node */
} Node;

void insert_node(Node *prev, void *item)
{ /* add item to list after prev node */
  Node *n; /* will point to new node to be added */
  n = (Node *)malloc(sizeof(struct node)); /* new node */
  n->car = item; /* point to item */
  n->cdr = prev->next; /* point to rest of list */
  n->prev = prev; /* point back to previous node */
  n->cdr->prev = n; /* next node points back to this node */
  prev->next = n; /* add this node to the list */
}

void *delete_node(Node *prev)
{ /* delete item following prev node, return deleted item */
  void *ret; /* will point to item from deleted node */
  Node *t; /* temp node pointer for cell to be deleted */
  t = prev->next; /* node that will be deleted */
  ret = t->car; /* this is the item to be removed from list */
  prev->next = t->next; /* point around node to be deleted */
  prev->next->prev = prev; /* point backward around node to be deleted */
  free(t); /* free deleted node */
  return ret;  /* return item that was removed */
}

Routines that traverse the list (such as the find_item method above) can move backwards through the list using the prev pointer.

Linked List Data Structure

When a linked list is defined by the node structure shown above, if an item may be inserted at the front of the list, it is necessary to pass a pointer to the node. This can get messy, and is a common source of confusion with beginning linked list programmers. This can be simplified by creating a structure that defines the linked list:

typedef linklist {
  Node *head; /* pointer to first element in the list */
  Node *tail; /* pointer to the last element in the list */
  Node *current; /* pointer to the current element in the list */
} LinkedList

Only the head pointer is needed in the above structure, but the other pointers can be useful when working with lists. Maintaining a pointer to the tail of the list makes appending items to the end of a list easier, since there is no need to traverse the list to find the last entry (see below). The concept of the current element in a list can be useful if you are processing all the elements in a list, such as a tokenizer that returns the next element in a list each time it is called.

Here is a sample routine to append an item to the end of a linked list:

void append_item(LinkedList *list, void *item)
{ /* append the specified item to the end of the linked list */
 Node *n; /* temp node for new item */
 n = (Node *)malloc(sizeof(Node)); /* create node */
 n->car = item; /* point to item */
 n->cdr = NULL; /* last node in list */
 n->prev = list->tail; /* point back to former tail element */
 list->tail->next = n; /* add new node to after former tail */
 list->tail = n; /* new tail on list */
}

Skip Lists

A skip list is a method of quickly traversing a long linked list by creating a super-structure list that points to nodes that are far apart. You can think of a skip list as similar to an expressway where an express lane and a local exit lane runs parallel to each other. Cars in the express lane must move to the local exit lane when near their exit.

A skip list can be implemented using the generic node structure from above. To construct a linked list, we create a new linked list that points to every n nodes:

LinkedList *longList; /* this is the long list we want to build a skip list for */
LinkedList *skipList; /* this is the skip list we will build */
int skip_interval; /* will be set to the initial span between nodes in skip list */
Node *n; /* node for traversing the long linked list */ 
void *item; /* new node for the skip list */
[statements]
for (n=longList->head; n != NULL; )
{
 s = (Node *)malloc(sizeof(Node)); /* create node for skip list */
 item = n; /* point to current node in long linked list */
 append_item(skipList, item); /* add the element to the end of the skip list */
 for (i=0; n != NULL && i < skip_interval; i++) n=n->next;
}

Now, instead of searching the long linked list for an item, the programmer can search the skip list. If the item pointed to by the next cell in the skip list is less than the item you are looking for, you change your search to the actual linked list. This reduces the number of cells that are examined, since you look every 100 cells until you get close to the item you are looking for.

Node *find_skip(LinkedList *skiplist, void *item, int (*cmp_function)(void *item1, void *item2))
{ /* return the node containing the specified item */
 Node *n; /* node for searching the linked list */
 for (n=skiplist; n!=NULL; n=n->next)
 { /* see if it is time to switch to the actual linked list */
  if (n->cdr!=NULL && cmp_function(n->cdr->car, item) < 0)
  { /* switch to the actual linked list */
   return find_item(n->car, item, cmp_function); /* find item in linked list */
  }
 } /* end loop through skip list */
 return NULL; /* not found */
}

Partial Index for Linked List

Another method of reducing the time required to search a linked list is to build an index for the list. Most likely, this will be a partial index that will indicate where to begin searching the list when looking for a certain value.

For example, an index for the above bar list could contain a pointer to nodes based on id values of 0, 100, 200, etc., while an index for the above foo list could contain where to start looking for nodes with names starting with "A", "B", "C", etc.

Here is a sample index for the foo structure based on the first letter of the name:

Node *fooIndex[26]; /* index based on first letter */

The above index must be initialized to point to the first node with a name that starts with the given letter. If no such entry exists, the pointer could be NULL or point to the first node before or after that letter.

Node *index_search_foo(char *name)
{ /* find entry for given name */
 Node *n; /* node for searching the foo list */
 int ndx; /* offset into index to start search */
 ndx = 'A' - toupper(*name); /* offset based on first letter */
 return find_item(fooIndex[ndx], name, cmp_foo);
}

Linked Lists On The Web

Some linked list materials are available from the Stanford Computer Science department: