Hash table
In computer science, a hash table is a data structure that implements an associative array. Like any associative array a hash table is used to store many key ⇒ value associations (this is a many to one relationship as the hash table is almost universally smaller than the number of keys).
A hash table maintains two arrays, one for keys, one for values (or possibly one array of (key, value) pairs - it doesn't really matter). The elements of these arrays are referred to as buckets. When required to find the associated value for a given key, the key is fed through a hash function to yield an integer (called the hash value). This integer is then the index to the associated value. Commonly the hash value is calculated using modular arithmetic, modulo a prime p (with a hash table of size p). This is known to reduce the frequency of hash collisions.
Hash tables offer insertions, lookups, and deletions in O(1) amortized time, with the caveat that hash collisions need to be dealt with. A collision happens when two or more different keys hash to the same integer (they have the same hash value). Techniques for dealing with collisions include probing, chaining, and rehashing. Note that probing and chaining are generally mutually exclusive.
Probing is, upon finding a collision, moving to the next free bucket in the table (or incrementing by some number other than 1). Chaining involves using each bucket as a pointer to another structure, such as an array, a linked list, or even another hash (preferably with a different size and/or hashing function).
Both probing and chaining can be problematic because, as more and more elements are added to the hash, the O(1) property is lost! Rehashing is a technique to minimize this effect: by increasing the size of the table and recomputing the hash values with respect to the new table size, the O(1) property can be maintained. Hash tables work faster the fewer occupied entries they have because there will be fewer collisions. If the ratio of occupied entries to total entries is kept below some fixed constant then the performance of hash tables will be reasonable.
If all of the keys that will be used are known ahead of time, you can create a perfect hash table, in which all probes are guaranteed to be O(1), and is called perfect hashing. There is also minimal perfect hashing.
Widely useful, hash tables are found in a wide variety of programs. They have been added as primitives to many programming languages (such as Python, Perl, and Lua) where they are sometimes called dictionaries. In these languages, hash tables tend to be used extensively as data structures, sometimes replacing records and arrays. The primary disadvantage of hashing is that logically "adjacent" keys can't be easily located (ie, after locating a key, there typically isn't any special operation for immediately finding the previous or next key in the sorted set of keys).