hashing

hashing A technique that is used for organizing tables to permit rapid searching or table lookup, and is particularly useful for tables to which items are added in an unpredictable manner, e.g. the symbol table of a compiler. Each item to be placed in the table has a unique key. To place it in the hash table a hash function is used, which maps the keys onto a set of integers (the hash values), each of which identifies a position in the table. The function is chosen to distribute the keys fairly evenly over the table (see hashing algorithm); since the mapping is not one-to-one, two different keys may map onto the same integer.
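As an illustration of a hash function, here is a minimal sketch for string keys. The name hash_value, the multiplier 31, and the table size used below are assumptions made for the example and are not part of the entry; many other functions would serve equally well.

```python
def hash_value(key: str, table_size: int) -> int:
    """Map a string key to an integer in range(table_size)."""
    h = 0
    for ch in key:
        # Multiply-and-add over the character codes, reduced modulo the
        # table size so the result is always a valid table position.
        # The multiplier 31 is an arbitrary choice for this sketch.
        h = (h * 31 + ord(ch)) % table_size
    return h


if __name__ == "__main__":
    # Hypothetical identifiers, as might appear in a compiler's symbol table.
    for name in ["count", "total", "index", "buffer"]:
        print(name, hash_value(name, 11))
```

Because there are far more possible keys than table positions, some pairs of distinct keys must receive the same hash value whatever function is chosen.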

In the simplest version of the technique, the hash value identifies a primary position in the table; if this is already occupied, successive positions are examined until a free one is found (treating the table as circular). The item with its key is inserted in the table at this position. To locate an item in the table a similar algorithm is used. The hash value of the key is computed and the table entry at this position is examined. If the key matches the required key, the item has been located; if not, successive table positions are examined until either an entry with a matching key is found or an empty position is found. In the latter case it can be concluded that the key does not exist in the table, since the insertion procedure would have placed it in this empty position. For the technique to work, there must be rather more table positions than there are entries to be accommodated. Provided that the table is not more than about 60% full and the hash function distributes the keys evenly, an item that is present in the table can be located by examining, on average, fewer than two table positions; a search for a key that is absent typically examines more.
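The insertion and search procedures just described can be sketched as follows. The class and method names are hypothetical, Python's built-in hash() stands in for the hash function, and one position is always left empty so that an unsuccessful search is certain to stop.

```python
class LinearProbingTable:
    """Sketch of the simplest scheme: probe successive positions, circularly."""

    def __init__(self, size: int):
        self.size = size
        self.slots = [None] * size  # None marks an empty position
        self.count = 0

    def _primary(self, key) -> int:
        # The built-in hash() stands in for the hash function (an assumption).
        return hash(key) % self.size

    def insert(self, key, item) -> int:
        pos = self._primary(key)
        # Examine successive positions until a free one, or the same key, is found.
        while self.slots[pos] is not None and self.slots[pos][0] != key:
            pos = (pos + 1) % self.size
        if self.slots[pos] is None:
            # Keep at least one position empty so that searches terminate.
            if self.count >= self.size - 1:
                raise OverflowError("table is full")
            self.count += 1
        self.slots[pos] = (key, item)
        return pos

    def lookup(self, key):
        pos = self._primary(key)
        while self.slots[pos] is not None:
            if self.slots[pos][0] == key:
                return self.slots[pos][1]  # matching key: item located
            pos = (pos + 1) % self.size
        return None  # reached an empty position: the key is not in the table


if __name__ == "__main__":
    table = LinearProbingTable(7)
    # Small integers hash to themselves in Python, so 1, 8 and 15 all have
    # primary position 1 in a table of 7 and end up in positions 1, 2 and 3.
    for key, item in [(1, "a"), (8, "b"), (15, "c")]:
        print(key, "stored at", table.insert(key, item))
    print(table.lookup(15), table.lookup(22))
```

In this sketch a missing key is reported as None, so None cannot itself be stored as an item; a fuller version would signal absence differently.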

More sophisticated techniques can be used to deal with the problem of collisions, which occur when the position indicated by the hash value is already occupied; these can improve the performance of table lookup still further. Table lookup and insertion of new items can be interleaved, but if items are deleted from the table the space they occupied cannot normally be reused: simply marking a vacated position as empty would cut short later searches for items stored beyond it.
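The difficulty with deletion can be seen in a small sketch of the simple scheme. Here a vacated position is flagged with a sentinel rather than emptied; the sentinel DELETED and the function names are assumptions made for the illustration, and this is only one common way of handling the problem.

```python
EMPTY = None
DELETED = object()  # sentinel: an item was here, so searches must continue past it


def find(slots, key):
    """Return the position holding key, or None if the key is absent."""
    size = len(slots)
    pos = hash(key) % size
    for _ in range(size):
        entry = slots[pos]
        if entry is EMPTY:
            return None  # a truly empty position ends the search
        if entry is not DELETED and entry[0] == key:
            return pos
        pos = (pos + 1) % size
    return None


def insert(slots, key, item):
    """Store the item at the first empty position; deleted positions are not reused."""
    size = len(slots)
    pos = hash(key) % size
    for _ in range(size):
        entry = slots[pos]
        if entry is EMPTY or (entry is not DELETED and entry[0] == key):
            slots[pos] = (key, item)
            return pos
        pos = (pos + 1) % size
    raise OverflowError("no free position")


def delete(slots, key):
    """Flag the position as deleted instead of emptying it."""
    pos = find(slots, key)
    if pos is not None:
        slots[pos] = DELETED


if __name__ == "__main__":
    slots = [EMPTY] * 7
    for key in (1, 8, 15):  # all three have primary position 1
        insert(slots, key, str(key))
    delete(slots, 8)
    print(find(slots, 15))  # still found, by probing past the deleted position
    slots[2] = EMPTY        # emptying the position instead hides key 15
    print(find(slots, 15))
```

With the flag in place, the search for key 15 passes over the vacated position and still succeeds; once that position is reset to empty, the same search stops early and reports the key as absent.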