1/03/2012

Hash Table(Reference WIKIPEDIA)

A Hash Table is a data structure for storing key/value pairs.


Slot: One unit for storing data, the container of data. 
Hash function: maps key to the index of the slot.calculates an index from the data item's key and use this index to place the data into the array.
Hash collision: hash function maps two different keys to one slot.
Load factor:  the ratio n/s between n and the size s of its array of buckets. It represents the portion of the s buckets in the structure that are filled with one of the n stored entries. If it approaches 0, it means that many buckets are empty and memory is wasted.


Collision resolution:
1. Chaining
  • Each slot of the bucket array is a pointer to a linked list that contains the key-value pairs that hashed to the same location.
  • Lookup worst case: O(n). All the entries map to one single slot.
    Hash collision resolved by separate chaining.
    Hash collision by separate chaining with head records in the bucket array.


    2. Open Addressing

    • All entry records are stored in the bucket array itself. 
    • When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.


    Advantage:
    Lookup speed is fast. But when there are many collisions, the speed will be decreased.
    So it's important to write a good hash function. In the same time of keeping load factor, reducing collision.


    Difference between Hash Table and Hash Map

    • Hash Table is synchronized, but Hash Map is not.
    • Hash Table inherits from Dictionary, but Hash Map inherits from Map.
    • Hash Table does not allow null key or null value. But Hash Map allows. It can have multiple keys maps to null value, but it can only have one null key. Also, it must use containsKey() to check whether one specific key exists. Because when it returns null when using get(). It could be either the key does not exist or the value of the key is null.
    • Hash Table uses Enumeration, but Hash Map uses Iterator.
    • In Hash Table, the default size of hash array is 11, and increased to 2*old +1 each time. In Hash Map, the default size is 16, and when resizing, the size doubles.
    • In Hash Table, it uses hashCode() directly, but in Hash Map, it recalculates the hash value and use & instead of mod.

    No comments:

    Post a Comment