Saturday, April 9, 2016

HashTable and collisions

What is an good Hashfunction?

  • Evenly distributed
  • Easy to compute
  • Less collision

lambda (load factor) = N elements / T(table size)


How to Handle collisions?

Separate Chaining 
The idea is to keep a list of all elements that hash to the same value. Have a linked list and add the element at the beginning for the linked list.

Advantages -
  • Better space utilization for larger items
  • Simple collision handling - search in linked list
  • Overflow - Can store more elements in the hashtable than the number of of cells.

Open Addressing
Linear Probing  - In case of a collision problem to the next place until, we find an empty space.
Other techniques are Qudratic probing and double hashing.

Disadvantages

  • As long as the table is enough, a free cell can be found.
  • Worst even if the table is empty but blocks of occupied cells are formed - Primary Clustering
Acceptable if the Lambda < 0.5
Lambda here is average number of probe for all searches and insertions

Quadratic Probing - Similar to linear probing, instead the collision function f(i) = i^2. This helps to get away from primary clustering.

Double Hashing 
  • The second hash function should never be equated to zero





No comments:

Post a Comment