Hash Maps

Terminology

load factor

The amount of data points vs the amount of storage. (data.len / storage.capacity)

key

A value that is hashable and is used to look up data. (The hash has to be consistent)

value

A value that is associated with a key.

collision

When 2 keys map to the same cell.

Hashing

Because a Map has a limited amount of space, hashing keys will most certainly end up in collisions of hashes.

The collision happens because the produced hash will have to be modularized so that it always points to a valid slot in the Map.

circle-info

The less collision hashes a hash function generates, the closest to O(1)O(1) the hashmap can stay for its operations.

There are several ways of dealing with collisions, for instace:

Backof

Drawing

Avoiding backof

A way to avoid the linear or exponencial backof when two hashes collide, is to make these colliding hashes to ocupy the same slot in the Map, instead of finding an unocuppied slot.

Meaning that a slot would consist of a Linked List or similar DS.

Drawing

Growing Maps Data Storage

As the Data Storage become close to full, the number of collisions increase, thus making the Map less efficient, no matter the strategy used for dealing with collisions.

circle-info

The bigger the load factor the less O(1)O(1) the Map is.

The ideal limit load factor is about 0.7. Above that value, the Data Storage should grow.

Growing the Data Storage means, re-hashing all the existing { key, value }.

Implementation

Last updated