Table of Contents
Union find also called a Disjoint Set Datastructure. It stores 1 or more disjoint sets of elements.
The simplest way is to make an array which looks like,
F(x) = parent(x)
In this array, The index i
is the item/node and the value at parent[i]
is it's parent. Initially, Each node points to itself.
To find parent of a node, we can do
find (node)
if node != parent[node]
return find(parent[node])
return node
To check if two nodes are in the same set, we can do
find(node1) == find(node2)
This'll recursively check if the two nodes are in the same set since they'll have the same parent.
If we want to join 2 nodes, we can do
pa = find(a)
pb = find(b)
if pa != pb
parent[b] = a;
Optimizations
Path Compression. By default, We look up the full path and trace the parent. This can take a lot of time and we can reduce this by compressing the path. In the
find
function, We have already done the work of looking up a node's parent. We can update the node's parent to point to it's main parent directly. In future, Lookups will only take 1 hop rather than jumping more times so the look up can go from 1->2->3->4->5 to just 1->5We can keep depth of each node when we join two sets. Currently, We simply join the second tree into the first one. If the second tree was much larger than the first tree than look ups into the nodes in second tree will not take 1 more hop.
To avoid this, We can keep depths of each node and then join second into first OR first into second depending on the one that has less depth.