Union-Find (aka Disjoint Set) Data Structure
If you are already not familiar with the Union-Find data structure, check out this simple tutorial to understand the data structure. The images help a lot.
Naive Implementation without Optimizations
The worst time complexity for find()
or union()
is O(n) since it may need to run through the whole chain of parents. So in the worst case, m operations can take O(mn) time.
Optimizations
Union-by-size
Modify union()
to make the worst-case time of Θ(log n).
Path compression
find()
has an amortized time nearly Θ(1)
With both Union-by-size + Path compression
Worst-case time complexity is Θ(m α(m, n)). for m>n operations, where n is the number of elements in the Union-Find tree.
Now, α(m, n) is constant when amortized over a big number of operations. So this turns the time complexity of m operations is just O(m) so each union or find operation is just O(1) on average!