I'm studying CLRS section 21.3, which introduces a union rank + path-compressed implementation of disjoint set. The implementations of MAKE-SET, UNION, LINK, and FIND-SET on p 571 of the book all work with nodes themselves, where each node has a parent
pointer and rank
.
Exercise 4 in that section claims that if you wanted to traverse all the nodes in the set (e.g. to print them) you only need a single extra attribute for each node, and the book claims that you can implement this without sacrificing the asymptotic complexity of union rank + path compression.
The data structure that immediately comes to mind is a circular list with single (not double) links. In this case, the "single extra attribute" the exercise asks for would be the next
pointer of each node. We could just maintain that list in parallel with the disjoint-forest + union rank + path-compression.
However, the problem is that when trying to merge two sets, you would also need to merge their circular lists, and for a singly linked circular list, I'm pretty sure you cannot combine them into one without traversing through one of them first. This traversing would certainly destroy the asymptotic efficiency of union rank + path compression here.
Of course, you could slightly augment the circular list by keeping track of both its head and the node before the head, i.e. the tail. However, in order to do this, you can no longer work directly with the nodes, but would need to maintain a circ-list
object (one for each set) with head
and tail
pointers. This would also mean that you can no longer call UNION on two nodes directly, but rather need to maintain a set
object that has a pointer to its circ-list
object as well as its root
. All this seems to be somewhat more than the "exactly one extra attribute for each node`, since you also need O(1) extra attributes for each set.
Is what the exercise asking for actually possible, or do I need to make one of the concessions above?