Delete multiple nodes from a binary search tree at O(n) time

Question

Consider a binary search tree, where all keys are unique. Nodes haven't parent pointers.
We have up to n/2 marked nodes.
I can delete all of them at O(n²) time ( using postorder traversal and when encounter a marked node delete each at O(n)). But it is inappropriate. I need an algorithm to delete all marked nodes at O(n) time. Thanks.

EDIT After deletion I need to have nodes order unchanged.
EDIT-2 So it should look like I have deleted each marked node using the typical deletion (finding the rightmost node at the left subtree and exchanging it with the node to delete).

What do you mean by "the nodes order unchanged"? Both the answers currently given assume that the normal left-right key order of a binary search tree should be preserved. Do you mean that no node should have another parent than before? This is impossible to fulfill in the case where a marked child has two unmarked children. — njlarsson, Commented May 9, 2012 at 6:08

njlarsson · Accepted Answer · 2012-05-08 15:41:05Z

There are many ways, but here is one that should be easy to get right, and give you a perfectly balanced tree as a side effect. It requires linear extra space, however.

Create an array of size N minus the number of marked nodes (or N, if you don't know how many are marked and don't want to check it).
Place the elements in order in the array, by inorder traversal, skipping the marked elements. This takes linear time, and stack space proportional to the height of the tree.
Rebuild the tree top-down using the array. The mid element in the array becomes the new root, the mid element of the left subarray its new left child, and the mid element of the right subarray its new right subchild. Repeat recursively. This takes linear time and logarithmic stack space.

Update: forgot to say skipping the marked elements, but that was obvious, right? ;)

Tom Klino · Accepted Answer · 2015-07-23 15:02:54Z

I have found how to do it!

We use an inorder traversal.
We check if node need to be deleted before recursive calls of the function.
When we find a node to delete we stand flag toFindMax and searching the rightmost node in the left subtree.
If we encounter another node to delete we push all our variables to the stack and pop them when the node was deleted.
When we find the maximum in the left subtree we save reference to it and to it's parent.
And then, when we recursively return to the initial node to delete, we delete it (exchange it with the maximum).

static class LinearDeletion {

    public static Node MIN_VALUE = new Node(Integer.MIN_VALUE);;
    boolean toFindMax = false;
    Node parentOfMax = null;
    Node max = MIN_VALUE;
    Stack<Object> stack = new Stack<>();

    public void perform(Node x, Node parent) {

        if (x.isToDelete) {
            stack.push(toFindMax);
            stack.push(parentOfMax);
            stack.push(max);

            toFindMax = true;
            parentOfMax = null;
            max = MIN_VALUE;

            if (x.left != null) {
                perform(x.left, x);
            }


            if (x.left == null) { //deletion of the node
                if (parent.left == x) {
                    parent.left = x.right;
                } else {
                    parent.right = x.right;
                }
            } else {
                if (x.right == null) {
                    if (parent.left == x) {
                        parent.left = x.left;
                    } else {
                        parent.right = x.left;
                    }
                } else {
                    x.key = max.key;
                    x.isToDelete = max.isToDelete;
                    if (parentOfMax != x) {
                        parentOfMax.right = max.left;
                    } else {
                        x.left = max.left;
                    }
                }
            } // end of deletion
            max = (Node) stack.pop();
            parentOfMax = (Node) stack.pop();
            toFindMax = (boolean) stack.pop();
            if (toFindMax) { // check if the current node is the maximum
                if (x.key > max.key) {
                    max = x;
                    parentOfMax = parent;
                }
            }

            if (x.right != null) {
                perform(x.right, x);
            }

        } else {

            if (x.left != null) {
                perform(x.left, x);
            }

            if (toFindMax) {
                if (x.key > max.key) {
                    max = x;
                    parentOfMax = parent;
                }
            }

            if (x.right != null) {
                perform(x.right, x);
            }
        }
    }
}

Ted Hopp · Accepted Answer · 2012-05-08 16:32:28Z

1

I don't see why a postorder traversal would be O(n²). The reason that deleting a single node is O(n) is that you need to traverse the tree to find the node, which is an O(n) operation. But once you find a node, it can be deleted in O(1) time.^* Thus, you can delete all O(n) marked nodes in a single traversal in O(n) time.

^* Unless you need to maintain a balanced tree. However, you don't list that as a requirement.

EDIT As @njlarsson correctly points out in his comment, a delete operation is not normally O(1) even after the node is found. However, since the left and right subtrees are being traversed before visiting a node to be deleted, the minimum (or maximum) elements of the subtrees can be obtained at no additional cost during the subtree traversals. This enables an O(1) deletion.

edited May 8, 2012 at 16:32

answered May 8, 2012 at 15:22

Ted Hopp

234k48 gold badges406 silver badges528 bronze badges

You cannot delete a node in O(1) time after finding it. If the node has two subtrees, you have to combine them into one. The standard way (in a non-balanced tree) is to exchange the node to delete for the maximum element in the left subtree (or minimum in the right subtree), but this requires going to the bottom of the tree, which could be linear in an unbalanced tree.
– njlarsson
Commented May 8, 2012 at 15:48
@njlarsson - Good point. The problem is easily fixed, I think. One can define the postorder traversal to return the minimum (or maximum) node. Then when a node is visited, the minimum in the right subtree (or the maximum in the left subtree) would be immediately available for free in case the node needed to be deleted.
– Ted Hopp
Commented May 8, 2012 at 16:28
Yes, that should work, but only if each node has a pointer to its parent, because you need to modify the parent of the maximum/minimum too. If it doesn't, that can be fixed some way too, but it gets increasingly complex.
– njlarsson
Commented May 8, 2012 at 16:35
@njlarsson - If the tree is represented by nodes and pointers without parent pointers, then just return a pointer to the parent of the maximum (or minimum) node. There are a couple of edge cases to consider, but it's pretty easy to use just that. With some other binary tree representations, this isn't even an issue.
– Ted Hopp
Commented May 8, 2012 at 18:22

Add a comment |

Collectives™ on Stack Overflow

Delete multiple nodes from a binary search tree at O(n) time

3 Answers 3

Not the answer you're looking for? Browse other questions tagged
algorithm
binary-search-tree
or ask your own question.

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Not the answer you're looking for? Browse other questions tagged algorithmbinary-search-tree or ask your own question.

Related

Not the answer you're looking for? Browse other questions tagged
algorithm
binary-search-tree
or ask your own question.