2
$\begingroup$

I want to design a binary tree with preallocated nodes, in order to avoid calling malloc/free every time I want to insert/delete a node. The problem is I don't know ahead of time how many nodes the tree will have, so I am thinking I need to allocate a block of nodes at a time, then allocate another block when the first gets used up, etc.

As far as I can tell, I need 3 data structures to accomplish this, but I'm hoping someone can recommend a simpler, more elegant method.

The three data structures I am thinking of are:

  1. Binary tree
  2. Stack (to store the preallocated nodes and easily find the next one to use)
  3. Linked list (to store the different allocated node blocks so they can be located and freed eventually).

The way these would work is:

Initialization

  1. Allocate one block of N nodes
  2. Push each node in the block onto the Stack
  3. Append block to Linked List.

Tree Insert

  1. If stack is empty, allocate another block of N nodes, push them onto the Stack and append it to Linked List
  2. Pop node from stack and store tree node data in it
  3. Add node to tree structure, do balancing, etc

Tree Delete

  1. Find item in tree and remove from tree
  2. Push node back onto Stack to be used later for insert

Cleanup

  1. Destroy tree
  2. Traverse Linked List and delete all node blocks

Using a Stack to store all the preallocated nodes seems like a good idea, since insert/delete operations would be $O(1)$. However, each time a new block of N nodes needs to be allocated, I need to push them all onto the Stack $(O(N))$ and insert it into the Linked List $(O(1))$. Finally, cleanup at the end requires traversing the Linked List $(O(NB))$ where $NB$ is the number of node blocks allocated.

It seems to me I should able to do this with less complexity (maybe 2 data structures instead of 3). Does anyone know of a more elegant method?

$\endgroup$

1 Answer 1

0
$\begingroup$

What you are doing is called (simple) "object pooling", a topic that has been discussed numerous times. It can be said to have nothing to do binary trees.

The simplest and yet efficient approach is to use a single stack or list, just like the C++ vector class or Java ArrayList, so that you do not have to maintain the "linked list" to keep track of the different allocated node blocks. Initially a block of $N$ nodes is allocated. Use a tail pointer to remember the first free node. Whenever it runs out of free nodes, just allocate $2N$ nodes or $cN$ nodes for a factor $c>1$ of your choice. Memory copy the existing nodes to the new block and free the previous block. Do not worry about the performance of copying nodes, which is just one (system) call that should have been optimized by your language runtime and operating system.

In general, do not write these kinds of utilities from scratch by yourself unless as an exercise to broaden your skill set or enhance your understanding. For example, there is an object pool in boost library if you can use C++. Also, do not do it just for the sake of performance unless you have identified node allocation and deallocation is the performance bottleneck. The recent c/c++ runtime might be fast enough for you on object allocation and deallocation.

I would recommend that you search "object pooling" on the web or, more directly, inside github.com with your choice of language, C if you want to dive further. You will be able to find many articles and many working implementations, (for example, simple object pool.

Finally, in many cases, especially in many programming contests, all you need to do is to preallocate a huge number of nodes like a few millions of them upfront. Then just use them without deallocation. Simple, stupid and wonderful.

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.