Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible bug in immutable RRB tree implementation #31

Closed
jafingerhut opened this issue Oct 30, 2019 · 12 comments
Closed

Possible bug in immutable RRB tree implementation #31

jafingerhut opened this issue Oct 30, 2019 · 12 comments
Labels

Comments

@jafingerhut
Copy link
Contributor

jafingerhut commented Oct 30, 2019

The reproduction of the sequence I have below in Clojure, since it is adapting some Clojure test for another library, core.rrb-vector, that I found this behavior. Let me know if you have any questions or would prefer it written in Java.

user=> (def empty-vector (org.organicdesign.fp.collections.RrbTree/empty))
#'user/empty-vector
user=> (class empty-vector)
org.organicdesign.fp.collections.RrbTree$ImRrbt
(defn append-elems-to-vec [v s]
  (reduce (fn [v elem]
            (.append v elem))
          v
          s))
#'user/append-elems-to-vec
user=> (def v1 (append-elems-to-vec empty-vector (range 44)))
#'user/v1
user=> (class v1)
org.organicdesign.fp.collections.RrbTree$ImRrbt
user=> v1
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43)
user=> (def v2 (append-elems-to-vec empty-vector (range 45)))
#'user/v2
user=> (def v3 (.join v1 v2))
#'user/v3
user=> (class v3)
org.organicdesign.fp.collections.RrbTree$ImRrbt
user=> v3
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44)
user=> (def v4 (append-elems-to-vec v3 '(-1)))
#'user/v4
user=> v4
(-1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44)
user=> (.get v4 0)
-1

It appears that the element -1 is put at the beginning of the vector v4, whereas it should have been the last element.

This does not happen if I make v1 or v2 shorter, but it does also happen for some longer lengths of v1 and v2 as well.

I am using Paguro version 3.1.2

@GlenKPeterson
Copy link
Owner

GlenKPeterson commented Oct 30, 2019

I'm on holiday from my computer right now, but a quick look suggests that you have found a bug. After joining two vectors of sufficient (or specific) length, elements are appended to the beginning instead of the end. Oops!

The "join" implementation of Paguro's RRB-Tree is ugly, but has seemed to work until you found this. I was able to follow a paper for the rest of the implementation but I didn't understand their description of join, so hacked together my own. I've been meaning to simplify that algorithm, but life... Also, I couldn't actually find anything wrong with it until now, so had little inspiration to fix it.

In any case, I regret that it will probably be a month or more before I can work on this. If you wish to make a patch before then, you might:

  1. add a unit test that fails
  2. make a minimal-change fix to make it pass
  3. submit a pull request.
  4. I'll review it.

If you want to make bigger changes, it will be easier for me to build confidence in you with a smaller change before starting a major cleanup of the join method. But you may need to clean it in order to fix it. If so, please try to explain what you are doing so I can understand it in order to accept your patch, and be willing to have some discussion about your proposed changes. Still, minimizing changes and matching style for the first round or two would be a big help.

I can scan and send you my scribblings on the join method and/or talk to you on the phone about it if that helps. The authors of the original paper did not write clearly about the join method, but it wouldn't hurt to read that too, and/or look at the Scala implementation. Or maybe the Clojure implementation has a really clean join implementation?

@GlenKPeterson
Copy link
Owner

Thinking more: Sounds like the "focusStartIndex" is set to zero instead of size - 1 after a join of a certain length. Could that happen right here?
https://github.com/GlenKPeterson/Paguro/blob/master/src/main/java/org/organicdesign/fp/collections/RrbTree.java#L285
and here?
https://github.com/GlenKPeterson/Paguro/blob/master/src/main/java/org/organicdesign/fp/collections/RrbTree.java#L714

Possibly other places as well.

@jafingerhut
Copy link
Contributor Author

Thanks for the ideas of where to look. I may try seeing if I can find a fix for this issue, and if I come up with something that looks half-way reasonable, I'll submit a PR.

@jafingerhut
Copy link
Contributor Author

And I agree with your assessment of the authors of the RRB tree paper that they do not delve into the details and corner cases that seem to arise when implementing this data structure. I have yet to find a bug-free implementation of it, so you are in good company :-)

@jafingerhut
Copy link
Contributor Author

In an earlier comment above, you say: "I was able to follow a paper for the rest of the implementation but I didn't understand their description of join, so hacked together my own."

Do you happen to recall which paper you were following for this?

FYI, I am writing up some careful proofs of how to do split and join operations on B-trees for my own interest. It is still a work in progress, but in case you are curious: https://github.com/jafingerhut/core.btree-vector/blob/master/doc/intro.md

@jafingerhut
Copy link
Contributor Author

The document I linked to in my previous comment is now done, at least as far as describing algorithms for doing split and join operations on B-trees, with some examples.

@GlenKPeterson
Copy link
Owner

The paper was Phil Bagwell, Tiark Rompf, "RRB-Trees: Efficient Immutable Vectors", EPFL-REPORT-169879, September, 2011 [PDF] [SemanticScholar page]

I look forward to reading your document!

@raner
Copy link

raner commented Jun 18, 2021

@GlenKPeterson can you provide an update of the status of this bug? I see that both the join() and the without() method in RrbTree.ImRrbt are marked as deprecated for now. Unfortunately, this limits the use of the RrbTree.ImRrbt class to a few select use cases (e.g., building lists by adding one element at a time), and it cannot be used as an efficient general implementation for lists and vectors (which is what I'm after).
What is the guidance for now? Should developers use the PersistentVector that was adapted from Clojure, or is there some other alternative? Thanks :-)

@GlenKPeterson
Copy link
Owner

For now, yes, use the PersistentVector instead (StaticImports has a vec() method for making these briefly).

@jafingerhut indicated that he had not found a bug-free RRB-Tree and provided test cases that proves there are bugs in this one. Is there one now? He also provided a fix for one of these bugs, and a paper proving that joins were possible in Log(n) time, all of which were incredibly helpful.

What we are lacking now is a fix for the last known bug. There are two problems (aside from no-one having gotten this 100% right yet):

  1. The code complex. I suggest we remove the optimization where it builds a "Strict" PersistentVector tree when it can and just only let it build a tree of variable width nodes. Get that to work correctly and as simply as possible, then look at opportunities for reducing constant factors (Andy suggested mixing sub-nodes and counts in one array - a good idea). Or not. I mean, the PersistentVector part has no known bugs. I guess it just feels wrong to me to have this much complexity when it doesn't even work 100%.
  2. I implemented the simplest version of join() that I could think of - basically just make the appropriate number of parent nodes to stick the two trees together. The problem with that is that the right-most edge of a full tree allows partly-full nodes. If you take such a tree and just stick it onto the left side of a bigger tree, what was the right-most edge is now an inner edge, where partly-full nodes are illegal.

Andy feels I'm mistaken about this and I can't understand his explanation as to why. I also haven't been able to put forth the amount of mental effort to study how to resolve this problem.

Here are my notes explaining what I see as the issue. I'm using Andy's terms where "b" is the minimum size for an RRB-Tree node and "B" is the maximum. "S" is the strict node size, which is really just a special case of a Relaxed node. I should edit out S and Strict in my explanation, but I don't want to introduce any errors into the example, so I'm leaving it for now. Also, you can get the existing code to print out these example trees.

In this example b=3, S=4, and B=5. This makes narrow deep trees which show this problem more. What I'm showing here is just the left-hand tree on the left, then a => arrow, then how we might transform it so it would be a valid inner node of a larger tree (inner nodes have to be between b and B size).

Smallest valid tree of any kind: left.size = b:
Works:
[1 2 3] => Valid relaxed leaf node tree as is

left.size = S:
Works:
[1 2 3 4] => Valid strict leaf node tree as is

Left.size = B
Works:
Strict([1 2 3 4]  => [1 2 3 4 5] Valid relaxed leaf node tree
       [5])

B < left.size < b^2
ERROR: Not a valid left-hand tree for a join.
Strict([1 2 3 4] => Relaxed([1 2 3]
       [5 6])               [4 5 6]) is an incomplete node because it only contains 2 sub-trees (must contain 3-5 elements)!

B < left.size < b^2
ERROR: Not a valid left-hand tree for a join.
Strict([1 2 3 4] => Error - many possible combinations, all still incomplete
       [5 6 7])

B < left.size < b^2
ERROR: Not a valid left-hand tree for a join.
Strict([1 2 3 4] => Error - still incomplete
       [5 6 7 8])

left.size = b^2
Works:
Strict([1 2 3 4] => Relaxed([1 2 3]
       [5 6 7 8]            [4 5 6]
       [9])                 [7 8 9]) is valid because it has at least b nodes and each has at least b sub-nodes all the way down.

...

left.size = S
Works:
Strict([1 2 3 4] => Valid strict tree as is.
       [5 6 7 8]    
       [9 10 11 12]
       [13 14 15 16])

left.size > S
Works, but algorithm is more complicated because we're merging nodes at two levels.
Strict(Strict([1 2 3 4]      => Relaxed([1 2 3 4]
              [5 6 7 8]                 [5 6 7 8]
              [9 10 11 12]              [9 10 11 12]
              [13 14 15 16])            [13 14 15 16 17]) is valid
       Strict([17]))

Strict(Strict([1 2 3 4]      => Relaxed([1 2 3 4]
              [5 6 7 8]                 [5 6 7 8]
              [9 10 11 12]              [9 10 11 12]
              [13 14 15 16])            [13 14 15]
       Strict([17 18]))                 [16 17 18]) is valid

Strict(Strict([1 2 3 4]      => Relaxed([1 2 3 4]
              [5 6 7 8]                 [5 6 7 8]
              [9 10 11 12]              [9 10 11 12]
              [13 14 15 16])            [13 14 15 16]
       Strict([17 18 19]))              [17 18 19]) is valid

...
Strict(Strict([1 2 3 4]      => Relaxed([1 2 3 4]
              [5 6 7 8]                 [5 6 7 8]
              [9 10 11 12]              [9 10 11 12]
              [13 14 15 16])            [13 14 15 16]
       Strict([17 18 19 20]             [17 18 19 20 21]) is valid
              [21]))            Other solutions exist.

Even uglier - splits and merges at 2 levels:
Strict(Strict([1 2 3 4]      => Relaxed(Relaxed([1 2 3 4]
              [5 6 7 8]                         [5 6 7 8]
              [9 10 11 12]                      [9 10 11 12])
              [13 14 15 16])            Relaxed([13 14 15 16]
       Strict([17 18 19 20]                     [17 18 19]
              [21 22]))                         [20 21 22])) is valid
                                Other solutions exist.

I think if we can make the fewest adjustments at the highest levels we win. So I've tried to pick solutions that affect the fewest leaf nodes for the examples.

At least for B < left.size < b^2, we have to merge multiple tree fragments from the left into the tree on the right. I suppose we could just treat these as multiple merges. For left.size=6 we could "join" [4 5 6] to the right tree, then "join" [1 2 3]?

The max number of "micro-joins" it could take to complete a full join would then be b^2 - B. Does that change the Big O character of our algorithm? If S=32, b=22 and B=43, that's
22*22 - 43 =
484 - 43 =
441 operations.
Considering the maximum size of an integer allows a log base 32 of 7, I think this is very questionable. The "squared" in b^2 operations sounds like another Big O entirely. Though I guess if it happens the square root of the number of possible concatenations it could even itself out.

So I'm back to the zipper analogy. Trying to think of the right-fringe of the left tree and left-fringe of the right tree that we have to combine into 1 or 2 parent nodes at each level, generally leaving all but 2 leaf nodes alone. I think this is more O(log n) thinking.

@raner
Copy link

raner commented Jun 22, 2021

Thanks for this comprehensive and detailed explanation, @GlenKPeterson
I'm not sure I'll have the time to help out with this, but at least for my own use it's clear what to do now. I'll be using PersistentVector for the general case, but I think I can still use RrbTree.ImRrbt for a few select cases where I don't need joining of trees or removing of elements.

@GlenKPeterson
Copy link
Owner

Thanks so much @jafingerhut for all your help with this! I believe this issue is fixed now in release 3.7.0.

@GlenKPeterson
Copy link
Owner

I'm closing this as fixed. Please report any bugs. The current best release for the RRB-Tree is 3.7.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 participants