Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad interaction between GraphSearcher pooling in GraphIndexBuilder and GraphIndexBuilder closing GraphSearcher #272

Open
jkni opened this issue Apr 10, 2024 · 4 comments
Assignees

Comments

@jkni
Copy link
Collaborator

jkni commented Apr 10, 2024

GraphIndexBuilder uses explicit thread-locals to pool GraphSearchers. It closes these GraphSearchers, which closes their views, which causes active view tracking to malfunction for OnHeapGraphIndex.

If we update GraphIndexBuilder to not close these views, they stay active indefinitely. We need to manage these views correctly, ideally while retaining most of the benefits of pooling.

@jbellis
Copy link
Owner

jbellis commented Apr 11, 2024

If/when we add back concurrent removeDeletedNodes, this is the PR where it got removed: #273

@jkni jkni self-assigned this Apr 12, 2024
@jkni
Copy link
Collaborator Author

jkni commented Apr 15, 2024

The tension here is between the need to shorten the lifetime of OnHeapGraphIndex views and the GraphIndexBuilder pooling of GraphSearchers, which keep a view for their lifetime. We want GraphIndexBuilders to pool GraphSearchers to reduce allocations of associated data structure, but because the OHGI view is used as a reference point for concurrent deletions, it can't be retained indefinitely. Here are a few ideas I've tried, none of which are elegant:

  • Make GraphSearcher take a graph (instead of a view) and open a view per search. This works fairly cleanly, but the tricky part is that a search that may be resumed needs to be retained. This means we need an additional API surface to indicate to a GraphSearcher that we won't resume a search.
  • Make GraphSearchers re-openable. This allows us to close the GraphSearcher when done searching, like we do in GraphIndexBuilder. Then, if a GraphSearcher is used again, it can detect that it has been closed and get a new view (either through a new View method or taking a GraphSearcher like above). This is one way to do the above API surface.
  • Add a GraphSearcher constructor that allows the backing data structure to be passed in. This allows us to pool these structure in GIB instead of the actual searchers. This slightly reduces the effectiveness of pooling and leaks implementation details about the GraphSearcher, but it could be workable.

There are a couple ways to approach this, but none neatly resolve the tension of wanting to keep OHGI views short-lived and keep searchers pooled. We could just special-case something.

@jbellis
Copy link
Owner

jbellis commented Apr 15, 2024

cc @mdogan since I think you're potentially interested in having a concurrent removeDeletedNodes

@mdogan
Copy link
Collaborator

mdogan commented Apr 16, 2024

I think GraphSearcher taking a view parameter externally but closing it looks a bit off, asymmetric. If GraphSearcher closes the view then it should create it upfront, not expect it to be created outside.

So in this sense, 1st option sounds fine to me, making GraphSearcher to take a graph. We can just add a new flag (resumable?) for search method or have two different methods for resumable and non-resumable searches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants