Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for updating point metadata without recreating coordinate index #174

Closed
5 tasks done
TheStanian opened this issue Apr 11, 2024 · 3 comments · Fixed by #178
Closed
5 tasks done

Support for updating point metadata without recreating coordinate index #174

TheStanian opened this issue Apr 11, 2024 · 3 comments · Fixed by #178
Labels
improvement Feature improvement or enhancement

Comments

@TheStanian
Copy link

TheStanian commented Apr 11, 2024

  • Check for duplicate issues.
  • Describe the feature's goal, motivating use cases, its expected behavior, and impact.
  • If you are proposing a new API, please explain how the current APIs fail to support your goal.
  • If you are proposing to change an existing API, please explain if your proposed is backward compatible or incompatible.
  • If applicable, include screenshots, GIFs, videos, or working examples.

Right now, the only way of updating the point metadata (valueA and valueB, or whichever alias you prefer) is to do a full draw call including coordinate data. I have a specific application in which only the color indices for the points would change, leading to the rebuilding of the KDBush, which is in this case a major unnecessary slowdown (10m+ points). I would like to see a way to change only the point metadata, so the rebuild doesn't need to occur. This could for example take the form of taking the existing state texture management as used in setPoints and exposing that on its own, using the currently available points. Doing so would not break backwards compatibility with anything whatsoever.

Thank you for your consideration and with kind regards,

Stan

@TheStanian TheStanian changed the title Add support for updating point metadata without recreating coordinate index Apr 11, 2024
@flekschas flekschas added the improvement Feature improvement or enhancement label Apr 11, 2024
@flekschas
Copy link
Owner

This is a great idea and certainly a feature I'd be happy to add. Technically I think there are a few ways to achieve this and they might all be useful to have on their own:

  1. We could allow users to directly pass the state texture to publicDraw(). This would be useful on its own when one wants to avoid the main thread having to handle state data at all.
  2. We could allow users to pass the kd tree index to publicDraw() (this needs updating of kdush, which I already did in feat: density-based point size #143). This would be useful on its own because for static data one could pre-compute the index.
  3. We should update publicDraw() to accept partial point data (i.e., only some components) and then have that function skip unnecessary updates. Internally we can, for instance, update the state texture using regl's buffer.subdata(). The biggest hurdle is that the draw function relies on row-major data (see toArrayOrientedPoints()). Refactoring this would generally be good.

What about the following non-breaking API changes to scatterplot.draw()?

type Draw = (
  data:
    | number[][]
    | Partial<{ x: number[] | Float32Array, y: number: [] | Float32Array, ... }>
    | Partial<DrawData>,
  options: Partial<{
    showPointConnectionsOnce: boolean,
    transition: boolean,
    transitionDuration: number,
    transitionEasing: string,
    preventFilterReset: boolean,
    hover: number,
    select: number[],
    filter: number[],
    zDataType: 'continuous' | 'categorical',
    wDataType: 'continuous' | 'categorical',
    // NEW
    kdbushIndex: ArrayBuffer
  }>
) => Promise<void>;

interface DrawData {
  points: Partial<Points> | Float32Array,
  connections: number[][],
}

interface Points {
  x: number[] | Float32Array,
  x: number[] | Float32Array,
  z: number[] | Float32Array,
  w: number[] | Float32Array,
}
@TheStanian
Copy link
Author

In terms of API changes, your Points interface has a double declaration of x, and could maybe be used in your type definition for Draw? Other than that it looks good to me.

I especially like option 3., as I feel leaning into structs of arrays might be worthwhile for further optimisations down the road (e.g. reducing temporary memory allocation by using typed arrays under the hood) while also avoiding to have to expose the casual user to internals (state texture, kd bush index) too much.

@flekschas
Copy link
Owner

flekschas commented Jun 1, 2024

@TheStanian It took a bit but I finally managed to support skipping the spatial index computation. See #178 for an example.

While I haven't had time to implement drawing of partial data, you can now precompute the spatial index (in a worker or main thread) and pass that index in with the draw call: scatterplot.draw(points, { spatialIndex }). In the PR I'm showing a test I ran in Firefox where updating the Z/W coordinates for one million points is sped up 6x by skipping the re-indexing.

I'm still planning to add drawing of partial data (e.g., scatterplot.draw({ z: [...] })) in which case the spatial indexing can be skipped automatically. But that's for another PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Feature improvement or enhancement
2 participants