fix(insideOut.js): fix the bug that stack orders do not show by onset… #126

Closed
wants to merge 1 commit into from


ZijingPeng

… time when it is a ThemeRiver

After reading the paper referenced in the d3-shape API docs, I think there is a problem in d3-shape/src/order/insideOut.js. As I understand the paper, “inside-out” is an ordering that first sorts the layers by onset time and then adds them alternately to the beginning and end of a layer list. Plain alternation has a drawback: it can produce an asymmetric pattern in which the top of the streamgraph is much larger than the bottom. In that case, the sum of each layer can be used as a weight to decide whether the layer goes on the top or the bottom. I noticed that your code sorts only by the sum of each layer and never sorts the series by onset time first, a step I think is significant and shouldn’t be left out. I have modified your source code to add sorting by onset time.

The following is the relevant part of the paper Stacked Graphs – Geometry & Aesthetics by Lee Byron & Martin Wattenberg:

One might consider sorting the data set by “onset time”. If the “new” layers are always added along the top, the graph takes on a distracting downward diagonal stripe pattern in addition to an upward angle to the overall silhouette due to the layout algorithm’s effort to keep the sum of slopes low (fig 13).

To prevent this, layers are given a “inside-out” ordering, in which early-onset time series are placed at the middle, with later-onset series at the top and bottom. This has three benefits in addition to avoiding the diagonal-stripe effect. First, it places the biggest bursts in the layers—the first non-zero value—at the outside the graph, where they will disrupt the layout of other layers the least, drastically improving legibility, design issues (A-C). Second, we speculate that the top and bottom regions of the graph tend to be most prominent areas, since they occur near the high-contrast silhouette. The central “core” of the graph, the middle, may be read secondarily. Since the bursts are the most “interesting” part of the data in many cases, the inside-out layout places them in the potentially prominent position (fig 14). Third, it prevents a drift of the layout away from the x-axis, an artifact that can be seen dramatically in fig 13.

The particular inside-out ordering is defined as follows. Note that one easy method would be simply to sort the layers by onset time, and then add layers alternately to the beginning and end of a layer list.

Unfortunately, this simple method could potentially lead to a highly asymmetric graph if the layers that end up at the beginning of the list represent much larger values than the ones at the end. To prevent this asymmetry, we use the following algorithm in ordering the layers. First, we define the “weight” of a time series as the sum of all its values. Then after sorting by onset time, we add time series to the list one by one, attempting to “even out” the weight between the top and bottom half: more precisely, if the sum of the weight of the first half of the current list is greater than half the total weight, we add the series the end; otherwise, we add to the beginning.
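The quoted steps can be sketched in plain JavaScript as follows. This is a minimal illustration under stated assumptions, not the d3-shape implementation and not the patch in this PR. The names `weight`, `onset`, and `insideOutOrder` are hypothetical. It assumes each layer is an array of non-negative numbers, that onset time is the index of the first non-zero value, and that “the first half of the current list” means the layers placed at the beginning so far, so the paper's test reduces to comparing the weights of the two halves.

```javascript
// Weight of a layer: the sum of all its values (as defined in the quoted paper).
function weight(layer) {
  return layer.reduce((sum, v) => sum + v, 0);
}

// Onset time (assumption): index of the first non-zero value.
// An all-zero layer gets layer.length so that it sorts last.
function onset(layer) {
  const i = layer.findIndex((v) => v !== 0);
  return i === -1 ? layer.length : i;
}

// Returns layer indices ordered from the beginning to the end of the list,
// with early-onset layers in the middle and later ones toward the outside.
function insideOutOrder(layers) {
  // Step 1: sort layer indices by onset time.
  const byOnset = layers
    .map((_, i) => i)
    .sort((a, b) => onset(layers[a]) - onset(layers[b]));

  // Step 2: add layers one by one to whichever half is currently lighter.
  const beginning = []; // stored middle-first, reversed at the end
  const end = [];
  let beginningWeight = 0;
  let endWeight = 0;
  for (const i of byOnset) {
    const w = weight(layers[i]);
    if (beginningWeight > endWeight) {
      end.push(i);
      endWeight += w;
    } else {
      // Ties go to the beginning (an arbitrary choice in this sketch).
      beginning.push(i);
      beginningWeight += w;
    }
  }
  return beginning.reverse().concat(end);
}

// Usage: four layers given in arbitrary order.
const layers = [
  [0, 0, 2, 2], // index 0: onset 2, weight 4
  [1, 1, 1, 1], // index 1: onset 0, weight 4
  [0, 0, 0, 3], // index 2: onset 3, weight 3
  [0, 5, 5, 5], // index 3: onset 1, weight 15
];
console.log(insideOutOrder(layers)); // [ 2, 0, 1, 3 ]
```

In the usage example the heavy layer (index 3) ends up alone on one side, while the three lighter layers are stacked on the other side with the earliest-onset layer (index 1) closest to the middle.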

You can also read the paper on its website: Stacked Graphs – Geometry & Aesthetics

@mbostock
Member

Related #106.

@mbostock closed this in a9a2b52 on Jan 24, 2019