Grouping by color and/or symbol changes the order of error_y bars #1657

Open
@angiachino

Description

Reopening issues #762 and #1110: this is still happening in 2019, and I may have found the cause.
In brief, the error bars drawn by error_x and error_y are attached to the wrong data points when the data are grouped by color or (as I found) by symbol. Reusing @Cristoforetti's code from #1110:

library(plotly)

# X, Y and SD are identical, so each error bar should grow with its point
df<-data.frame("X"=c(1:20),
"Y"=c(1:20),
"SD"=c(1:20),
"G"=c(rep("A",3),rep("B",5),rep("C",4),rep("D",5),rep("A",3)),
"g"=c(rep("a",2),rep("b",5),rep("c",4),rep("d",5),rep("e",4))
)
# no grouping; error bars correct
p1<-plot_ly(df,
            x=~X,
            y=~Y,
            type="scatter",
            mode="markers",
            error_y =list(
              array=~SD,
              thickness=1
            )
)
# grouping by color; error bars wrong
p2<-plot_ly(df,
            x=~X,
            y=~Y,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
subplot(p1,p2)

[image: output of subplot(p1,p2)]
I found that the correct behaviour (error bars attached to the correct data points) can be restored by passing plot_ly a copy of the input data frame sorted by the color column:

# ordering the input data frame by the color column yields the correct behaviour
p3<-plot_ly(df[order(df$G),],
            x=~X,
            y=~Y,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
subplot(p1,p2,p3)

[image: output of subplot(p1,p2,p3)]
The same happens when grouping by symbol: the error bars are mismatched unless df is sorted by the symbol column. When using both color and symbol, df must be sorted by the color column first and the symbol column second:

# using "G" for color and "g" for symbols; ordering by G, then g yields the correct behaviour
p4<-plot_ly(df[order(df$G,df$g),],
            x=~X,
            y=~Y,
            symbol=~g,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
# ordering by g, then G yields the wrong behaviour
p5<-plot_ly(df[order(df$g,df$G),],
            x=~X,
            y=~Y,
            symbol=~g,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
subplot(p4,p5)

[image: output of subplot(p4,p5)]
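The sorting workaround used for p3 and p4 can be wrapped in a small helper. This is only a sketch: order_by_groups is a hypothetical name, not a plotly function, and it assumes the grouping columns are passed color column first, symbol column second, as in p4.

```r
# Hypothetical helper (not part of plotly): sort rows by the grouping
# columns so that every column, including SD, stays row-aligned.
order_by_groups <- function(data, ...) {
  cols <- c(...)                              # column names, highest priority first
  stopifnot(all(cols %in% names(data)))
  # Equivalent to order(data$G, data$g) when cols is c("G", "g")
  idx <- do.call(order, unname(as.list(data[cols])))
  data[idx, , drop = FALSE]
}

df <- data.frame(
  X  = 1:20, Y = 1:20, SD = 1:20,
  G  = c(rep("A", 3), rep("B", 5), rep("C", 4), rep("D", 5), rep("A", 3)),
  g  = c(rep("a", 2), rep("b", 5), rep("c", 4), rep("d", 5), rep("e", 4))
)

# Color column first, then symbol column, matching p4 above
sorted <- order_by_groups(df, "G", "g")
```

The result can then be passed to plot_ly in place of df[order(df$G,df$g),]. Because whole rows are moved, SD travels with its X and Y values.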

What seems to be happening is that color and symbol reorder the (copy of) df handled by plot_ly, but for some reason the reordering only affects the columns mapped to x, y, color and symbol. As a consequence, the column used by error_y stays in its original order, and the error bars end up attached to the wrong data points. Sorting df before calling plot_ly, or passing a sorted copy of it, works around the issue. Still, it would be great to see this fixed in a future version of plotly.
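To make the suspected mechanism concrete, here is a toy model in base R. It is not plotly's actual code; it simply assumes that x is regrouped by the color column while the error array is left in its original row order, which is my guess at what happens internally.

```r
# Toy model of the suspected mechanism, NOT plotly's real implementation.
df <- data.frame(
  X  = 1:20, Y = 1:20, SD = 1:20,
  G  = c(rep("A", 3), rep("B", 5), rep("C", 4), rep("D", 5), rep("A", 3))
)

# Assumption: x (and y) are regrouped by the color column ...
idx <- order(df$G)
x_regrouped <- df$X[idx]

# ... while the error array keeps the original row order.
sd_untouched <- df$SD

# SD equals X in this data, so any difference marks a misplaced error bar.
mismatched <- x_regrouped[x_regrouped != sd_untouched]

# If df is sorted beforehand, the regrouping becomes a no-op
# and the two vectors stay aligned.
sorted <- df[order(df$G), ]
still_aligned <- identical(sorted$X[order(sorted$G)], sorted$SD)
```

Under this assumption only the leading group "A" rows keep their own error bars, and pre-sorting removes the mismatch, which is consistent with the workaround above.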
