Description
Reopening issues #762 and #1110: this is still happening in 2019, and I may have found the cause.
Briefly, the error bars drawn by error_x and error_y are attached to the wrong data points when the data are grouped by color or (as I found) by symbol. Recycling @Cristoforetti's code from #1110:
library(plotly)
df<-data.frame("X"=c(1:20),
"Y"=c(1:20),
"SD"=c(1:20),
"G"=c(rep("A",3),rep("B",5),rep("C",4),rep("D",5),rep("A",3)),
"g"=c(rep("a",2),rep("b",5),rep("c",4),rep("d",5),rep("e",4))
)
# no grouping; error bars correct
p1<-plot_ly(df,
x=~X,
y=~Y,
type="scatter",
mode="markers",
error_y =list(
array=~SD,
thickness=1
)
)
# grouping by color; error bars wrong
p2<-plot_ly(df,
x=~X,
y=~Y,
color=~G,
type="scatter",
mode="markers",
error_y =list(array=~SD,
thickness=1
)
)
subplot(p1,p2)
I found that the correct behaviour (error bars attached to the correct data points) can be restored by passing plot_ly a copy of the input data frame sorted with order() on the color column:
# ordering the input data frame by the color column yields the correct behaviour
p3<-plot_ly(df[order(df$G),],
x=~X,
y=~Y,
color=~G,
type="scatter",
mode="markers",
error_y =list(array=~SD,
thickness=1
)
)
subplot(p1,p2,p3)
The same happens when grouping by symbol: the error bars end up on the wrong points unless df is sorted on the symbol column. When using both color and symbol, df must be sorted on the color column first and the symbol column second:
# using "G" for color and "g" for symbols; ordering by G, then g yields the correct behaviour
p4<-plot_ly(df[order(df$G,df$g),],
x=~X,
y=~Y,
symbol=~g,
color=~G,
type="scatter",
mode="markers",
error_y =list(array=~SD,
thickness=1
)
)
# ordering by g, then G yields the wrong behaviour
p5<-plot_ly(df[order(df$g,df$G),],
x=~X,
y=~Y,
symbol=~g,
color=~G,
type="scatter",
mode="markers",
error_y =list(array=~SD,
thickness=1
)
)
subplot(p4,p5)
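The pre-sorting workaround used for p3 and p4 could be wrapped in a small helper so the sort keys are listed once, in the same order as in the working example (color column first, then symbol column). This is only a sketch in base R: `sort_by_groups` is a hypothetical name, not a plotly function, and it assumes that sorting on the grouping columns is all that is needed.

```r
# Same example data as at the top of this report.
df <- data.frame(
  X  = 1:20,
  Y  = 1:20,
  SD = 1:20,
  G  = c(rep("A", 3), rep("B", 5), rep("C", 4), rep("D", 5), rep("A", 3)),
  g  = c(rep("a", 2), rep("b", 5), rep("c", 4), rep("d", 5), rep("e", 4))
)

# Hypothetical helper (not part of plotly): sort `data` on the columns named
# in `cols`, using the first name as the primary key.
sort_by_groups <- function(data, cols) {
  stopifnot(all(cols %in% names(data)))
  # unname() so that column names are never mistaken for arguments of order()
  keys <- unname(as.list(data[cols]))
  data[do.call(order, keys), , drop = FALSE]
}

# Color column first, then symbol column, as for p4
df_sorted <- sort_by_groups(df, c("G", "g"))
```

The result can then be passed to plot_ly in place of `df[order(df$G,df$g),]`.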
What seems to be happening is that color and symbol reorder the copy of df handled by plot_ly, but for some reason the reordering only affects the columns mapped to x, y, color and symbol. As a consequence, the column used by error_y stays in its original order, and error bars are attached to the wrong data points. Sorting df before calling plot_ly, or passing a sorted copy of it, works around the issue. Still, it would be great to see this fixed in a future version of plotly.
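To make the suspected mechanism concrete, here is a minimal base R sketch of it. It does not use or reproduce plotly's internal code; it only assumes, as guessed above, that x and y are rearranged so that rows of the same group are contiguous while the error array is left in its original row order.

```r
# Same example data as at the top of this report (only the columns needed here).
df <- data.frame(
  X  = 1:20,
  Y  = 1:20,
  SD = 1:20,
  G  = c(rep("A", 3), rep("B", 5), rep("C", 4), rep("D", 5), rep("A", 3))
)

# Assumption: grouping by color reorders x (and y) group by group...
idx <- order(df$G)
x_reordered <- df$X[idx]

# ...but the error array keeps the original row order.
sd_stale <- df$SD

# In this data set SD equals X, so any position where the two differ is a
# point that would be drawn with another point's error bar.
mismatch_unsorted <- mean(x_reordered != sd_stale)

# If the data frame is sorted by G beforehand, the regrouping changes
# nothing and every error bar stays with its own point.
df_sorted <- df[order(df$G), ]
idx_sorted <- order(df_sorted$G)
mismatch_sorted <- mean(df_sorted$X[idx_sorted] != df_sorted$SD)
```

Under this assumption most points in the unsorted case get the wrong error bar, while the pre-sorted case has no mismatches, which is consistent with what p2 and p3 show.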