
Empty Facets on partially filled MultiIndex DataFrame with datetime.date in bar plot #4415

Open
@jonashoechst

Description


I have a pandas DataFrame with a three-level MultiIndex that I want to plot with plotly.express.bar. The first level, Station (str), is used for the facets; the second level, Night (datetime.date), goes on the x-axis; and the third level, Limit (int), defines the stacked bars.

The bug is that the resulting figure shows no data in facets that have data for only one Night. The expected outcome is that the data is displayed even when it exists for only a single date.

A reproducible example looks like this:

import datetime
import pandas as pd
import plotly.express as px

df = pd.DataFrame([
    ["station-01", datetime.date(2023,10,1), 50, 0.1],
    ["station-01", datetime.date(2023,10,1), 100, 0.15],
    ["station-01", datetime.date(2023,10,2), 50, 0.2],
    ["station-01", datetime.date(2023,10,2), 100, 0.22],
    ["station-01", datetime.date(2023,10,3), 50, 0.05],
    ["station-01", datetime.date(2023,10,3), 100, 0.02],
    ["station-02", datetime.date(2023,10,1), 50, 0.5],
    ["station-02", datetime.date(2023,10,1), 100, 0.2],
    ["station-03", datetime.date(2023,10,1), 50, 0.5],
    ["station-03", datetime.date(2023,10,1), 100, 0.5],
], columns=["Station", "Night", "Limit", "Relative Duration"])
df = df.set_index(["Station", "Night", "Limit"])

px.bar(
    df,
    x=df.index.get_level_values("Night"),
    y="Relative Duration",
    range_y=[0, 1],
    color=df.index.get_level_values("Limit"),
    facet_col=df.index.get_level_values("Station"),
)

The figure created looks like this:
[Figure: newplot (2)]
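To make it easier to see which facets should be affected, the stations that have data for only one Night can be listed directly from the DataFrame. This is a minimal sketch using only pandas and the same data as the example above; the variable names `flat`, `nights_per_station`, and `single_night_stations` are introduced here for illustration and are not part of the original report.

```python
import datetime
import pandas as pd

# Same data as in the reproducible example above.
df = pd.DataFrame([
    ["station-01", datetime.date(2023, 10, 1), 50, 0.1],
    ["station-01", datetime.date(2023, 10, 1), 100, 0.15],
    ["station-01", datetime.date(2023, 10, 2), 50, 0.2],
    ["station-01", datetime.date(2023, 10, 2), 100, 0.22],
    ["station-01", datetime.date(2023, 10, 3), 50, 0.05],
    ["station-01", datetime.date(2023, 10, 3), 100, 0.02],
    ["station-02", datetime.date(2023, 10, 1), 50, 0.5],
    ["station-02", datetime.date(2023, 10, 1), 100, 0.2],
    ["station-03", datetime.date(2023, 10, 1), 50, 0.5],
    ["station-03", datetime.date(2023, 10, 1), 100, 0.5],
], columns=["Station", "Night", "Limit", "Relative Duration"])
df = df.set_index(["Station", "Night", "Limit"])

# Move the index levels back into columns so they can be grouped.
flat = df.reset_index()

# Count the distinct Night values for each Station.
nights_per_station = flat.groupby("Station")["Night"].nunique()

# Stations with exactly one Night correspond to the facets that appear empty.
single_night_stations = nights_per_station[nights_per_station == 1].index.tolist()
print(single_night_stations)
```

For this data the list contains station-02 and station-03, which matches the facets described as empty in the report.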

When the datetime.date values are replaced by strings, the figure looks as expected:

import datetime
import pandas as pd
import plotly.express as px

df = pd.DataFrame([
    ["station-01", "101", 50, 0.1],
    ["station-01", "101", 100, 0.15],
    ["station-01", "102", 50, 0.2],
    ["station-01", "102", 100, 0.22],
    ["station-01", "103", 50, 0.05],
    ["station-01", "103", 100, 0.02],
    ["station-02", "101", 50, 0.5],
    ["station-02", "101", 100, 0.2],
    ["station-03", "101", 50, 0.5],
    ["station-03", "101", 100, 0.3],
], columns=["Station", "Night", "Limit", "Relative Duration"])
df = df.set_index(["Station", "Night", "Limit"])

px.bar(
    df,
    x=df.index.get_level_values("Night"),
    y="Relative Duration",
    range_y=[0, 1],
    color=df.index.get_level_values("Limit"),
    facet_col=df.index.get_level_values("Station"),
)

[Figure: newplot (3)]

Because of this behavior, I suspect that the plot produced from the initial DataFrame is wrong and that there is a bug in the plotly.express code.

Thanks for helping out.


Labels: P3 (backlog), bug (something broken), sev-2 (serious problem)
