BUG: plotting with DatetimeIndex containing NaT #12405

Closed
@jorisvandenbossche

xref #8914

In [1]: %matplotlib
Using matplotlib backend: Qt4Agg

In [2]: df = pd.DataFrame({'date': pd.date_range('2016-01-01', periods=5), 'vals': range(5)})

In [3]: df.loc[2, 'date'] = np.nan

In [4]: s = df.set_index('date')['vals']

In [5]: s
Out[5]:
date
2016-01-01    0
2016-01-02    1
NaT           2
2016-01-04    3
2016-01-05    4
Name: vals, dtype: int64

In [6]: ax = s.plot()

In [11]: ax.get_lines()[0].get_data()
Out[11]:
(array([datetime.datetime(2016, 1, 1, 0, 0),
        datetime.datetime(2016, 1, 2, 0, 0),
        datetime.datetime(2016, 1, 4, 0, 0),
        datetime.datetime(2016, 1, 5, 0, 0), NaT], dtype=object),
 array([0, 1, 3, 4, 2], dtype=int64))

In [44]: ax.get_lines()[0].get_xydata()
Out[44]:
array([[  7.35964000e+05,   0.00000000e+00],
       [  7.35965000e+05,   1.00000000e+00],
       [  7.35967000e+05,   3.00000000e+00],
       [  7.35968000e+05,   4.00000000e+00],
       [  6.12411009e+05,   2.00000000e+00]])

So this gives you a plot in which one of the values is drawn in September 1677, i.e. at the minimum possible Timestamp.

This is clearly incorrect, and two separate things go wrong:

  • NaT gets converted to Timestamp.min
  • the order of the values is changed, as the NaT is put at the end
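
The first point can be sanity-checked from the `get_xydata()` output alone. The sketch below decodes the x value of the NaT row using only the standard library. It assumes the float is a matplotlib date number in the epoch in use at the time, where the integer part is a proleptic Gregorian ordinal and the fractional part is the time of day; the variable names are illustrative only.

```python
from datetime import datetime, timedelta

# x value reported for the NaT row by get_xydata() in the session above.
x_nat = 6.12411009e+05

# Assumption: integer part = proleptic Gregorian ordinal (days since 0001-01-01, plus one),
# fractional part = fraction of a day.
days, frac = divmod(x_nat, 1)
decoded = datetime.fromordinal(int(days)) + timedelta(days=frac)

# Sanity check on the epoch assumption: the first row (735964.0) should be 2016-01-01.
first = datetime.fromordinal(735964)

print(first.date(), decoded)
```

If the epoch assumption holds, the first row decodes back to 2016-01-01 and the NaT row lands in 1677, close to the lower bound of the nanosecond-resolution Timestamp range.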

There is a related issue about plt.plot(pd.NaT) raising an error (#9253), but here it is pandas code that produces the wrong result, so we have more control over it. I also find it strange that this case does not raise as the pd.NaT case does, but instead converts the NaT to Timestamp.min (I have not looked into this in detail).
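
Until this is handled inside pandas, one possible workaround on the user side is to drop the entries whose index label is NaT before plotting. This is only a sketch of that idea, not the fix for this issue; it rebuilds the series from the session above and stops short of calling `.plot()`.

```python
import pandas as pd

# Rebuild the series from the report: five daily dates, the third replaced by NaT.
df = pd.DataFrame({"date": pd.date_range("2016-01-01", periods=5), "vals": range(5)})
df.loc[2, "date"] = pd.NaT
s = df.set_index("date")["vals"]

# Keep only the rows whose index label is a real timestamp.
# Index.notna() returns a boolean mask; the original row order is preserved.
clean = s[s.index.notna()]

print(clean)
# clean.plot() would then receive an index without NaT.
```

Note that this removes the point entirely instead of leaving a gap in the line, so it changes what is drawn; whether that is acceptable depends on the data.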
