BUG: IntervalIndex constructor inconsistencies #18421

Closed
@jschendel

Code Sample, a copy-pastable example if possible

  1. The IntervalIndex constructor ignores the closed parameter for purely NA data:
In [3]: pd.IntervalIndex([np.nan], closed='both')
Out[3]:
IntervalIndex([nan]
              closed='right',
              dtype='interval[float64]')

In [4]: pd.IntervalIndex([np.nan, np.nan], closed='neither')
Out[4]:
IntervalIndex([nan, nan]
              closed='right',
              dtype='interval[float64]')

This occurs only on master and appears to be an over-correction resulting from the fix in #18340.


  2. IntervalIndex also ignores closed when it conflicts with how the input data is closed:
In [6]: ivs = [pd.Interval(0, 1, closed='both'), pd.Interval(10, 20, closed='both')]

In [7]: pd.IntervalIndex(ivs, closed='neither')
Out[7]:
IntervalIndex([[0, 1], [10, 20]]
              closed='both',
              dtype='interval[int64]')

The behavior above occurs on master and is a result of #18340. Before that change the opposite happened: intervals were always coerced to match the closed value passed to the constructor. The constructor should probably raise when the closed value it is given conflicts with the closed value inferred from the data.


  3. Inconsistent dtype for empty IntervalIndex depending on the method of construction:
In [2]: pd.IntervalIndex([]).dtype
Out[2]: interval[object]

In [3]: pd.IntervalIndex.from_intervals([]).dtype
Out[3]: interval[object]

In [4]: pd.IntervalIndex.from_breaks([]).dtype
Out[4]: interval[float64]

In [5]: pd.IntervalIndex.from_tuples([]).dtype
Out[5]: interval[float64]

In [6]: pd.IntervalIndex.from_arrays([], []).dtype
Out[6]: interval[float64]

Expected Output

  1. IntervalIndex constructor should not ignore the closed parameter for purely NA data (since it can't infer closed from the input data).

  2. IntervalIndex should raise when given conflicting closed vs. inferred closed from data.

  3. IntervalIndex should have the same dtype for empty data regardless of the method of construction. It's not immediately clear to me which dtype should be used, but my feeling is interval[object] since that's the behavior of the constructor/from_intervals.
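
To make points 1 and 2 concrete, here is a minimal pure-Python sketch of the resolution rule they describe. It is not the pandas implementation: `Interval` is a hypothetical stand-in for `pd.Interval`, `resolve_closed` and `is_na` are hypothetical helper names, the error messages are made up for illustration, and falling back to `'right'` when nothing is specified or inferable is an assumption based on the default shown in the outputs above.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

VALID_CLOSED = {"left", "right", "both", "neither"}


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for pd.Interval; only `closed` matters here."""
    left: float
    right: float
    closed: str = "right"


def is_na(value) -> bool:
    # Treat None and float NaN as missing values.
    return value is None or (isinstance(value, float) and math.isnan(value))


def resolve_closed(data: Iterable, closed: Optional[str] = None) -> str:
    """Decide which `closed` an index built from `data` should carry."""
    if closed is not None and closed not in VALID_CLOSED:
        raise ValueError(f"invalid option for 'closed': {closed!r}")

    # Infer from the non-NA elements only; NA carries no closed information.
    inferred = {iv.closed for iv in data if not is_na(iv)}
    if len(inferred) > 1:
        raise ValueError("intervals must all be closed on the same side")

    if not inferred:
        # Point 1: purely NA (or empty) data, so honor the explicit argument.
        # Assumption: fall back to 'right' when no argument was given.
        return closed if closed is not None else "right"

    (inferred_closed,) = inferred
    if closed is not None and closed != inferred_closed:
        # Point 2: explicit and inferred values disagree, so raise
        # rather than silently picking one of them.
        raise ValueError(
            f"conflicting values for closed: constructor got {closed!r}, "
            f"inferred from data {inferred_closed!r}"
        )
    return inferred_closed
```

Under this rule, `resolve_closed([float("nan")], closed="both")` returns `"both"`, and passing the `ivs` list from the second example with `closed="neither"` raises `ValueError` instead of silently keeping `"both"`.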
