Description
Code Sample, a copy-pastable example if possible
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.cut.html
import pandas as pd
bins = pd.IntervalIndex.from_tuples([(0, 1), (2, 3), (4, 5)])
pd.cut([0, 0.5, 1.5, 2.5, 4.5], bins)
[NaN, (0, 1], NaN, (2, 3], (4, 5]]
Categories (3, interval[int64]): [(0, 1] < (2, 3] < (4, 5]]
Problem description
The example in the IntervalIndex section of the pd.cut docs does not reflect how pd.cut actually behaves: the bins in the example above leave gaps, so many inputs end up as NaN in the result. In that example, intermediate values such as 2 and 4 are NOT included in any bin, so any 2 or 4 in the data becomes NaN after cutting the attribute with pd.cut.
I assume that in 99% of cases a user who calls cut to bucketize a value range wants every value in that range to fall into some bin. Followed as written, this usage example can lead to data loss.
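To make the gap problem concrete, here is a minimal sketch that reuses the bins from the docs example. The input values are my own choice for illustration, not taken from the docs; they sit on or between the bin edges.

```python
import pandas as pd

# Bins copied from the docs example: (0, 1], (2, 3], (4, 5].
gapped = pd.IntervalIndex.from_tuples([(0, 1), (2, 3), (4, 5)])

# Illustrative values (an assumption, not from the docs).
values = [1, 1.5, 2, 3, 4, 5]
result = pd.cut(values, gapped)

# 1.5, 2 and 4 fall into the gaps (1, 2] and (3, 4], so they become NaN.
missing_mask = result.isna()
n_missing = int(missing_mask.sum())
print(result)
print(n_missing)
```

Half of these inputs are silently dropped, which is the data loss described above.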
The example should instead be along these lines:
bins = pd.IntervalIndex.from_tuples([(0, 1), (1, 3), (3, 5)])
resulting in the following bins:
Categories (3, interval[int64]): [(0, 1] < (1, 3] < (3, 5]]
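As a quick check of the proposed bins, the sketch below cuts a few values, including the previously lost 2 and 4, and confirms that none of them comes back as NaN. The input values are again my own illustration.

```python
import pandas as pd

# Contiguous bins proposed above: (0, 1], (1, 3], (3, 5].
contiguous = pd.IntervalIndex.from_tuples([(0, 1), (1, 3), (3, 5)])

# Illustrative values (an assumption), including 2 and 4.
values = [0.5, 1.5, 2, 2.5, 4, 4.5]
result = pd.cut(values, contiguous)

# Each code is the position of the bin the value landed in; -1 would mean NaN.
codes = list(result.codes)
any_missing = bool(result.isna().any())
print(result)
print(codes, any_missing)
```

Note that 0 itself would still be NaN here, because the first interval is open on the left. As far as I can tell, pd.IntervalIndex.from_breaks([0, 1, 3, 5]) should build the same contiguous bins with less typing.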
EDIT:
There is a correct/proper example in IntervalIndex docs: