ENH: pd.cut closed intervals

### Feature Type

- [X] Adding new functionality to pandas

- [ ] Changing existing functionality in pandas

- [ ] Removing existing functionality in pandas


### Problem Description

`pd.cut` and `pd.qcut` create intervals and partition the data according to those intervals. The intervals are generally open on the left and closed on the right, which works fine for continuous data.
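For reference, `pd.cut` already lets you pick which single side is closed through its `right` parameter. A minimal sketch with made-up values and edges (both are just illustrative assumptions):

```python
import pandas as pd

values = [1, 2, 3, 4]
edges = [0, 2, 4]

# Default (right=True): bins are (0, 2] and (2, 4], so 2 lands in the first bin
right_closed = pd.cut(values, bins=edges)

# right=False: bins are [0, 2) and [2, 4), so 2 moves to the second bin
# and 4 falls outside every bin (it becomes NaN)
left_closed = pd.cut(values, bins=edges, right=False)

print(right_closed.categories.closed)  # which side is closed by default
print(left_closed.categories.closed)   # which side is closed with right=False
```

Neither setting closes both sides of an interval, which is the gap this request is about.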

However, with discrete data this can sometimes give slightly ridiculous intervals. Integers are a case in point: if we have the numbers from 0 to 99, `pd.cut` with 7 bins makes intervals like (-0.099, 14.1429], (14.1429, 28.2857].

Here is an MRE:

```
import pandas as pd
import numpy as np

# Integers 0..99 as a simple discrete dataset
interval_testing = pd.DataFrame({'data': np.arange(0, 100)})

# Cut into 7 equal-width bins; the bin edges come out as non-integers
interval_testing['interval'] = pd.cut(interval_testing['data'], bins=7, precision=4)
# interval_testing['interval'] = pd.qcut(interval_testing['data'], q=7, precision=5)

# Min, max and count of the data falling in each bin
interval_testing.groupby('interval', observed=False).aggregate(['min', 'max', 'count'])
```

which outputs the following:  
![image](https://user-images.githubusercontent.com/118690308/220360035-a047020e-bd44-4184-9b62-c2df378d9def.png)

It would be great if there were an option to specify that the data is discrete, so that the intervals would look like (14, 28], (28, 42], etc. This is not just for integers: for data measured to one decimal place, for example, it would give (14.3, 28.7], (28.7, 42.4] or similar. The level of discreteness can be inferred from the data. [EDIT: At the moment you can achieve something similar using the `precision=` parameter, but this doesn't automatically infer from the data. The main point of this suggestion is the next point, that intervals should be closed.]
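One way the inference could work, as a rough sketch only: take the smallest gap between distinct values as the data's step, then snap the `pd.cut` edges down to multiples of that step. `infer_step` is a hypothetical helper, not a pandas function, and this assumes integer data; floats would need a tolerance when comparing gaps.

```python
import numpy as np
import pandas as pd

def infer_step(values):
    """Hypothetical helper: smallest gap between distinct values."""
    unique = np.unique(values)
    return np.diff(unique).min()

data = np.arange(0, 100)
step = infer_step(data)  # consecutive integers, so the step is 1

# Take the equal-width edges pd.cut would use, then snap each one
# down to a multiple of the inferred step
_, edges = pd.cut(data, bins=7, retbins=True)
snapped = np.floor(edges / step) * step

# Keep the lowest edge one step below the minimum so the minimum
# still lands in the first (left-open) bin
snapped[0] = data.min() - step

# Note: pd.cut raises if snapping produces duplicate edges
binned = pd.cut(data, bins=snapped)
print(binned.categories)
```

For the integers 0 to 99 with 7 bins the snapped edges come out as -1, 14, 28, 42, 56, 70, 84, 99, which gives (14, 28], (28, 42] and so on.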

A further improvement would then be a parameter that makes the intervals closed on both sides, e.g. [15, 28], [29, 42], etc.

For example, if the data is "number of times something happened", intervals like those described would be more intuitive.

It isn't particularly hard to work round this, but this might be a useful feature to add.



### Feature Description

Some rough code:

```
import pandas as pd
import numpy as np

def integer_qcut(x, q):
    # Only the quantile edges are needed here, not the binned result
    _, bins = pd.qcut(x, q, duplicates='drop', retbins=True)
    # Round the edges down to integers
    bins = np.floor(bins).astype(int)
    bins_left = bins[:-1]
    # Each right edge stops one short of the next left edge; the last edge is kept as-is
    bins_right = bins[1:].copy()
    bins_right[:-1] -= 1
    bins = pd.IntervalIndex.from_arrays(left=bins_left, right=bins_right, closed='both')
    return pd.cut(x=x, bins=bins)


interval_testing = pd.DataFrame({'data': np.arange(0, 100)})
interval_testing['interval'] = integer_qcut(interval_testing['data'], q=7)

interval_testing.groupby('interval', observed=False).aggregate(['min', 'max', 'count'])
```

gives 

![image](https://user-images.githubusercontent.com/118690308/220364424-962c76f9-f73e-4d66-b75e-fb338a33393d.png)

This is not perfect, because the final group is larger than the others (a result of using `np.floor`), but it illustrates what I'm getting at.

### Alternative Solutions

potentially multiple ways of solving this...
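One possible alternative, sketched under assumptions: let `pd.qcut` assign the groups as usual, then relabel each group with a closed interval running from the smallest to the largest value actually observed in it. `closed_qcut` is a hypothetical helper name, not an existing or proposed pandas API.

```python
import numpy as np
import pandas as pd

def closed_qcut(x, q):
    """Hypothetical helper: quantile bins relabelled as closed intervals
    spanning the smallest and largest value observed in each bin."""
    x = pd.Series(x)
    # Integer bin number for every value, using the ordinary quantile bins
    codes = pd.qcut(x, q, labels=False, duplicates='drop')
    # Smallest and largest observed value per bin
    bounds = x.groupby(codes).agg(['min', 'max'])
    intervals = pd.IntervalIndex.from_arrays(
        bounds['min'].to_numpy(), bounds['max'].to_numpy(), closed='both'
    )
    return pd.cut(x, bins=intervals)

binned = closed_qcut(np.arange(0, 100), q=7)
print(binned.cat.categories)
```

For the integers 0 to 99 with `q=7` this gives [0, 14], [15, 28], [29, 42], and so on up to [85, 99], with group sizes matching the ordinary `pd.qcut` result. The labels depend on the values present, so gaps in the data show up as gaps between intervals.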

### Additional Context

_No response_
