API: getting the "densified" values of a Categorical (preserving categories dtype) ?

Assume we have a Categorical, and want to convert to a dense array (not encoded). We have `np.asarray(..)` and the `to_dense()` method (which uses asarray under the hood):

```
In [1]: cat = pd.Categorical(['a', 'b', 'a'])

In [2]: np.asarray(cat) 
Out[2]: array(['a', 'b', 'a'], dtype=object)

In [3]: cat.to_dense() 
Out[3]: array(['a', 'b', 'a'], dtype=object)
```

In addition, we also have `get_values`:
```
In [4]: cat.get_values() 
Out[4]: array(['a', 'b', 'a'], dtype=object)
```

`get_values` is mostly the same, except that it returns an Index for datetime/period/timedelta categories, and an object array (instead of a float array) for integer categories with missing values:

```
In [10]: cat = pd.Categorical(pd.date_range("2012", periods=3))

In [11]: cat.to_dense()
Out[11]: 
array(['2012-01-01T00:00:00.000000000', '2012-01-02T00:00:00.000000000',
       '2012-01-03T00:00:00.000000000'], dtype='datetime64[ns]')

In [12]: cat.get_values()
Out[12]: DatetimeIndex(['2012-01-01', '2012-01-02', '2012-01-03'], dtype='datetime64[ns]', freq='D')
```

As a result, `get_values` preserves somewhat more of the dtype, although only for datetime-like categories; it does not do so for an arbitrary extension array (EA).
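The integer case mentioned above is not shown in the sessions, so here is a minimal sketch of the `np.asarray` side of it. It builds the Categorical with `Categorical.from_codes` (my choice for the illustration, so that the categories are unambiguously integers) and only demonstrates the float upcast. The object-array result of `get_values` is not reproduced here.

```python
import numpy as np
import pandas as pd

# Integer categories, with the last element missing (code -1).
int_cat = pd.Categorical.from_codes([0, 1, -1], categories=[1, 2])

# Densifying through NumPy has no integer NA, so the result is upcast.
dense_int = np.asarray(int_cat)

# Compare the dtype of the categories with the dtype of the dense result.
print(int_cat.categories.dtype, dense_int.dtype)
```

So even for plain NumPy-backed categories, the dense result does not necessarily have the dtype of the categories.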

While looking into the deprecation of `get_values` (https://github.com/pandas-dev/pandas/pull/26409), I was wondering: do we want some method to actually get a "dense" version of the array, but with the exact same dtype? (so returning an EA in case the categories have an extension dtype)
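To make the question concrete, here is a rough sketch of what such a "dtype-preserving densify" could look like using existing public API. The helper name `densify` is hypothetical and not a proposal for the final spelling; it assumes that taking from `categories.array` with the codes (and `allow_fill=True`, so that code `-1` becomes a missing value) is the desired behaviour.

```python
import pandas as pd


def densify(cat):
    """Hypothetical helper: dense values with the dtype of the categories."""
    # `categories.array` is the ExtensionArray backing the categories;
    # with allow_fill=True, a code of -1 is filled with the dtype's NA value.
    return cat.categories.array.take(cat.codes, allow_fill=True)


# tz-aware datetimes: an extension dtype that np.asarray cannot represent.
tz_cat = pd.Categorical.from_codes(
    [0, -1, 2], categories=pd.date_range("2012", periods=3, tz="UTC")
)
dense = densify(tz_cat)

# The result keeps the dtype of the categories, including the timezone.
print(type(dense).__name__, dense.dtype)
```

Whether the result for NumPy-backed categories should be a plain ndarray rather than the array wrapper returned here is one of the open points.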

And should we deprecate `to_dense()` ?


