Description
Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
for dtype in (np.float32, np.int32, np.bool):
    s = pd.SparseSeries([1, 0], dtype=dtype)
    print('original:', s.dtype)
    print('reindexed:', s.reindex([0, 1, 2]).dtype)
Output
original: Sparse[float32, nan]
reindexed: Sparse[float64, nan]
original: Sparse[int32, 0]
reindexed: Sparse[float64, 0]
original: Sparse[bool, False]
reindexed: Sparse[float64, False]
Problem description
After reindexing, the resulting sparse series always has dtype Sparse[float64], regardless of the dtype passed in.
This appears to be a regression from v0.23.4, possibly related to the SparseArray rework in v0.24.x.
Expected Output
Ideally the dtype would be preserved. That was not entirely the case in v0.23.4 either, but at least the sparse float dtypes were not upcast. My use case for casting to a sparse dtype is to save space, so the conversion to float64 breaks things.
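A possible stopgap is to cast back to the original sparse dtype after reindexing. The following is only a sketch under stated assumptions: `reindex_keep_dtype` is a hypothetical helper (not part of pandas), it uses a `Series` backed by `pd.arrays.SparseArray` rather than `SparseSeries`, and it is untested against 0.24.2. It is only meaningful for float subtypes, since integer and bool subtypes cannot represent the NaN that reindexing introduces for new labels.

```python
import numpy as np
import pandas as pd


def reindex_keep_dtype(s, new_index):
    """Reindex a sparse-backed Series, then cast back to its original dtype.

    Hypothetical helper for illustration; intended for float subtypes only.
    """
    return s.reindex(new_index).astype(s.dtype)


# Sparse float32 series with the default NaN fill value.
s = pd.Series(pd.arrays.SparseArray([1, 0], dtype=np.float32))
r = reindex_keep_dtype(s, [0, 1, 2])

print('original:', s.dtype)
print('reindexed:', r.dtype)
```

Note that this does not avoid the intermediate float64 result, so peak memory use may still be higher than expected; it only restores the dtype afterwards.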
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.0-16-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.2
pytest: 4.3.1
pip: 19.0.3
setuptools: 40.8.0
Cython: 0.29.6
numpy: 1.16.2
scipy: 1.2.1
pyarrow: 0.12.0
xarray: None
IPython: 7.3.0
sphinx: 1.8.5
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: 3.5.1
numexpr: 2.6.9
feather: None
matplotlib: 2.2.3
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.5
lxml.etree: 4.3.2
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.3.1
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None