Description
Code Sample
import pandas as pd
import numpy as np
#working data
names = ['d1', 'd2', 'd3', 'd4', 'd5']
formats = ['u1', '<f8', 'u1', 'u1', 'u1']
dtype = dict(names=names, formats=formats)
data = {'d1':[1,11], 'd2':[2,12], 'd3':[3,13], 'd4':[4,14], 'd5':[5,15]}
#create a pandas dataframe with uint8 variables except for a double in d2 slot.
df_create = np.rec.fromarrays(data.values(), dtype=dtype, names=data.keys())
df_create = pd.DataFrame(df_create)
df_create.loc[:,'d2'] *= 0.12345
#create a pandas dataframe with all variables uint8
df_mod = pd.DataFrame.from_dict(data, dtype=np.dtype('u1'))
#convert d2 to double and modify
df_mod.loc[:,'d2'] = df_mod.loc[:,'d2'].astype(np.dtype('float64'))
df_mod.loc[:,'d2'] *= 0.12345
print('type of df_create.loc[0,\'d1\']: {}'.format(type(df_create.loc[0,'d1'])))
print('type of df_create.loc[0,\'d2\']: {}'.format(type(df_create.loc[0,'d2'])))
print('type of df_create.iloc[0,0]: {}'.format(type(df_create.iloc[0,0])))
print('type of df_create.iloc[0,1]: {}'.format(type(df_create.iloc[0,1])))
print('type of df_create.ix[0,0]: {}'.format(type(df_create.ix[0,0])))
print('type of df_create.ix[0,1]: {}'.format(type(df_create.ix[0,1])))
print('')
print('type of df_mod.loc[0,\'d1\']: {}'.format(type(df_mod.loc[0,'d1'])))
print('type of df_mod.loc[0,\'d2\']: {}'.format(type(df_mod.loc[0,'d2'])))
print('type of df_mod.iloc[0,0]: {}'.format(type(df_mod.iloc[0,0])))
print('type of df_mod.iloc[0,1]: {}'.format(type(df_mod.iloc[0,1])))
print('type of df_mod.ix[0,0]: {}'.format(type(df_mod.ix[0,0])))
print('type of df_mod.ix[0,1]: {}'.format(type(df_mod.ix[0,1])))
print('')
print('All dtypes for dataframe df_mod:')
print(df_mod.dtypes)
print('')
pd.show_versions()
The code above produces the following output:
type of df_create.loc[0,'d1']: <class 'numpy.float64'>
type of df_create.loc[0,'d2']: <class 'numpy.float64'>
type of df_create.iloc[0,0]: <class 'numpy.float64'>
type of df_create.iloc[0,1]: <class 'numpy.float64'>
type of df_create.ix[0,0]: <class 'numpy.float64'>
type of df_create.ix[0,1]: <class 'numpy.float64'>
type of df_mod.loc[0,'d1']: <class 'numpy.float64'>
type of df_mod.loc[0,'d2']: <class 'numpy.float64'>
type of df_mod.iloc[0,0]: <class 'numpy.float64'>
type of df_mod.iloc[0,1]: <class 'numpy.float64'>
type of df_mod.ix[0,0]: <class 'numpy.float64'>
type of df_mod.ix[0,1]: <class 'numpy.float64'>
All dtypes for dataframe df_mod:
d1 uint8
d2 float64
d3 uint8
d4 uint8
d5 uint8
dtype: object
Problem description
As far as I can see, scalar indexing converts non-float64 values to float64. According to the dtypes, the columns are still stored internally as uint8, but a uint8 value retrieved through .loc, .iloc, or .ix comes back as numpy.float64.
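A minimal sketch of one possible explanation, which is an assumption on my part and not something confirmed above: if the indexer first pulls out the whole row, every value in that row has to share a single dtype, so the uint8 values would be upcast to float64 before the scalar is returned. The three-column frame below is a hypothetical, cut-down version of the one in the code sample.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the report: uint8 columns plus one float64 column.
df = pd.DataFrame({
    "d1": np.array([1, 11], dtype="u1"),
    "d2": np.array([2, 12], dtype="f8") * 0.12345,
    "d3": np.array([3, 13], dtype="u1"),
})

# Selecting a whole row packs every value into one Series, so the
# uint8 values are upcast to the common dtype of the row.
row = df.loc[0]
print(row.dtype)    # float64

# Selecting the column first keeps that column's own dtype.
value = df["d1"].iloc[0]
print(type(value))  # <class 'numpy.uint8'>
```

If that assumption holds, selecting the column before the row (as in `df["d1"].iloc[0]`) may serve as a workaround until the scalar indexers return the column dtype directly.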
Expected Output
type of df_create.loc[0,'d1']: <class 'numpy.uint8'>
type of df_create.loc[0,'d2']: <class 'numpy.float64'>
type of df_create.iloc[0,0]: <class 'numpy.uint8'>
type of df_create.iloc[0,1]: <class 'numpy.float64'>
type of df_create.ix[0,0]: <class 'numpy.uint8'>
type of df_create.ix[0,1]: <class 'numpy.float64'>
type of df_mod.loc[0,'d1']: <class 'numpy.uint8'>
type of df_mod.loc[0,'d2']: <class 'numpy.float64'>
type of df_mod.iloc[0,0]: <class 'numpy.uint8'>
type of df_mod.iloc[0,1]: <class 'numpy.float64'>
type of df_mod.ix[0,0]: <class 'numpy.uint8'>
type of df_mod.ix[0,1]: <class 'numpy.float64'>
Output of pd.show_versions()
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 30 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.23.4
numpy: 1.12.0
scipy: 0.18.0
statsmodels: 0.6.1
xarray: 0.8.2
IPython: 4.0.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: 2.4.0
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.38.0
pandas_datareader: None