Description
Code Sample, a copy-pastable example if possible
import pandas as pd
pd.Series([(1,2), (3,4)]).memory_usage(deep=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/series.py", line 3913, in memory_usage
v = super(Series, self).memory_usage(deep=deep)
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/base.py", line 1410, in memory_usage
v += lib.memory_usage_of_objects(self.array)
File "pandas/_libs/lib.pyx", line 101, in pandas._libs.lib.memory_usage_of_objects
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/numpy_.py", line 226, in __getitem__
result = type(self)(result)
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/numpy_.py", line 131, in __init__
raise ValueError("'values' must be a NumPy array.")
ValueError: 'values' must be a NumPy array.
It seems to be related to calling self.array on the Series, which might make sense given that PandasArray is supposed to be 1-dimensional:
pd.array([(1,2), (3,4)])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/array_.py", line 273, in array
result = PandasArray._from_sequence(data, dtype=dtype, copy=copy)
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/numpy_.py", line 150, in _from_sequence
return cls(result)
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/numpy_.py", line 134, in __init__
raise ValueError("PandasArray must be 1-dimensional.")
ValueError: PandasArray must be 1-dimensional.
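A likely explanation for this second traceback, offered as an assumption rather than a confirmed diagnosis: NumPy infers a 2-D array from a list of equal-length tuples, so a 1-D container is never produced. The sketch below uses only NumPy, and the helper name `to_1d_object_array` is hypothetical.

```python
import numpy as np

data = [(1, 2), (3, 4)]

# NumPy treats equal-length tuples as rows, producing a 2-D integer array.
inferred = np.asarray(data)
print(inferred.ndim)  # 2


def to_1d_object_array(items):
    """Hypothetical helper: store each item as one element of a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # element-wise assignment keeps the tuple intact
    return out


boxed = to_1d_object_array(data)
print(boxed.shape)  # (2,)
```

Filling a preallocated object array element by element avoids the shape inference, which is why the tuples survive as single elements here.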
This also happens with arbitrary Python objects:
class Dummy(object):
    def __init__(self):
        pass
pd.array([Dummy(), Dummy()])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/base.py", line 848, in __repr__
indent_for_name=False).rstrip(', \n')
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/io/formats/printing.py", line 348, in format_object_summary
first = formatter(obj[0])
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/numpy_.py", line 226, in __getitem__
result = type(self)(result)
File "/home/roy/ftpyenv/clean/lib/python3.7/site-packages/pandas/core/arrays/numpy_.py", line 131, in __init__
raise ValueError("'values' must be a NumPy array.")
ValueError: 'values' must be a NumPy array.
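Both this traceback and the first one end in `__getitem__` wrapping a single element in a new array. One possible cause, stated here as an assumption about the 0.24.0 internals, is that the element is not classified as a scalar. The public `pandas.api.types.is_scalar` check shows how tuples and plain objects are treated:

```python
from pandas.api.types import is_scalar


class Dummy:
    pass


# Ordinary numbers are scalars; tuples and arbitrary objects are not.
int_is_scalar = is_scalar(1)
tuple_is_scalar = is_scalar((1, 2))
dummy_is_scalar = is_scalar(Dummy())

print(int_is_scalar)    # True
print(tuple_is_scalar)  # False
print(dummy_is_scalar)  # False
```

If the indexing code decides whether to re-wrap based on the returned element rather than on the indexer, any non-scalar element would take the failing path.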
Problem description
The current behavior is a problem because if a DataFrame has a column with non-standard data such as tuples, the .memory_usage(deep=True) call fails.
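Until this is fixed, a rough workaround is to compute the figure by hand from the underlying NumPy array. This is a minimal sketch, not the pandas implementation: `deep_memory_usage` is a hypothetical helper, and it assumes that adding `sys.getsizeof` of each element to the array buffer size is an acceptable estimate.

```python
import sys

import pandas as pd


def deep_memory_usage(series, index=True):
    """Hypothetical workaround: estimate deep memory usage without Series.array."""
    values = series.values  # plain NumPy ndarray for object dtype
    total = values.nbytes   # size of the pointer buffer itself
    if values.dtype == object:
        # Add the shallow size of every Python object the array points to.
        total += sum(sys.getsizeof(v) for v in values)
    if index:
        total += series.index.memory_usage(deep=True)
    return total


s = pd.Series([(1, 2), (3, 4)])
print(deep_memory_usage(s))
```

Note that `sys.getsizeof` is shallow, so the integers inside each tuple are not counted, and the exact number varies by Python version and platform.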
Expected Output
pd.Series([(1,2)]).memory_usage(deep=True)
176
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-43-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.0
pytest: 3.8.2
pip: 18.1
setuptools: 40.6.2
Cython: None
numpy: 1.16.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.1.1
sphinx: 1.8.1
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: 0.2.0
fastparquet: 0.2.1
pandas_gbq: None
pandas_datareader: None
gcsfs: None