Description
Pickles written with Pandas 0.14.1 cannot be read with Pandas 0.17.1 if they contain isodate.UTC or other timezone objects.
The problem can be reduced to this test case:
my_tz.py:
from datetime import tzinfo

class MyTz(tzinfo):
    def __init__(self):
        pass
write_pickle_test.py (run with Pandas 0.14.1):
import pandas as pd
import cPickle
from my_tz import MyTz
data = pd.Series(), MyTz()
with open('test.pickle', 'wb') as f:
    cPickle.dump(data, f)
read_pickle_test.py (run with Pandas 0.17.1):
import pandas as pd
pd.read_pickle('test.pickle')
Reading the pickle with pickle.load() would fail when trying to load the Series object:
TypeError: _reconstruct: First argument must be a sub-type of ndarray
which is why pd.read_pickle() attempts to use pandas.compat.pickle_compat.load(). But then we get this instead for the MyTz object:
$ python read_pickle_test.py
Traceback (most recent call last):
  File "read_pickle_test.py", line 2, in <module>
    pd.read_pickle('test.pickle')
  File "pandas/io/pickle.py", line 60, in read_pickle
    return try_read(path)
  File "pandas/io/pickle.py", line 57, in try_read
    return pc.load(fh, encoding=encoding, compat=True)
  File "pandas/compat/pickle_compat.py", line 116, in load
    return up.load()
  File "/usr/lib64/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "pandas/compat/pickle_compat.py", line 16, in load_reduce
    if type(args[0]) is type:
IndexError: tuple index out of range
I see the following problems in pandas.compat.pickle_compat.load_reduce():
(1) It attempts to access args[0] while args can be empty.
if type(args[0]) is type:
    n = args[0].__name__
(2) The above code is dead code anyway, since n isn't referenced anywhere else in the function. Removing those two lines fixes the unpickling problem completely.
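To make points (1) and (2) concrete, here is a minimal sketch of the REDUCE step modelled on a plain list. It is not the pandas source: the function names reduce_with_unguarded_lookup and reduce_without_lookup are hypothetical, and the stack is a simplified stand-in for the unpickler's internal stack. It is written for Python 3, whereas the test case above used Python 2.

```python
from datetime import tzinfo


class MyTz(tzinfo):
    def __init__(self):
        pass


def reduce_with_unguarded_lookup(stack):
    """Hypothetical model of the failing step: it peeks at args[0] first."""
    args = stack.pop()
    func = stack[-1]
    if type(args[0]) is type:   # raises IndexError when args == ()
        n = args[0].__name__    # assigned but never used afterwards
    stack[-1] = func(*args)


def reduce_without_lookup(stack):
    """Same step with the two unused lines removed."""
    args = stack.pop()
    func = stack[-1]
    stack[-1] = func(*args)


def try_unguarded():
    # The stack holds the callable and its (empty) argument tuple,
    # which is what a tzinfo subclass without init args produces.
    try:
        reduce_with_unguarded_lookup([MyTz, ()])
    except IndexError as exc:
        return str(exc)
    return None


error_message = try_unguarded()

fixed_stack = [MyTz, ()]
reduce_without_lookup(fixed_stack)
```

In this model the first variant fails with the same IndexError message as in the traceback, while the second leaves a MyTz instance on the stack.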
Note that the MyTz class in the test above does implement a proper __init__() method, as required by the Python documentation for tzinfo: "Special requirement for pickling: A tzinfo subclass must have an __init__ method that can be called with no arguments, else it can be pickled but possibly not unpickled again. This is a technical requirement that may be relaxed in the future." While isodate.UTC violates this requirement, that doesn't seem to be the cause of this problem.
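The requirement quoted above can be illustrated without pandas. The sketch below (Python 3, standard library only) inspects what a tzinfo subclass hands to pickle via __reduce__: the argument tuple is empty unless the class defines __getinitargs__, so the class is later called with no arguments. NeedsArg is a hypothetical class added here only for contrast.

```python
from datetime import tzinfo


class MyTz(tzinfo):
    def __init__(self):
        pass


class NeedsArg(tzinfo):
    # Hypothetical counter-example: __init__ cannot be called without arguments.
    def __init__(self, offset):
        self.offset = offset


# __reduce__ returns (callable, args, ...); args is what REDUCE will pass on.
reduced = MyTz().__reduce__()
constructor, init_args = reduced[0], reduced[1]
rebuilt = constructor(*init_args)

# The same reconstruction step fails for a class that needs an argument.
bad_constructor, bad_args = NeedsArg(5).__reduce__()[:2]
try:
    bad_constructor(*bad_args)
    rebuild_failed = False
except TypeError:
    rebuild_failed = True
```

Because init_args is an empty tuple, any code that indexes args[0] during the REDUCE step will fail for MyTz even though the class itself follows the documented requirement.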
(3) Also, at the very end of load_reduce():
stack[-1] = value
is code that is never reached, since all previous code branches either return or raise. Also, this line is invalid, since value isn't initialized anywhere in the function.
I originally sent this as a comment to issue #6871, but since it's not exactly the same thing (only a similar traceback), I think it deserves a separate issue.