DataFrame.copy() should be atomic (thread-safe): it should produce a consistent DataFrame even if .copy() is called while another thread is deleting entries from the DataFrame, or if another thread calls a deletion method while .copy() is still running. In other words, I guess .copy() should acquire a lock that prevents mutation during the copy. That is, the following code, which crashes in 0.7.3, should succeed:
import pandas
import threading

df = pandas.DataFrame()

def mutateDf(df):
    while True:
        df[0] = pandas.Series([1,2,3])
        del df[0]

def readDf(df):
    while True:
        dfCopy = df.copy()
        if 0 in dfCopy and 1 in dfCopy[0]:
            a = dfCopy[0][1]

t1 = threading.Thread(target=mutateDf, args=(df,))
t2 = threading.Thread(target=readDf, args=(df,))
t1.start()
t2.start()

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "<ipython-input-5-8aef72c7f1b4>", line 4, in readDf
    if 0 in dfCopy and 1 in dfCopy[0]:
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.7.3-py2.7-linux-x86_64.egg/pandas/core/frame.py", line 1458, in __getitem__
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.7.3-py2.7-linux-x86_64.egg/pandas/core/generic.py", line 294, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.7.3-py2.7-linux-x86_64.egg/pandas/core/internals.py", line 625, in get
    _, block = self._find_block(item)
TypeError: 'NoneType' object is not iterable
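
For anyone hitting the same thing, the locking idea described above can be approximated on the caller's side. The following is only a sketch under assumptions: `LockedFrame`, `set_column`, `delete_column` and `run_demo` are hypothetical names, not pandas API, and the code assumes a recent pandas on Python 3 rather than 0.7.3. It only helps if every thread goes through the wrapper, since pandas itself knows nothing about the lock.

```python
import threading

import pandas


class LockedFrame:
    """Hypothetical wrapper (not part of pandas) that serializes frame access."""

    def __init__(self, frame):
        self._frame = frame
        self._lock = threading.Lock()

    def set_column(self, key, values):
        with self._lock:
            self._frame[key] = values

    def delete_column(self, key):
        with self._lock:
            del self._frame[key]

    def copy(self):
        # The copy runs while the lock is held, so no writer can change the
        # frame halfway through it. The returned copy is private to the caller.
        with self._lock:
            return self._frame.copy()


def mutate(locked, iterations):
    # Same add/delete cycle as mutateDf, but bounded and routed via the lock.
    for _ in range(iterations):
        locked.set_column(0, pandas.Series([1, 2, 3]))
        locked.delete_column(0)


def read(locked, iterations, errors):
    # Same check as readDf; any exception is collected instead of killing
    # the thread silently.
    for _ in range(iterations):
        try:
            snapshot = locked.copy()
            if 0 in snapshot and 1 in snapshot[0]:
                snapshot[0].loc[1]
        except Exception as exc:
            errors.append(exc)


def run_demo(iterations=200):
    locked = LockedFrame(pandas.DataFrame())
    errors = []
    writer = threading.Thread(target=mutate, args=(locked, iterations))
    reader = threading.Thread(target=read, args=(locked, iterations, errors))
    writer.start()
    reader.start()
    writer.join()
    reader.join()
    return errors
```

Calling `run_demo()` returns the list of exceptions seen by the reader thread, which should stay empty because each snapshot is taken while no mutation is in progress. The cost is that writers block for the duration of every copy, so this is a stopgap rather than a substitute for a fix inside pandas.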