Closed
Description
# Minimal example: time DataFrame.combine_first on frames with MultiIndex columns
import time

import numpy as np
import pandas as pd

print(pd.__version__)

# Two-level column index: 4 x 2 = 8 columns
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
multind = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(4, 8), columns=multind)
df2 = pd.DataFrame(np.random.randn(4, 8), columns=multind)

# Wall-clock time of a single combine_first call, in seconds
t2 = time.time()
df.combine_first(df2)
print("%f" % (time.time() - t2))
Problem description
The code above takes about 116 ms on pandas 0.20.1, but only about 3.6 ms on pandas 0.19.2.
That makes DataFrame.combine_first roughly 32 times slower in 0.20.1 than in 0.19.2 for this example.
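A single time.time() measurement can be noisy, so the comparison may be easier to reproduce with repeated timings. The following is a minimal sketch, not part of the original report: the helper names make_frames and best_time_ms, the fixed seed, and the repeat counts are all illustrative assumptions. It builds the same frames as the example and reports the best per-call time using the standard-library timeit module.

```python
import timeit

import numpy as np
import pandas as pd


def make_frames(n_rows=4, seed=0):
    """Build two random frames sharing one two-level column MultiIndex.

    Hypothetical helper: mirrors the frames in the report, with a fixed
    seed (an assumption) so that runs are comparable.
    """
    rng = np.random.RandomState(seed)
    iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
    columns = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
    df = pd.DataFrame(rng.randn(n_rows, len(columns)), columns=columns)
    df2 = pd.DataFrame(rng.randn(n_rows, len(columns)), columns=columns)
    return df, df2


def best_time_ms(df, df2, repeat=5, number=10):
    """Return the best per-call time of df.combine_first(df2) in milliseconds.

    Hypothetical helper: takes the minimum over `repeat` batches of
    `number` calls each, which reduces the influence of one-off delays.
    """
    timings = timeit.repeat(lambda: df.combine_first(df2),
                            repeat=repeat, number=number)
    return min(timings) / number * 1000.0


if __name__ == "__main__":
    df, df2 = make_frames()
    print("pandas %s: %.3f ms per call" % (pd.__version__, best_time_ms(df, df2)))
```

Running the same script under each pandas version and comparing the printed values should show whether the slowdown persists across repeated calls. Absolute numbers will differ by machine.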
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: C
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
pandas: 0.20.1
pytest: 2.8.5
pip: 9.0.1
setuptools: 19.6.2
Cython: 0.24.1
numpy: 1.11.3
scipy: 0.18.1
xarray: None
IPython: 5.1.0
sphinx: 1.3.5
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
feather: None
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: 0.9999999
sqlalchemy: 1.0.13
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext)
jinja2: 2.8
s3fs: 0.0.9
pandas_gbq: None
pandas_datareader: None