Description
pandas.concat crashes the Python interpreter
The program snippet below crashes the Python interpreter. I have run the snippet with:
- pandas 0.19.2
- Python 3.6.1 and Python 3.5.2
- Windows 10 and Windows 8.1
- a plain Python interpreter and IPython
all with the same result (sometimes it does not crash on the first run but only on a later one; I don't know why).
import pandas as pd
categories = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola', 'Antigua & Deps', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Benin', 'Bhutan', 'Bolivia', 'Bosnia Herzegovina', 'Botswana', 'Brazil', 'Brunei', 'Bulgaria', 'Burkina', 'Cambodia', 'Cameroon', 'Canada', 'Chile', 'China', 'Colombia', 'Congo {Democratic Rep}', 'Costa Rica', 'Croatia', 'Cuba', 'Cyprus', 'Czech Republic', 'Denmark', 'Djibouti', 'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador', 'Estonia', 'Ethiopia', 'Finland', 'France', 'Gabon', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Guatemala', 'Haiti', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Iraq', 'Ireland {Republic}', 'Israel', 'Italy', 'Jamaica', 'Japan', 'Jordan', 'Kazakhstan', 'Kenya', 'Korea North', 'Korea South', 'Kosovo', 'Kuwait', 'Kyrgyzstan', 'Laos', 'Latvia', 'Lebanon', 'Liechtenstein', 'Lithuania', 'Luxembourg', 'Macedonia', 'Madagascar', 'Malaysia', 'Maldives', 'Malta', 'Mauritania', 'Mauritius', 'Mexico', 'Moldova', 'Mongolia', 'Montenegro', 'Morocco', 'Mozambique', 'Myanmar, {Burma}', 'Namibia', 'Nepal', 'Netherlands', 'New Zealand', 'Nicaragua', 'Nigeria', 'Norway', 'Oman', 'Other', 'Pakistan', 'Panama', 'Paraguay', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Qatar', 'Romania', 'Russian Federation', 'Rwanda', 'San Marino', 'Saudi Arabia', 'Senegal', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'Solomon Islands', 'Somalia', 'South Africa', 'South Sudan', 'Spain', 'Sri Lanka', 'Sudan', 'Swaziland', 'Sweden', 'Switzerland', 'Syria', 'Taiwan', 'Tajikistan', 'Tanzania', 'Thailand', 'Togo', 'Trinidad & Tobago', 'Tunisia', 'Turkey', 'Turkmenistan', 'Uganda', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Vanuatu', 'Vatican City', 'Venezuela', 'Vietnam', 'Zambia', 'Zimbabwe', "Didn't answer"]
series0_values = [9, 1, 8, 18, 8, 85, 49, 3, 1, 2, 14, 1, 7, 40, 2, 5, 64, 1, 15, 5, 1, 116, 7, 43, 8, 6, 13, 2, 2, 32, 35, 2, 1, 23, 1, 5, 1, 43, 112, 1, 9, 319, 1, 25, 3, 2, 30, 4, 455, 23, 60, 1, 37, 37, 106, 1, 13, 6, 5, 5, 1, 32, 1, 10, 8, 14, 5, 1, 11, 1, 6, 1, 39, 4, 1, 4, 2, 7, 124, 30, 1, 4, 28, 1, 31, 2, 20, 130, 39, 29, 97, 7, 17, 1, 17, 11, 15, 23, 1, 146, 11, 1, 75, 55, 4, 9, 10, 1, 1, 4, 54, 1, 3, 34, 3, 299, 587, 7, 1, 10, 17, 2, 65]
series0_index = ['Afghanistan', 'Albania', 'Algeria', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Bolivia', 'Bosnia Herzegovina', 'Brazil', 'Brunei', 'Bulgaria', 'Cambodia', 'Cameroon', 'Canada', 'Chile', 'China', 'Colombia', 'Costa Rica', 'Croatia', 'Cuba', 'Cyprus', 'Czech Republic', 'Denmark', 'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador', 'Estonia', 'Ethiopia', 'Finland', 'France', 'Gabon', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Guatemala', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Iraq', 'Ireland {Republic}', 'Israel', 'Italy', 'Jamaica', 'Japan', 'Jordan', 'Kazakhstan', 'Kenya', 'Korea North', 'Korea South', 'Kyrgyzstan', 'Latvia', 'Lebanon', 'Lithuania', 'Macedonia', 'Madagascar', 'Malaysia', 'Maldives', 'Malta', 'Mauritius', 'Mexico', 'Moldova', 'Mongolia', 'Morocco', 'Myanmar, {Burma}', 'Nepal', 'Netherlands', 'New Zealand', 'Nicaragua', 'Nigeria', 'Norway', 'Oman', 'Pakistan', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Romania', 'Russian Federation', 'Saudi Arabia', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'South Africa', 'South Sudan', 'Spain', 'Sri Lanka', 'Sudan', 'Sweden', 'Switzerland', 'Syria', 'Taiwan', 'Thailand', 'Togo', 'Trinidad & Tobago', 'Tunisia', 'Turkey', 'Turkmenistan', 'Uganda', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Venezuela', 'Vietnam', 'Zimbabwe', "Didn't answer"]
i0 = pd.CategoricalIndex(series0_index, ordered=True, categories=categories)
s0 = pd.Series(series0_values, index=i0)
series1_values = [4, 18, 16, 65, 33, 2, 8, 1, 12, 35, 3, 3, 46, 6, 1, 89, 4, 17, 8, 6, 3, 2, 1, 22, 30, 3, 13, 1, 9, 1, 35, 109, 7, 197, 1, 14, 2, 17, 233, 9, 28, 16, 36, 56, 5, 1, 3, 4, 6, 14, 3, 8, 3, 3, 3, 14, 2, 1, 3, 1, 3, 5, 86, 11, 1, 3, 20, 12, 2, 1, 1, 7, 93, 20, 18, 61, 3, 14, 1, 5, 9, 5, 1, 19, 80, 4, 1, 75, 29, 2, 6, 11, 1, 2, 34, 45, 6, 281, 580, 6, 1, 5, 10, 1, 36]
series1_index = ['Afghanistan', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Bolivia', 'Bosnia Herzegovina', 'Brazil', 'Bulgaria', 'Cambodia', 'Canada', 'Chile', 'China', 'Colombia', 'Costa Rica', 'Croatia', 'Cuba', 'Cyprus', 'Czech Republic', 'Denmark', 'Ecuador', 'Egypt', 'El Salvador', 'Estonia', 'Ethiopia', 'Finland', 'France', 'Georgia', 'Germany', 'Ghana', 'Greece', 'Guatemala', 'Hungary', 'India', 'Indonesia', 'Iran', 'Ireland {Republic}', 'Israel', 'Italy', 'Japan', 'Jordan', 'Kazakhstan', 'Kenya', 'Korea South', 'Latvia', 'Lebanon', 'Lithuania', 'Macedonia', 'Malaysia', 'Malta', 'Mexico', 'Moldova', 'Mongolia', 'Morocco', 'Mozambique', 'Myanmar, {Burma}', 'Nepal', 'Netherlands', 'New Zealand', 'Nicaragua', 'Nigeria', 'Norway', 'Pakistan', 'Panama', 'Paraguay', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Romania', 'Russian Federation', 'Saudi Arabia', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'Solomon Islands', 'South Africa', 'Spain', 'Sri Lanka', 'Swaziland', 'Sweden', 'Switzerland', 'Syria', 'Taiwan', 'Thailand', 'Trinidad & Tobago', 'Tunisia', 'Turkey', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Venezuela', 'Vietnam', 'Zimbabwe', "Didn't answer"]
i1 = pd.CategoricalIndex(series1_index, ordered=True, categories=categories)
s1 = pd.Series(series1_values, index=i1)
series2_values = [7, 1, 6, 2, 49, 26, 1, 1, 1, 25, 3, 18, 9, 63, 1, 2, 5, 1, 3, 2, 17, 23, 4, 3, 15, 55, 2, 2, 172, 7, 1, 17, 1, 41, 4, 6, 13, 8, 61, 2, 4, 1, 4, 2, 3, 1, 3, 2, 9, 1, 1, 63, 13, 16, 3, 1, 4, 59, 11, 3, 22, 2, 5, 1, 2, 2, 7, 8, 50, 2, 39, 35, 2, 1, 14, 6, 1, 191, 312, 5, 1, 1, 1, 32]
series2_index = ['Afghanistan', 'Angola', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Bangladesh', 'Belarus', 'Belgium', 'Bosnia Herzegovina', 'Brazil', 'Bulgaria', 'Canada', 'Chile', 'China', 'Colombia', 'Costa Rica', 'Croatia', 'Cyprus', 'Czech Republic', 'Denmark', 'Egypt', 'Estonia', 'Finland', 'France', 'Gabon', 'Georgia', 'Germany', 'Greece', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran', 'Ireland {Republic}', 'Israel', 'Italy', 'Kazakhstan', 'Korea South', 'Laos', 'Latvia', 'Lebanon', 'Lithuania', 'Luxembourg', 'Malaysia', 'Malta', 'Mexico', 'Mongolia', 'Nepal', 'Netherlands', 'New Zealand', 'Norway', 'Pakistan', 'Paraguay', 'Philippines', 'Poland', 'Portugal', 'Romania', 'Russian Federation', 'Saudi Arabia', 'Serbia', 'Sierra Leone', 'Singapore', 'Slovakia', 'Slovenia', 'South Africa', 'Spain', 'Sri Lanka', 'Sweden', 'Switzerland', 'Taiwan', 'Thailand', 'Turkey', 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Venezuela', 'Vietnam', 'Zimbabwe', "Didn't answer"]
i2 = pd.CategoricalIndex(series2_index, ordered=True, categories=categories)
s2 = pd.Series(series2_values, index=i2)
print("Before")
x = pd.concat([s0, s1, s2], axis=1) # crash!
print("After")
pd.concat([s0, s1, s2], axis=1) crashes the interpreter for me. If only one or two of the series are concatenated, no crash occurs.
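To narrow down which inputs are involved, each combination of the series could be tried in its own child interpreter, so that a crash does not take down the driver script. The following is a rough sketch and not part of the original snippet: `crash_matrix`, `setup_code` and `template` are hypothetical names, and `setup_code` is assumed to hold the snippet above up to (but not including) the `pd.concat` call.

```python
import itertools
import subprocess
import sys


def crash_matrix(setup_code, names, template, timeout=120.0):
    """Run `template` for every non-empty combination of `names`.

    Each combination is executed in a fresh Python process, so a hard
    crash in one of them only affects that child. Returns a dict that
    maps each combination (a tuple of names) to the child's return code.
    """
    results = {}
    for size in range(1, len(names) + 1):
        for combo in itertools.combinations(names, size):
            # Fill the placeholder {args} with e.g. "s0, s1".
            code = setup_code + "\n" + template.format(args=", ".join(combo))
            proc = subprocess.run(
                [sys.executable, "-c", code],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            results[combo] = proc.returncode
    return results
```

Calling `crash_matrix(setup_code, ["s0", "s1", "s2"], "pd.concat([{args}], axis=1)")` would then give one return code per combination; a non-zero code means that child did not exit normally.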
Problem description
The interpreter just exits with a message in a message box (in Danish, but it translates to "An error caused Python to exit") and the program shuts down. There is no traceback or other information about the cause that I know how to get at. If someone tells me how to get it, I may be able to fetch it.
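One possible way to get at a traceback for a hard crash like this is the standard library's `faulthandler` module. The lines below are a minimal sketch, assuming the crash is a fatal error that `faulthandler` is able to intercept; the log file name is made up for illustration. They would go at the very top of the snippet, before `import pandas as pd`.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path would do.
log_path = os.path.join(tempfile.gettempdir(), "concat_crash_traceback.log")

# The file object must stay open for as long as faulthandler is enabled,
# because the handler writes to its file descriptor at crash time.
crash_log = open(log_path, "w")

# On a fatal error, dump the Python-level traceback of all threads to the log.
faulthandler.enable(file=crash_log, all_threads=True)
```

If the crash is caught, the log file should show which Python line was executing when the interpreter died. The traceback stops at the Python level, so it would not show where inside compiled code the fault happened.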
Expected Output
pd.concat should return a three-column DataFrame. Instead the interpreter crashes.
Output of pd.show_versions()
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 35.0.1
Cython: None
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 6.0.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: 2.4.5
xlrd: None
xlwt: None
xlsxwriter: None
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.999999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
boto: None
pandas_datareader: None