Description
Code Sample, a copy-pastable example if possible
import pandas as pd
df = pd.DataFrame({'a': ['x' * 250]})
df.to_stata('test')
Problem description
The last line above raises an exception:
ValueError:
Fixed width strings in Stata .dta files are limited to 244 (or fewer)
characters. Column 'a' does not satisfy this restriction.
but the write succeeds when the format version is passed explicitly:
df.to_stata('test', version=117)
This functionality (writing in dta format 117) was added in version 0.23. In my opinion, the Stata writer should automatically switch to version 117 if one of the columns is wider than 244 characters. At the least, the error message should be changed to note that, as of version 0.23, it's possible to write long strings to Stata files by adding version=117.
I'd be happy to submit a PR if this functionality is desired.
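For illustration, the automatic switch proposed above can be approximated on the caller side. This is only a rough sketch: `to_stata_with_fallback` is a hypothetical helper name, not part of pandas, and it assumes that a `ValueError` from the default writer means the 244-character limit was hit. If the error has another cause, the retry will most likely raise again.

```python
import os
import tempfile

import pandas as pd


def to_stata_with_fallback(df, path, **kwargs):
    """Hypothetical helper (not part of pandas).

    Write ``df`` with the default dta format and retry with
    ``version=117`` if the writer raises ValueError.
    Returns the version that was forced, or None if the default worked.
    """
    if "version" in kwargs:
        # The caller chose a format explicitly, so do not override it.
        df.to_stata(path, **kwargs)
        return kwargs["version"]
    try:
        df.to_stata(path, **kwargs)
    except ValueError:
        # Assumption: the failure is the fixed-width string limit.
        df.to_stata(path, version=117, **kwargs)
        return 117
    return None


# dtype=object is spelled out so the column is a plain object column.
df = pd.DataFrame({"a": pd.Series(["x" * 250], dtype=object)})
path = os.path.join(tempfile.mkdtemp(), "test.dta")
used_version = to_stata_with_fallback(df, path)

# Read the file back to confirm the long string survived.
roundtrip = pd.read_stata(path)
```

Doing the same inside the writer would avoid the failed first attempt, but it would also change the output format silently, which is one argument for only improving the error message instead.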
Expected Output
Stata file written to disk.
Output of pd.show_versions()
pd.show_versions()
No module named 'dask'
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.0.dev0+948.g82120016e
pytest: 3.10.0
pip: 18.1
setuptools: 40.5.0
Cython: 0.29
numpy: 1.15.4
scipy: 1.1.0
pyarrow: 0.11.1
xarray: 0.10.9
IPython: 7.1.1
sphinx: 1.8.1
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 3.0.1
openpyxl: 2.5.9
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.13
pymysql: 0.9.2
psycopg2: None
jinja2: 2.10
s3fs: 0.1.6
fastparquet: 0.1.6
pandas_gbq: None
pandas_datareader: None
gcsfs: 0.1.2
cc: @bashtage