df.to_stata should automatically write in format 117 with wide strings #23564

Closed
@kylebarron

Code Sample, a copy-pastable example if possible

import pandas as pd
df = pd.DataFrame({'a': ['x' * 250]})
df.to_stata('test')

Problem description

The last line above raises an exception:

ValueError:
Fixed width strings in Stata .dta files are limited to 244 (or fewer)
characters.  Column 'a' does not satisfy this restriction.

but the write succeeds when the format version is specified explicitly:

df.to_stata('test', version=117)

Support for writing dta format 117 was added in pandas 0.23. In my opinion, the Stata writer should switch to version 117 automatically when any column is wider than 244 characters. At the very least, the error message should note that, as of version 0.23, long strings can be written to Stata files by passing version=117.
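
Until something like this lands in pandas, the automatic switch can be approximated on the caller's side. The following is only a rough sketch, not the pandas implementation: choose_dta_version and MAX_FIXED_WIDTH are hypothetical names made up for illustration, the fallback of 114 is assumed to match the to_stata default, and the length is counted in characters, which may differ from how the writer measures encoded strings.

```python
# Hypothetical helper, not part of pandas: pick a dta format version
# based on the longest string found in any column.

MAX_FIXED_WIDTH = 244  # limit quoted in the ValueError above


def choose_dta_version(columns, default=114):
    """Return 117 if any string value is longer than MAX_FIXED_WIDTH.

    `columns` is anything with an .items() method yielding
    (name, iterable_of_values) pairs, e.g. a dict of lists or a DataFrame.
    `default` is an assumption about the writer's fallback version.
    """
    for _name, values in columns.items():
        for value in values:
            # Only str values count; numbers, None, etc. are ignored.
            if isinstance(value, str) and len(value) > MAX_FIXED_WIDTH:
                return 117
    return default


print(choose_dta_version({'a': ['x' * 250]}))  # too wide for fixed width
print(choose_dta_version({'a': ['x' * 244]}))  # still fits
```

With a DataFrame this could be used as df.to_stata('test', version=choose_dta_version(df)). Scanning every value in pure Python is slow on large frames, so a real implementation would more likely reuse the width the writer already computes for each column.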

I'd be happy to submit a PR if this functionality is desired.

Expected Output

Stata file written to disk.

Output of pd.show_versions()

pd.show_versions()
No module named 'dask'

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.0.dev0+948.g82120016e
pytest: 3.10.0
pip: 18.1
setuptools: 40.5.0
Cython: 0.29
numpy: 1.15.4
scipy: 1.1.0
pyarrow: 0.11.1
xarray: 0.10.9
IPython: 7.1.1
sphinx: 1.8.1
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 3.0.1
openpyxl: 2.5.9
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.13
pymysql: 0.9.2
psycopg2: None
jinja2: 2.10
s3fs: 0.1.6
fastparquet: 0.1.6
pandas_gbq: None
pandas_datareader: None
gcsfs: 0.1.2

cc: @bashtage
