ENH: Improve error message for repeated Stata categories #13949
@@ -397,6 +397,7 @@ Other enhancements
- ``Series.append`` now supports the ``ignore_index`` option (:issue:`13677`)
- ``.to_stata()`` and ``StataWriter`` can now write variable labels to Stata dta files using a dictionary to make column names to labels (:issue:`13535`, :issue:`13536`)
- ``.to_stata()`` and ``StataWriter`` will automatically convert ``datetime64[ns]`` columns to Stata format ``%tc``, rather than raising a ``ValueError`` (:issue:`12259`)
- ``read_stata()`` and ``StataReader`` return more explicit error message is given when reading Stata data files containing value labels are repeated when ``convert_categoricals=True`` (:issue:`13923`)
"raise with a more explicit error message when reading Stata files with repeated value labels when convert_categoricals=True".
aee7b63 to 72cc94c
Current coverage is 85.29% (diff: 100%)
@@             master   #13949   diff @@
==========================================
  Files           139      139
  Lines         50159    50169    +10
  Methods           0        0
  Messages          0        0
  Branches          0        0
==========================================
+ Hits          42786    42792     +6
- Misses         7373     7377     +4
  Partials          0        0
@@ -1649,6 +1649,13 @@ def _do_convert_categoricals(self, data, value_label_dict, lbllist,
        categories.append(value_label_dict[label][category])
    else:
        categories.append(category)  # Partially labeled
if len(categories) != len(set(categories)):
I think it's a bit more Pythonic to do:

    try:
        cat_data.categories = categories
    except ValueError:
        raise ValueError(...)  # show your new ValueError

as setting non-unique categories already raises.
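
The suggestion above can be illustrated with a small, pandas-free sketch. `CategoryHolder` is a hypothetical stand-in for an object whose `categories` setter rejects duplicates, and `set_categories_or_explain`, its `column` parameter, and the message wording are assumptions made for illustration; none of them are taken from the PR.

```python
class CategoryHolder:
    """Hypothetical stand-in: the setter rejects non-unique categories."""

    def __init__(self):
        self._categories = []

    @property
    def categories(self):
        return self._categories

    @categories.setter
    def categories(self, new_categories):
        new_categories = list(new_categories)
        if len(new_categories) != len(set(new_categories)):
            # Generic error, similar in spirit to what the reviewer refers to.
            raise ValueError("categories must be unique")
        self._categories = new_categories


def set_categories_or_explain(cat_data, categories, column):
    """Assign categories; turn the generic error into a more explicit one."""
    try:
        cat_data.categories = categories
    except ValueError as err:
        # Illustrative wording only, not the message used in the PR.
        raise ValueError(
            f"Value labels for column {column!r} are not unique: {categories}"
        ) from err
    return cat_data
```

The trade-off in this style is that the uniqueness rule lives in one place (the setter), and the caller only adds context to the failure instead of duplicating the check.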
72cc94c to 533a9a0
Improve the error message to be more explicit when attempting to read Stata files containing repeated categories. closes pandas-dev#13923
533a9a0 to 0880d02
thanks! great as usual!
git diff upstream/master | flake8 --diff
Improve the error message to be more explicit when attempting to read Stata
files containing repeated categories.
closes #13923
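
For comparison, the explicit length check from the first version of the diff (`len(categories) != len(set(categories))`) can be sketched on plain Python data. The function name `apply_value_labels`, its arguments, and the error text are hypothetical and chosen only for illustration; the real logic lives in `_do_convert_categoricals` and operates on pandas objects.

```python
from collections import Counter


def apply_value_labels(codes, value_labels):
    """Map codes to labels, keeping unlabeled codes as-is (partially labeled)."""
    categories = [value_labels.get(code, code) for code in codes]

    # Same test as in the diff: a set drops duplicates, so a shorter set
    # means at least one label is repeated.
    if len(categories) != len(set(categories)):
        counts = Counter(categories)
        repeats = sorted(str(label) for label, n in counts.items() if n > 1)
        # Illustrative wording only, not the message used in the PR.
        raise ValueError(
            "Value labels are not unique. Repeated labels: " + ", ".join(repeats)
        )
    return categories
```

Naming the repeated labels in the message is an extra step beyond the bare length check; it is shown here because it is the kind of detail that makes such an error more explicit.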