Skip to content

BUG: unsupported type Interval when writing dataframe to excel #19244

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 27, 2018

Conversation

cbertinato
Copy link
Contributor

@cbertinato cbertinato commented Jan 15, 2018

@codecov
Copy link

codecov bot commented Jan 15, 2018

Codecov Report

Merging #19244 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #19244   +/-   ##
=======================================
  Coverage   91.65%   91.65%           
=======================================
  Files         150      150           
  Lines       48724    48724           
=======================================
  Hits        44658    44658           
  Misses       4066     4066
Flag Coverage Δ
#multiple 90.02% <ø> (ø) ⬆️
#single 41.74% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5f6c80b...8859dc5. Read the comment docs.

@@ -478,3 +478,4 @@ Other
^^^^^

- Improved error message when attempting to use a Python keyword as an identifier in a ``numexpr`` backed query (:issue:`18221`)
- Bug in :func:`DataFrame.to_excel` where an exception was raised indicating unsupported type when writing a data frame containing a column of Interval type (:issue:`19242`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to IO. just say unsupported type

@@ -788,6 +789,8 @@ def _conv_value(val):
val = "{val}".format(val=val)
elif is_list_like(val):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this routine needs some re-factoring, something like

   # doc-string here
   if is_integer(val):
        val = int(val)
    elif is_float(val):
        val = float(val)
    elif is_bool(val):
        val = bool(val)
   elif isinstance(val, (datetime, timedelta)):
        pass
    else:
       # comment here
        val = str(val)

    return val

with some additional tests

Copy link
Contributor Author

@cbertinato cbertinato Jan 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This raises another issue. I take your meaning here since datetime and date are treated later on in write_cells, but timedelta is not explicitly dealt with. Turns out that writing timedelta is not supported when exporting .xls. Looks like this is an old issue (#9155).

I suggest making the same conversion that xlsxwriter makes in the appropriate write_cells.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbertinato yes this function actually could be moved to the top-level.

@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions IO Excel read_excel, to_excel labels Jan 15, 2018
@pep8speaks
Copy link

pep8speaks commented Jan 16, 2018

Hello @cbertinato! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on January 27, 2018 at 02:19 Hours UTC

@@ -418,6 +418,8 @@ I/O
- Bug in :func:`read_sas` where a file with 0 variables gave an ``AttributeError`` incorrectly. Now it gives an ``EmptyDataError`` (:issue:`18184`)
- Bug in :func:`DataFrame.to_latex()` where pairs of braces meant to serve as invisible placeholders were escaped (:issue:`18667`)
- Bug in :func:`read_json` where large numeric values were causing an ``OverflowError`` (:issue:`18842`)
- Type ``Interval`` now supported in :func:`DataFrame.to_excel` for all Excel file types (:issue:`19242`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use :class:`Interval` , and :class`Timdelta` (this is more user friendly as you almost always have a Timedelta)

@@ -21,7 +21,7 @@
from pandas.io.common import (_is_url, _urlopen, _validate_header_arg,
get_filepath_or_buffer, _NA_VALUES,
_stringify_path)
from pandas.core.indexes.period import Period
# from pandas.core.indexes.period import Period
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import no longer necessary because the elif on line 787 was folded into the else. I'll remove it instead of commenting.

@@ -1461,6 +1464,12 @@ def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0,
elif isinstance(cell.val, date):
num_format_str = self.date_format

if isinstance(cell.val, timedelta):
delta = cell.val
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just use delta.total_seconds()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be an elif

@@ -1440,6 +1440,62 @@ def test_excel_date_datetime_format(self):
# to use df_expected to check the result
tm.assert_frame_equal(rs2, df_expected)

# GH19242 - test writing Interval without labels
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments go inside the function

rand = np.random.randint(-10, 10, size=(20, 2))
frame = DataFrame(rand)
intervals = pd.cut(frame[0], 10)
frame['new'] = intervals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this a little less verbose
use expected for the comparison variable

""" Convert numpy types to Python types for the Excel writers.
:obj:`datetime` and :obj:`date` formatting must be handled in the
writer. :obj:`str` representation is returned for all other types.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write an expanded docstring with Parameters and Returns section if you're going to do this.

@jreback jreback added this to the 0.23.0 milestone Jan 24, 2018
@jreback
Copy link
Contributor

jreback commented Jan 24, 2018

small doc string comments, and pls rebase. ping on green.

@cbertinato cbertinato force-pushed the issue-19242 branch 2 times, most recently from 1464ec0 to 9b89a6e Compare January 26, 2018 01:26
@cbertinato
Copy link
Contributor Author

Looks like the AppVeyor failures are in the parquet tests. Are those coming from the rebase?

@jreback
Copy link
Contributor

jreback commented Jan 27, 2018

can you rebase.

@jreback
Copy link
Contributor

jreback commented Jan 27, 2018

can you rebase. and ping on green.

@cbertinato
Copy link
Contributor Author

All green!

@jreback jreback merged commit 24d9509 into pandas-dev:master Jan 27, 2018
@jreback
Copy link
Contributor

jreback commented Jan 27, 2018

thanks @cbertinato ! keep em coming!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions IO Excel read_excel, to_excel
Projects
None yet
4 participants