-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Fix index for datetime64 conversion. Fixes #13937 #14446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #14446 +/- ##
==========================================
- Coverage 91.06% 91.04% -0.03%
==========================================
Files 136 136
Lines 49104 49104
==========================================
- Hits 44718 44705 -13
- Misses 4386 4399 +13
Continue to review full report at Codecov.
|
.to_records() | ||
expected = np.rec.array([('x', 'y')], dtype=[('a', 'O'), ('b', 'O')]) | ||
tm.assert_almost_equal(result, expected) | ||
|
||
def test_to_records_with_tz_gmt(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment for the issue number
df = DataFrame({'datetime': data}) | ||
df.set_index('datetime', inplace=True) | ||
dfgmt = df.tz_convert("GMT") | ||
tm.assertIsNot(df.to_records()['datetime'][0].tzinfo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
construct the expected frame and use assert_frrame_equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also simplify the construction of the test data using pandas constructs?
Eg
data = [datetime.datetime.now(pytz.timezone('UTC')) for _ in range(10)]
df = DataFrame({'datetime': data})
df.set_index('datetime', inplace=True)
can be done with (for similar object, not exact same values of course)
df = DataFrame(index=date_range('2016-01-01', periods=10, freq='S', tz='UTC'))
pls add a whatsnew entry for 0.19.1 |
def test_to_records_with_tz_gmt(self): | ||
""" | ||
This test verifies the bug id #13937 | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add this as a comment instead of docstring? (docstrings hide the test name in the output of nosetests)
Also just the number is fine, like # GH 13937
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'll take care of this. :)
df3 = DataFrame({'datetime': dt3}) | ||
df4 = df3.set_index('datetime') | ||
|
||
tm.assert_frame_equal(df2, df4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test related to the bug you fixed?
It does not use to_records
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche Hmm. yes it is not related to my bug fix. I just focused on data frame equality. Actually there is also one problem assert_frame_equal does not accept any other frames expect DataFrame. In my previous PR 14044 Jeff suggest to construct equal data frame. Can you suggest me how i complete Jeff's assert_frame_equal requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche @jreback Can you please suggest me how i construct equal data frame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your question is not fully clear to me.
If you want to test to_records
, you can also assert that the resulting array is equal (with assert_numpy_array_equal
instead of assert_frame_equal
)
df4 = DataFrame({'datetime': dt2}) | ||
df5 = df4.set_index('datetime') | ||
|
||
tm.assert_frame_equal(df3, df5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question for this test.
Was the above code failing before your PR? Or is this not yet tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this code does not falling before my PR or after the fixes.
df = DataFrame({'datetime': data}) | ||
df.set_index('datetime', inplace=True) | ||
dfgmt = df.tz_convert("GMT") | ||
tm.assertIsNot(df.to_records()['datetime'][0].tzinfo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also simplify the construction of the test data using pandas constructs?
Eg
data = [datetime.datetime.now(pytz.timezone('UTC')) for _ in range(10)]
df = DataFrame({'datetime': data})
df.set_index('datetime', inplace=True)
can be done with (for similar object, not exact same values of course)
df = DataFrame(index=date_range('2016-01-01', periods=10, freq='S', tz='UTC'))
@amolkahat It looks like something went wrong with rebasing/updating. Normally, doing exactly the following should clean it up:
(supposing your fork is 'origin' and pandas-dev is 'upstream') |
@amolkahat Can you rebase? (see my previous comment how to fix it) |
@jorisvandenbossche Please take a look. |
can you add a whatsnew note for 0.20.0 |
freq='S', | ||
tz='UTC')) | ||
|
||
dfgmt = df.tz_convert("GMT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use result = ....
can you construct a rec-array directly and use tm.assert_numpy_array_equal
def test_from_UTC_to_GMT_as_equal_frame(self): | ||
# GH13937 | ||
|
||
dt1 = [datetime.datetime(2011, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this last test is not necessary
df1 = DataFrame(index=dt1) | ||
df3 = df1.tz_convert("UTC") | ||
|
||
df4 = DataFrame(index=dt2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however the empty frame is good (but instead of a separate tests, you can do this as part of the other one, IOW)
just use a starting from from the first test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try to consolidate these tests into a single test (but with multiple parts is ok), its just confusing to start with 3 different frames, start with 1, then do a
result = ....
expected = ....
tm.assert_numpy_array_equal(....)
# next part
....
# last part
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if you updated this. the tests needs to be more concise and focused.
doc/source/whatsnew/v0.20.0.txt
Outdated
@@ -355,3 +355,7 @@ Bug Fixes | |||
|
|||
|
|||
- Bug in ``pd.read_csv()`` for the C engine where ``usecols`` were being indexed incorrectly with ``parse_dates`` (:issue:`14792`) | |||
======= | |||
- Bug in ``DataFrame.to_records()`` with converting datetime64 index with timezone (:issue: `13937`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so looks like some merge conflicts are remaining here
8, | ||
15, | ||
8, | ||
15, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really hard to read. please format better.
freq='S', | ||
tz='GMT')) | ||
|
||
tm.assert_almost_equal(df_utc.to_records()['datetime'][0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this comparison add that the assert_numpy_array_equal
does not? (which is the way we would like to compare these)
index=data2) | ||
|
||
df_gmt2 = DataFrame(date_range('2016-01-01', | ||
periods=10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this test actually testing? (which the first test does not)
can you update |
@jreback Please take a look. |
doc/source/whatsnew/v0.20.0.txt
Outdated
@@ -589,3 +589,5 @@ Bug Fixes | |||
- Bug in ``Series.replace`` and ``DataFrame.replace`` which failed on empty replacement dicts (:issue:`15289`) | |||
- Bug in ``pd.melt()`` where passing a tuple value for ``value_vars`` caused a ``TypeError`` (:issue:`15348`) | |||
- Bug in ``.eval()`` which caused multiline evals to fail with local variables not on the first line (:issue:`15342`) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move into a blank space above so that you don't get conflicts.
with converting a datetime64 index
from pandas.tests.frame.common import TestData | ||
|
||
|
||
class TestDataFrameConvertTo(tm.TestCase, TestData): | ||
_multiprocess_can_split_ = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove as we are using pytest now
periods=10, | ||
freq='S', | ||
tz='UTC')) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do something simpler and much more readable
result = df_utc.tz_convert("GMT").to_records()
expected = df_utc.to_records()
tm.assert_numpy_array_equal(result, expected)
|
||
def test_to_records_with_tz(self): | ||
# GH13937 | ||
df_utc = DataFrame({'datetime': date_range('2016-01-01', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you format this to make it easier to read
|
||
# test DataFrame.to_records() with timezone conversion to UTC | ||
|
||
df_gmt = DataFrame({'datetime': date_range('2016-01-01', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same with all of the following, make shorter / simpler / easier-to-read
@amolkahat thanks it looks a bit better. In general, please don't modify whitespace. You must have some editor / settings which are doing things. They are pretty distracting. |
you'll want to rebase on master as travis was having some issues |
Fix index for datetime64 conversion This PR fixes index for datetime64 conversion. Also, update tests - pandas/pandas/tests/tests_convert_to.py Fixes #13937
thanks! |
closes pandas-dev#13937 Author: Amol Kahat <[email protected]> Closes pandas-dev#14446 from amolkahat/bug_fixes and squashes the following commits: 3806983 [Amol Kahat] Modify test cases.
@jreback Thanks. |
@amolkahat for chat, there is the gitter chat (https://gitter.im/pydata/pandas), there is also the pandas-dev mailing list (https://mail.python.org/mailman/listinfo/pandas-dev), but most of the discussion happens on github |
@jorisvandenbossche Thank you. 😃 |
closes pandas-dev#13937 Author: Amol Kahat <[email protected]> Closes pandas-dev#14446 from amolkahat/bug_fixes and squashes the following commits: 3806983 [Amol Kahat] Modify test cases.
git diff upstream/master | flake8 --diff
Fix index for datetime64 conversion
This PR fixes index for datetime64 conversion. Also,
update tests - pandas/pandas/tests/tests_convert_to.py
Fixes #13937