-
Notifications
You must be signed in to change notification settings - Fork 1.1k
add test for hour_angle, vectorize #599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 2 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
ace0a17
add test for hour_angle, vectorize
mikofski 433175d
BUG: use np.int64 works better for older numpy version than python int
mikofski 2908875
fix hanging indent
mikofski 7369f3f
BUG: replace utcoffset() with reliable, efficient approach ...
mikofski 273106c
BUG: combine arithmetic to make calculation more efficient
mikofski d89e725
stickler
mikofski 9622661
remove try-except, doesn't work anyway, wait for pandas >=0.15.0
mikofski File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1327,8 +1327,8 @@ def hour_angle(times, longitude, equation_of_time): | |
equation_of_time_Spencer71 | ||
equation_of_time_pvcdrom | ||
""" | ||
hours = np.array([(t - t.tz.localize( | ||
dt.datetime(t.year, t.month, t.day) | ||
)).total_seconds() / 3600. for t in times]) | ||
timezone = times.tz.utcoffset(times).total_seconds() / 3600. | ||
tz_info = times.tz | ||
timezone = tz_info.utcoffset(times).total_seconds() / 3600. | ||
hours = (times.astype(np.int64) - times.normalize().astype(np.int64)) / ( | ||
3600. * 1.e9) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. E126 continuation line over-indented for hanging indent |
||
return 15. * (hours - 12. - timezone) + longitude + equation_of_time / 4. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the expected behavior?
I am surprised because 1. I expected -7 and 2.
times
is a vector so I expected to get a result for each time (offset could change for DST).Is this preferable?
The
np.array
wrapper ensures a known type instead of a version-dependent Index/array type.Note that
tz_localize(None)
will always work if we bump min pandas to 0.15, as discussed elsewhere.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's not what I would expect, I took that approach from this SO answer.
TL;DR: let's just use your approach instead
details
Something very weird is going on with this particular
America/Phoenix
timezone. PyTz is picking a historical timezone called LMT that was discontinued in 1883 because it is listed first in the transition info list.The source for
DstTzInfo
says that_utcoffset
is returned forutcoffset(dt)
if localized already, otherwise localize first then return the_utcoffset
of the timezone.And if not localized, then
_utcoffset
is set to whatever is the first tz info in the transition info table. In this case that's "LMT" which was UTC -7:28:18 until 1883.If the datetime is localized, then PyTz does some smart stuff to figure out which transition info to use, based on the values in
_utc_transition_times
and returns the corresponding transition info and timezone info, and replacestzinfo
in the datetime.Since our datetime is 2018-10-05T12:58:34, that's the last index, and we get the the most recent Phoenix timezone from the transition info table.
which is the key using in the timezone info table and returns the correct timezone
When pandas makes a datetime index, I guess it doesn't care about
tzinfo
but it does do this withTimestamps
which are the items in the datetime index.DatetimeIndex
has wrong timezoneTimestamp
has correct timezoneAlso if
to_pydatetime()
is called it uses the correct timezone as well.Anyway, not sure if this is a bug in pandas or not, but who cares? Now I have to go around and fix this everywhere I've used☹️
utcoffset()
, bummer.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: it's only used 2 places:
So
utcoffset()
is also a Python builtin which is overloaded by PyTz. The difference is that ifutcoffset
is called on a Python datetime then it will return the utcoffset of it's tz info called on itself. So in our example above we could have called:Also FYI: there is an inefficient loop in
rise_set_transit_ephem
which uses slowto_pydatetime()
andutcoffset()
, but it's probably addressed in #588?Other than that I think there's only 2 other instances of
utcoffset
this one in hour_angle and one in #583. I will fix both. Thanks for catching this!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: this is also an example of why we should never replace tzinfo manually, but always use the
localize()
method so that the correct timezone from the transition info table is used.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into it. I've seen this kind of funny behavior with the AZ timezones before but it didn't occur to me that it could be a problem in this function. Some other common timezones have similar behavior (though I don't remember which).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I don't think it's a problem with the function
utcoffset()
. Honestly, I think this was my mistake for usingutcoffset()
with a sequence of times likepandas.DatetimeIndex
, because there could potentially be a different timezone for each item in the sequence, so it's ambiguous which timezone should be used. Therefore, I think the problem is that thetz
(akatzinfo
) that's set in theDatetimeIndex
is unreliable and ambiguous, or the problem could be withutcoffset
which should be overloaded to return a list of timezones localized to each item in the sequence.