Skip to content

Calendar utilities #5155

Closed
Closed
@aulemahal

Description

@aulemahal

Is your feature request related to a problem? Please describe.
Handling cftime and numpy time coordinates can sometimes be exhausting. Here I am thinking of the following common problems:

  1. Querying the calendar type from a time coordinate.
  2. Converting a dataset from a calendar type to another.
  3. Generating a time coordinate in the correct calendar.

Describe the solution you'd like

  1. ds.time.dt.calendar would be magic.
  2. xr.convert_calendar(ds, "new_cal") could be nice?
  3. xr.date_range(start, stop, calendar=cal), same as pandas' (see context below).

Describe alternatives you've considered
We have implemented all this in (xclim)[https://xclim.readthedocs.io/en/stable/api.html#calendar-handling-utilities] (and more). But it seems to make sense that some of the simplest things there could move to xarray? We had this discussion in xarray-contrib/cf-xarray#193 and suggestion was made to see what fits here before implementing this there.

Additional context
At xclim, to differentiate numpy datetime64 from cftime types, we call the former "default". This way a time coordinate using cftime's "proleptic_gregorian" calendar is distinct from one using numpy's datetime64.

  1. is easy (xclim function). If the datatype is numpy return "default", if cftime, look into the first non-null value and get the calendar.
  2. xclim function The calendar type of each time element is transformed to the new calendar. Our way is to drop any dates that do not exist in the new calendar (like Feb 29th when going to "noleap"). In the other direction, there is an option to either fill with some fill value of simply not include them. It can't be a DataArray method, but could be a Dataset one, or simply a top-level function. Related to Converting cftime.datetime objects to np.datetime64 values through astype #5107.

We also have an interp_calendar function that reinterps data on a yearly basis. This is a bit narrower, because it only makes sense on daily data (or coarser).

  1. With the definition of a "default" calendar, date_range and date_range_like simply chose between pd.date_range and xr.cftime_range according to the target calendar.

What do you think? I have time to move whatever code makes sense to move.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions