BUG: Timestamp 'fold' argument ignored when tz is provided as string/name #55932

Open
@kohlerjl

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import dateutil.tz
import zoneinfo
    
utc0 = pd.Timestamp('2023-11-05T08:30:00Z')
utc1 = pd.Timestamp('2023-11-05T09:30:00Z')

tz = dateutil.tz.gettz('US/Pacific')
assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=0, tz=tz) == utc0
assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=1, tz=tz) == utc1

tz = zoneinfo.ZoneInfo('US/Pacific')
assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=0, tz=tz) == utc0
assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=1, tz=tz) == utc1

tz = 'US/Pacific'
assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=0, tz=tz) == utc0
assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=1, tz=tz) == utc1

Issue Description

The fold argument to the Timestamp constructor appears to be ignored when tz is provided as a string, but works as expected for the corresponding dateutil.tz or zoneinfo objects.
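For reference, the fold semantics the asserts rely on can be shown with the standard library alone. This is a minimal sketch and not pandas code: it uses `America/Los_Angeles`, the canonical IANA name for the zone that `US/Pacific` aliases, and the variable names are purely illustrative.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Canonical IANA name for the zone that 'US/Pacific' aliases.
tz = ZoneInfo("America/Los_Angeles")

# 01:30 local time on 2023-11-05 occurs twice: first in PDT (UTC-7),
# then again in PST (UTC-8) after clocks fall back at 02:00.
first = datetime(2023, 11, 5, 1, 30, fold=0, tzinfo=tz)
second = datetime(2023, 11, 5, 1, 30, fold=1, tzinfo=tz)

# fold=0 selects the earlier instant, fold=1 the later one.
first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)

print(first.utcoffset(), first_utc.isoformat())
print(second.utcoffset(), second_utc.isoformat())
```

The two UTC instants correspond to utc0 and utc1 in the reproducible example, which is the result expected from the string form of tz as well.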

On the current development branch, the last two asserts raise an AmbiguousTimeError:

---------------------------------------------------------------------------
AmbiguousTimeError                        Traceback (most recent call last)
Cell In[1], line 17
     14 assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=1, tz=tz) == utc1
     16 tz = 'US/Pacific'
---> 17 assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=0, tz=tz) == utc0
     18 assert pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=1, tz=tz) == utc1

File timestamps.pyx:1882, in pandas._libs.tslibs.timestamps.Timestamp.__new__()

File conversion.pyx:328, in pandas._libs.tslibs.conversion.convert_to_tsobject()

File conversion.pyx:399, in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject()

File conversion.pyx:658, in pandas._libs.tslibs.conversion._localize_pydatetime()

File ~/venv/lib/python3.11/site-packages/pytz/tzinfo.py:366, in DstTzInfo.localize(self, dt, is_dst)
360 # If we get this far, we have multiple possible timezones - this
361 # is an ambiguous case occurring during the end-of-DST transition.
362
363 # If told to be strict, raise an exception since we have an
364 # ambiguous case
365 if is_dst is None:
--> 366 raise AmbiguousTimeError(dt)
368 # Filter out the possiblilities that don't match the requested
369 # is_dst
370 filtered_possible_loc_dt = [
371 p for p in possible_loc_dt if bool(p.tzinfo._dst) == is_dst
372 ]

AmbiguousTimeError: 2023-11-05 01:30:00

This behavior is at least better than the current release (2.1.2), which fails with an AssertionError because
pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=0, tz=tz) returns the incorrect timestamp Timestamp('2023-11-05 01:30:00-0800', tz='US/Pacific') rather than the expected 01:30:00-0700.

Expected Behavior

I would expect ambiguous timestamps with 'fold' provided to be interpreted the same way when the timezone is given as a string (e.g. tz='US/Pacific') as when using the equivalent zoneinfo or dateutil.tz timezone. I noticed that the 'fold' argument is not permitted when using a pytz timezone, but at least in that case a descriptive error is provided.
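Until the string path honors 'fold', one possible workaround is to resolve the name to a zoneinfo object before constructing the Timestamp, since the zoneinfo asserts in the reproducible example pass. The helper below is a hypothetical sketch (`to_tzinfo` is not a pandas function) and only performs the name resolution; the pandas call is shown as a comment.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo


def to_tzinfo(tz):
    """Hypothetical helper: resolve a timezone name to a ZoneInfo object.

    Non-string inputs (e.g. existing tzinfo objects) are returned unchanged.
    """
    if isinstance(tz, str):
        return ZoneInfo(tz)
    return tz


tz = to_tzinfo("America/Los_Angeles")

# Intended use with pandas (per the report, zoneinfo objects respect fold):
# pd.Timestamp(year=2023, month=11, day=5, hour=1, minute=30, fold=1, tz=tz)

# Standard-library check that the resolved object distinguishes the two folds.
later = datetime(2023, 11, 5, 1, 30, fold=1, tzinfo=tz)
print(later.utcoffset())
```

This sidesteps the string handling rather than fixing it, so it is only a stopgap for affected versions.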

Installed Versions

INSTALLED VERSIONS

commit : b2d9ec1
python : 3.11.5.final.0
python-bits : 64
OS : Linux
OS-release : 6.6.1-arch1-1
Version : #1 SMP PREEMPT_DYNAMIC Wed, 08 Nov 2023 16:05:38 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.utf8
LOCALE : en_US.UTF-8

pandas : 2.2.0.dev0+564.gb2d9ec17c5
numpy : 1.26.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 65.5.0
pip : 23.2.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.17.2
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.1
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.11.3
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

Labels

  • Bug

  • Needs Triage: Issue that has not been reviewed by a pandas team member

  • Timestamp: pd.Timestamp and associated methods
