Skip to content

fix: cleaning up tests for release #4977

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 7, 2025
Merged

fix: cleaning up tests for release #4977

merged 8 commits into from
Feb 7, 2025

Conversation

gvwilson
Copy link
Contributor

@gvwilson gvwilson commented Jan 22, 2025

  1. Modify tests in plotly/tests/test_optional/test_px/test_px_*.py
    to expect failure when constructing datetime objects with timezones
    using PyArrow (which appears to need a different format for its
    constructor - we can clean this up later).
  2. Convert plain string to raw string in test_enumerated_validator.py
    to avoid complaint about unrecognized escape sequence \d.
  3. Do not delete the "template" key from JSON in JSON I/O test.
  4. Convert from scattermapbox to scattermap in test_skipped_b64_keys.py.
  5. Remove orca-related tests.
  6. Optionally import vaex (not currently available for Python 3.11) and
    skip some tests if it is not available.
  7. Modify test_requirements/requirements_311_optional.txt to include all
    the extra packages needed for testing with Python 3.11.
  8. Add plotly-geo to test_requirements/requirements_312_optional.txt.
  9. Add r prefix to regular expression strings to prevent complaints
    about \( and \..

@gvwilson gvwilson added P1 needed for current cycle fix fixes something broken testing automated tests labels Jan 22, 2025
@gvwilson gvwilson self-assigned this Jan 22, 2025
@gvwilson gvwilson requested a review from LiamConnors January 22, 2025 18:26
## orca dependencies ##
requests
psutil
polars
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't exactly a problem with your PR, but I think it would be helpful to clarify the difference between this file (packages/python/plotly/requires-optional.txt) and the files in packages/python/plotly/test_requirements. I think it could be good to consolidate them or recommend that developers use the test requirements file that corresponds to their python version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thank you - I've moved content from requires-optional.txt to test_requirements/requirements_311_optional.txt. I'll see what I need to do with Python 3.12 as well.

@gvwilson gvwilson force-pushed the updates-to-tests branch 2 times, most recently from edabade to affbbe2 Compare January 23, 2025 14:27
# FIXME: PyArrow requires a different format for datetime constructors with timezones
from .conftest import pyarrow_table_constructor
if constructor is pyarrow_table_constructor:
request.applymarker(pytest.mark.xfail)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gvwilson Instead of marking this as xfail for pyarrow, could we just pass the correct date format to the constructor? Or is that difficult?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've spent half an hour on this without luck, and have reached out to folks for help - if you're OK merging this with the FIXME I can patch it if/when I hear back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi both!

I tried running this locally, and it passes for me without the xfail

Is there a CI run I could look at the logs of where it failed?

(side note - if you use xfail, I'd suggest making it strict, so that if an xfailed test unexpectedly passes, you get a test failure: you can do this by putting xfail_strict = true in the pytest section of pyproject.toml)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MarcoGorelli - the error I get is pasted below, and partial output of uv pip list is below that - is this a version issue?

$ pytest -k test_date_in_hover plotly/tests/test_optional/test_px/test_px_hover.py
============================= test session starts ==============================
platform darwin -- Python 3.12.5, pytest-7.4.4, pluggy-1.5.0
rootdir: /Users/gvwilson/plotly.py/packages/python/plotly
configfile: pytest.ini
plugins: typeguard-4.4.1, anyio-4.8.0
collected 22 items / 17 deselected / 5 selected

plotly/tests/test_optional/test_px/test_px_hover.py .F...                [100%]

=================================== FAILURES ===================================
________________ test_date_in_hover[pyarrow_table_constructor] _________________

request = <FixtureRequest for <Function test_date_in_hover[pyarrow_table_constructor]>>
constructor = <function pyarrow_table_constructor at 0x106f69260>

    def test_date_in_hover(request, constructor):
        # FIXME: PyArrow requires a different format for datetime constructors with timezones
        # from .conftest import pyarrow_table_constructor
        # if constructor is pyarrow_table_constructor:
        #     request.applymarker(pytest.mark.xfail)
    
        df = nw.from_native(
            constructor({"date": ["2015-04-04 19:31:30+01:00"], "value": [3]})
>       ).with_columns(date=nw.col("date").str.to_datetime(format="%Y-%m-%d %H:%M:%S%z"))

plotly/tests/test_optional/test_px/test_px_hover.py:195: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.venv/lib/python3.12/site-packages/narwhals/dataframe.py:1802: in with_columns
    return super().with_columns(*exprs, **named_exprs)
../../../.venv/lib/python3.12/site-packages/narwhals/dataframe.py:138: in with_columns
    self._compliant_frame.with_columns(*compliant_exprs, **compliant_named_exprs),
../../../.venv/lib/python3.12/site-packages/narwhals/_arrow/dataframe.py:311: in with_columns
    new_columns: list[ArrowSeries] = evaluate_into_exprs(self, *exprs, **named_exprs)
../../../.venv/lib/python3.12/site-packages/narwhals/_expression_parsing.py:76: in evaluate_into_exprs
    evaluated_expr = evaluate_into_expr(df, expr)
../../../.venv/lib/python3.12/site-packages/narwhals/_expression_parsing.py:56: in evaluate_into_expr
    result = expr(df)
../../../.venv/lib/python3.12/site-packages/narwhals/_arrow/expr.py:66: in __call__
    return self._call(df)
../../../.venv/lib/python3.12/site-packages/narwhals/_expression_parsing.py:251: in <lambda>
    getattr(getattr(series, series_namespace), attr)(**kwargs)
../../../.venv/lib/python3.12/site-packages/narwhals/_arrow/series_str.py:80: in to_datetime
    pc.strptime(self._compliant_series._native_series, format=format, unit="us")
../../../.venv/lib/python3.12/site-packages/pyarrow/compute.py:264: in wrapper
    return func.call(args, options, memory_pool)
pyarrow/_compute.pyx:393: in pyarrow._compute.Function.call
    ???
pyarrow/error.pxi:155: in pyarrow.lib.pyarrow_internal_check_status
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   pyarrow.lib.ArrowInvalid: Failed to parse string: '2015-04-04 19:31:30+01:00' as a scalar of type timestamp[us]

pyarrow/error.pxi:92: ArrowInvalid
=============================== warnings summary ===============================
plotly/tests/test_optional/test_px/test_px_hover.py::test_date_in_hover[polars_eager_constructor]
plotly/tests/test_optional/test_px/test_px_hover.py::test_date_in_hover[pandas_constructor]
plotly/tests/test_optional/test_px/test_px_hover.py::test_date_in_hover[pandas_nullable_constructor]
plotly/tests/test_optional/test_px/test_px_hover.py::test_date_in_hover[pandas_pyarrow_constructor]
  /Users/gvwilson/plotly.py/packages/python/plotly/_plotly_utils/basevalidators.py:2596: DeprecationWarning:
  
  *scattermapbox* is deprecated! Use *scattermap* instead. Learn more at: https://plotly.com/python/mapbox-to-maplibre/

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED plotly/tests/test_optional/test_px/test_px_hover.py::test_date_in_hover[pyarrow_table_constructor]
============ 1 failed, 4 passed, 17 deselected, 4 warnings in 0.32s ============

environment:

arrow                     1.3.0
geopandas                 1.0.1
ipykernel                 6.29.5
ipython                   8.32.0
ipywidgets                8.1.5
jupyter                   1.1.1
jupyter-client            8.6.3
jupyter-console           6.6.3
jupyter-core              5.7.2
jupyter-events            0.12.0
jupyter-lsp               2.2.5
jupyter-server            2.15.0
jupyter-server-terminals  0.5.3
jupyterlab                4.3.5
jupyterlab-pygments       0.3.0
jupyterlab-server         2.27.3
jupyterlab-widgets        3.0.13
kaleido                   0.2.1
mock                      2.0.0
more-itertools            10.6.0
mypy-extensions           1.0.0
narwhals                  1.25.0
nbclient                  0.10.2
nbconvert                 7.16.6
nbformat                  5.10.4
numpy                     2.2.2
pandas                    2.2.3
pillow                    11.1.0
plotly-geo                1.0.0
polars                    1.21.0
pyarrow                   19.0.0
python-dateutil           2.9.0.post0
pytz                      2025.1
pyyaml                    6.0.2
pyzmq                     26.2.1
scikit-image              0.25.1
scipy                     1.15.1
setuptools                75.8.0
statsmodels               0.14.4
types-python-dateutil     2.9.0.20241206
typing-extensions         4.12.2
tzdata                    2025.1
xarray                    2025.1.2

Copy link
Contributor

@MarcoGorelli MarcoGorelli Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks - 🤔 how confusing, I just tried this in a Kaggle notebook with the same versions of PyArrow and Narwhals that you have (1.19, 1.25) and it runs: https://www.kaggle.com/code/marcogorelli/pyarrow-parsing-repro?scriptVersionId=220968522

all I think to suggest is to check

import pyarrow
import pytest
import narwhals
print(pyarrow.__file__)
print(pytest.__file__)
print(narwhals.__file__)

and check that they're all from the same venv?

Does this test pass for your on the master branch?

Copy link
Contributor

@MarcoGorelli MarcoGorelli Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking Narwhals out of the mix, does the following run for you?

import pyarrow as pa
import pyarrow.compute as pc

tbl = pa.table({"date": ["2015-04-04 19:31:30+01:00"]})
print(pc.strptime(tbl['date'], "%Y-%m-%d %H:%M:%S%z", unit='us'))

If not, I guess it might be an issue with PyArrow on Mac? If so, it may be worth filing an issue to PyArrow about this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think possibly the answer here is that the UTC offset (%z) should NOT contain a colon according to Python and Pyarrow (C/C++) strptime format guidelines.

https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
https://www.ibm.com/docs/en/i/7.3?topic=functions-strptime-convert-string-datetime

Apparently Python and the other libraries just accept a colon anyway, but Pyarrow refuses. The test passes for me locally if we remove the colon from the time string, so that's my suggestion. It's weird that it would be platform-dependent, though.

However I don't know what guarantees Plotly makes about what datetime string formats we support -- need to check that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be damned - thank you, that does fix it. Pushing now...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's wild, thanks

I just checked with a colleague who uses Mac, and he also gets this error

seems like PyArrow's datetime parsing really is platform-dependent 🤯

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find @emilykl! Oh datetimes, the gift that keeps on giving. Frustrating that ISO-8601 allows either with or without the colon and doesn't seem to take any position on which is preferred. FWIW plotly.js tries to be as flexible as possible, and allows timezone specifiers with or without the colon (and ignores them entirely, but that's another issue entirely) but that doesn't necessarily mean we need to allow them from Python. Heck, even stdlib Python doesn't claim to be consistent... in the reference you link it states:

The full set of format codes supported varies across platforms, because Python calls the platform C library’s strftime() function, and platform variations are common. To see the full set of format codes supported on your platform, consult the strftime(3) documentation. There are also differences between platforms in handling of unsupported format specifiers.

Still they list a new directive specifically for timezones with colons:

Added in version 3.12: %:z was added.

Which oddly (on my mac at least) only works for strftime, it's disallowed for strptime, but %z in strptime accepts optional colons

After experimentation, leaving it as `xfail` for now; will file an
issue and resurrect after merging the rest of this work.
Copy link
Contributor

@emilykl emilykl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gvwilson Looks good to me! 🚀

@gvwilson gvwilson merged commit 19b611d into master Feb 7, 2025
5 checks passed
@gvwilson gvwilson deleted the updates-to-tests branch February 13, 2025 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix fixes something broken P1 needed for current cycle testing automated tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants