Commit 485dcfc

Merge branch 'pandas-dev:main' into raise-on-parse-int-overflow
2 parents 2c16f74 + 64e7859 commit 485dcfc

31 files changed: +120 -38 lines changed

.circleci/config.yml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ jobs:
   test-arm:
     machine:
       image: ubuntu-2004:202101-01
-    resource_class: arm.medium
+    resource_class: arm.large
     environment:
       ENV_FILE: ci/deps/circle-38-arm64.yaml
       PYTEST_WORKERS: auto

ci/deps/actions-310.yaml

Lines changed: 2 additions & 1 deletion
@@ -19,6 +19,7 @@ dependencies:
   - pytz
 
   # optional dependencies
+  - aiobotocore<2.0.0
   - beautifulsoup4
   - blosc
   - bottleneck
@@ -43,7 +44,7 @@ dependencies:
   - pyreadstat
   - python-snappy
   - pyxlsb
-  - s3fs
+  - s3fs>=2021.05.0
   - scipy
   - sqlalchemy
   - tabulate

ci/deps/actions-38-downstream_compat.yaml

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ dependencies:
   - pytables
   - python-snappy
   - pyxlsb
-  - s3fs
+  - s3fs>=2021.05.0
   - scipy
   - sqlalchemy
   - tabulate

ci/deps/actions-38.yaml

Lines changed: 2 additions & 1 deletion
@@ -19,6 +19,7 @@ dependencies:
   - pytz
 
   # optional dependencies
+  - aiobotocore<2.0.0
   - beautifulsoup4
   - blosc
   - bottleneck
@@ -43,7 +44,7 @@ dependencies:
   - pytables
   - python-snappy
   - pyxlsb
-  - s3fs
+  - s3fs>=2021.05.0
   - scipy
   - sqlalchemy
   - tabulate

ci/deps/actions-39.yaml

Lines changed: 2 additions & 1 deletion
@@ -19,6 +19,7 @@ dependencies:
   - pytz
 
   # optional dependencies
+  - aiobotocore<2.0.0
   - beautifulsoup4
   - blosc
   - bottleneck
@@ -43,7 +44,7 @@ dependencies:
   - pytables
   - python-snappy
   - pyxlsb
-  - s3fs
+  - s3fs>=2021.05.0
   - scipy
   - sqlalchemy
   - tabulate

ci/deps/circle-38-arm64.yaml

Lines changed: 2 additions & 1 deletion
@@ -19,6 +19,7 @@ dependencies:
   - pytz
 
   # optional dependencies
+  - aiobotocore<2.0.0
   - beautifulsoup4
   - blosc
   - bottleneck
@@ -44,7 +45,7 @@ dependencies:
   - pytables
   - python-snappy
   - pyxlsb
-  - s3fs
+  - s3fs>=2021.05.0
   - scipy
   - sqlalchemy
   - tabulate

doc/source/development/contributing_codebase.rst

Lines changed: 5 additions & 1 deletion
@@ -265,7 +265,11 @@ pandas uses `mypy <http://mypy-lang.org>`_ and `pyright <https://github.com/micr
 
 .. code-block:: shell
 
-    pre-commit run --hook-stage manual --all-files
+    # the following might fail if the installed pandas version does not correspond to your local git version
+    pre-commit run --hook-stage manual --all-files
+
+    # if the above fails due to stubtest
+    SKIP=stubtest pre-commit run --hook-stage manual --all-files
 
 in your activated python environment. A recent version of ``numpy`` (>=1.22.0) is required for type validation.
 

doc/source/getting_started/intro_tutorials/05_add_columns.rst

Lines changed: 2 additions & 2 deletions
@@ -30,8 +30,8 @@
         </ul>
     </div>
 
-How to create new columns derived from existing columns?
---------------------------------------------------------
+How to create new columns derived from existing columns
+-------------------------------------------------------
 
 .. image:: ../../_static/schemas/05_newcolumn_1.svg
    :align: center
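
The heading changes in these tutorial files shorten each underline together with its title. A minimal sketch of why the two lines move in step, assuming the usual reStructuredText rule that a section underline is a single repeated punctuation character at least as long as the title (docutils, as far as I know, reports "Title underline too short" otherwise). The helper `underline_ok` is hypothetical and is not part of pandas or docutils.

```python
def underline_ok(title: str, underline: str) -> bool:
    """Return True if `underline` is one repeated punctuation character
    that is at least as long as `title` (simplified RST heading rule)."""
    return (
        len(underline) >= len(title)
        and len(set(underline)) == 1
        and not underline[0].isalnum()
    )


new_title = "How to create new columns derived from existing columns"
old_title = new_title + "?"

# The new, shorter underline fits the new title but not the old one.
new_underline = "-" * len(new_title)
fits_new = underline_ok(new_title, new_underline)
fits_old = underline_ok(old_title, new_underline)
```

An underline longer than the title is still accepted under this rule, so the shortening in the diff is a tidiness choice rather than a requirement.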

doc/source/getting_started/intro_tutorials/06_calculate_statistics.rst

Lines changed: 2 additions & 2 deletions
@@ -30,8 +30,8 @@
         </ul>
     </div>
 
-How to calculate summary statistics?
-------------------------------------
+How to calculate summary statistics
+-----------------------------------
 
 Aggregating statistics
 ~~~~~~~~~~~~~~~~~~~~~~

doc/source/getting_started/intro_tutorials/07_reshape_table_layout.rst

Lines changed: 2 additions & 2 deletions
@@ -84,8 +84,8 @@ measurement.
         </ul>
     </div>
 
-How to reshape the layout of tables?
-------------------------------------
+How to reshape the layout of tables
+-----------------------------------
 
 Sort table rows
 ~~~~~~~~~~~~~~~

doc/source/getting_started/intro_tutorials/08_combine_dataframes.rst

Lines changed: 2 additions & 2 deletions
@@ -88,8 +88,8 @@ Westminster* in respectively Paris, Antwerp and London.
     </div>
 
 
-How to combine data from multiple tables?
------------------------------------------
+How to combine data from multiple tables
+----------------------------------------
 
 Concatenating objects
 ~~~~~~~~~~~~~~~~~~~~~

doc/source/getting_started/intro_tutorials/09_timeseries.rst

Lines changed: 2 additions & 2 deletions
@@ -55,8 +55,8 @@ Westminster* in respectively Paris, Antwerp and London.
         </ul>
     </div>
 
-How to handle time series data with ease?
------------------------------------------
+How to handle time series data with ease
+----------------------------------------
 
 .. _10min_tut_09_timeseries.properties:
 
doc/source/getting_started/intro_tutorials/10_text_data.rst

Lines changed: 2 additions & 2 deletions
@@ -29,8 +29,8 @@
         </ul>
     </div>
 
-How to manipulate textual data?
--------------------------------
+How to manipulate textual data
+------------------------------
 
 .. raw:: html
 
doc/source/index.rst.template

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ pandas documentation
 
 **Date**: |today| **Version**: |version|
 
-**Download documentation**: `PDF Version <pandas.pdf>`__ | `Zipped HTML <pandas.zip>`__
+**Download documentation**: `Zipped HTML <pandas.zip>`__
 
 **Previous versions**: Documentation of previous pandas versions is available at
 `pandas.pydata.org <https://pandas.pydata.org/>`__.

doc/source/whatsnew/v1.4.4.rst

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ Fixed regressions
 - Fixed regression in :meth:`DataFrame.loc` setting a length-1 array like value to a single value in the DataFrame (:issue:`46268`)
 - Fixed regression when slicing with :meth:`DataFrame.loc` with :class:`DateOffset`-index (:issue:`46671`)
 - Fixed regression in setting ``None`` or non-string value into a ``string``-dtype Series using a mask (:issue:`47628`)
+- Fixed regression in updating a DataFrame column through Series ``__setitem__`` (using chained assignment) not updating column values inplace and using too much memory (:issue:`47172`)
 - Fixed regression in :meth:`DataFrame.select_dtypes` returning a view on the original DataFrame (:issue:`48090`)
 - Fixed regression using custom Index subclasses (for example, used in xarray) with :meth:`~DataFrame.reset_index` or :meth:`Index.insert` (:issue:`47071`)
 - Fixed regression in :meth:`DatetimeIndex.intersection` when the :class:`DatetimeIndex` has dates crossing daylight savings time (:issue:`46702`)

environment.yml

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ dependencies:
   - pytables
   - python-snappy
   - pyxlsb
-  - s3fs
+  - s3fs>=2021.05.0
   - scipy
   - sqlalchemy
   - tabulate

pandas/core/reshape/merge.py

Lines changed: 1 addition & 1 deletion
@@ -242,7 +242,7 @@ def merge_ordered(
     Returns
     -------
     DataFrame
-        The merged DataFrame output type will the be same as
+        The merged DataFrame output type will be the same as
         'left', if it is a subclass of DataFrame.
 
     See Also

pandas/core/series.py

Lines changed: 1 addition & 1 deletion
@@ -1169,7 +1169,7 @@ def __setitem__(self, key, value) -> None:
                 self._set_with(key, value)
 
         if cacher_needs_updating:
-            self._maybe_update_cacher()
+            self._maybe_update_cacher(inplace=True)
 
     def _set_with_engine(self, key, value) -> None:
         loc = self.index.get_loc(key)

pandas/tests/arrays/floating/test_astype.py

Lines changed: 10 additions & 0 deletions
@@ -116,3 +116,13 @@ def test_astype_object(dtype):
     # check exact element types
     assert isinstance(result[0], float)
     assert result[1] is pd.NA
+
+
+def test_Float64_conversion():
+    # GH#40729
+    testseries = pd.Series(["1", "2", "3", "4"], dtype="object")
+    result = testseries.astype(pd.Float64Dtype())
+
+    expected = pd.Series([1.0, 2.0, 3.0, 4.0], dtype=pd.Float64Dtype())
+
+    tm.assert_series_equal(result, expected)

pandas/tests/arrays/test_timedeltas.py

Lines changed: 7 additions & 0 deletions
@@ -67,6 +67,13 @@ def test_total_seconds(self, unit, tda):
         expected = tda_nano.total_seconds()
         tm.assert_numpy_array_equal(result, expected)
 
+    def test_timedelta_array_total_seconds(self):
+        # GH34290
+        expected = Timedelta("2 min").total_seconds()
+
+        result = pd.array([Timedelta("2 min")]).total_seconds()[0]
+        assert result == expected
+
     @pytest.mark.parametrize(
         "nat", [np.datetime64("NaT", "ns"), np.datetime64("NaT", "us")]
     )

pandas/tests/extension/test_floating.py

Lines changed: 1 addition & 0 deletions
@@ -211,5 +211,6 @@ class TestParsing(base.BaseParsingTests):
     pass
 
 
+@pytest.mark.filterwarnings("ignore:overflow encountered in reduce:RuntimeWarning")
 class Test2DCompat(base.Dim2CompatTests):
     pass
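
The decorator added above passes a filter in `action:message:category` form. A minimal sketch of how such a string could map onto the standard library's `warnings.filterwarnings`, assuming exactly three colon-separated fields and a builtin warning category. `apply_filter_spec` and `emits` are hypothetical helpers written for illustration; this is not pytest's own parsing code.

```python
import builtins
import warnings

SPEC = "ignore:overflow encountered in reduce:RuntimeWarning"


def apply_filter_spec(spec: str) -> None:
    """Install a filter from an 'action:message:category' string.

    Simplified: optional module/lineno fields and dotted category
    paths are not handled.
    """
    action, message, category_name = (part.strip() for part in spec.split(":"))
    category = getattr(builtins, category_name)  # builtin categories only
    warnings.filterwarnings(action, message=message, category=category)


def emits(text: str, spec: str) -> bool:
    """Return True if a RuntimeWarning with `text` still gets through `spec`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything not ignored
        apply_filter_spec(spec)          # inserted in front, so checked first
        warnings.warn(text, RuntimeWarning)
    return bool(caught)


reduce_passes = emits("overflow encountered in reduce", SPEC)
multiply_passes = emits("overflow encountered in multiply", SPEC)
```

Only the warning whose message starts with the given text is silenced; other `RuntimeWarning` messages are still recorded.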

pandas/tests/frame/indexing/test_setitem.py

Lines changed: 18 additions & 0 deletions
@@ -1235,3 +1235,21 @@ def test_setitem_not_operating_inplace(self, value, set_value, indexer):
         view = df[:]
         df[indexer] = set_value
         tm.assert_frame_equal(view, expected)
+
+    @td.skip_array_manager_invalid_test
+    def test_setitem_column_update_inplace(self, using_copy_on_write):
+        # https://github.com/pandas-dev/pandas/issues/47172
+
+        labels = [f"c{i}" for i in range(10)]
+        df = DataFrame({col: np.zeros(len(labels)) for col in labels}, index=labels)
+        values = df._mgr.blocks[0].values
+
+        for label in df.columns:
+            df[label][label] = 1
+
+        if not using_copy_on_write:
+            # diagonal values all updated
+            assert np.all(values[np.arange(10), np.arange(10)] == 1)
+        else:
+            # original dataframe not updated
+            assert np.all(values[np.arange(10), np.arange(10)] == 0)

pandas/tests/frame/methods/test_combine_first.py

Lines changed: 20 additions & 5 deletions
@@ -3,6 +3,9 @@
 import numpy as np
 import pytest
 
+from pandas.compat import pa_version_under7p0
+from pandas.errors import PerformanceWarning
+
 from pandas.core.dtypes.cast import (
     find_common_type,
     is_dtype_equal,
@@ -387,12 +390,24 @@ def test_combine_first_string_dtype_only_na(self, nullable_string_dtype):
             {"a": ["962", "85"], "b": [pd.NA] * 2}, dtype=nullable_string_dtype
         )
         df2 = DataFrame({"a": ["85"], "b": [pd.NA]}, dtype=nullable_string_dtype)
-        df = df.set_index(["a", "b"], copy=False)
-        df2 = df2.set_index(["a", "b"], copy=False)
+        with tm.maybe_produces_warning(
+            PerformanceWarning,
+            pa_version_under7p0 and nullable_string_dtype == "string[pyarrow]",
+        ):
+            df = df.set_index(["a", "b"], copy=False)
+        with tm.maybe_produces_warning(
+            PerformanceWarning,
+            pa_version_under7p0 and nullable_string_dtype == "string[pyarrow]",
+        ):
+            df2 = df2.set_index(["a", "b"], copy=False)
         result = df.combine_first(df2)
-        expected = DataFrame(
-            {"a": ["962", "85"], "b": [pd.NA] * 2}, dtype=nullable_string_dtype
-        ).set_index(["a", "b"])
+        with tm.maybe_produces_warning(
+            PerformanceWarning,
+            pa_version_under7p0 and nullable_string_dtype == "string[pyarrow]",
+        ):
+            expected = DataFrame(
+                {"a": ["962", "85"], "b": [pd.NA] * 2}, dtype=nullable_string_dtype
+            ).set_index(["a", "b"])
         tm.assert_frame_equal(result, expected)
 
 

pandas/tests/frame/methods/test_quantile.py

Lines changed: 3 additions & 0 deletions
@@ -751,6 +751,9 @@ def test_quantile_empty_no_rows_ints(self, interp_method):
         exp = Series([np.nan, np.nan], index=["a", "b"], name=0.5)
         tm.assert_series_equal(res, exp)
 
+    @pytest.mark.filterwarnings(
+        "ignore:The behavior of DatetimeArray._from_sequence:FutureWarning"
+    )
     def test_quantile_empty_no_rows_dt64(self, interp_method):
         interpolation, method = interp_method
         # datetimes

pandas/tests/indexes/multi/test_constructors.py

Lines changed: 24 additions & 0 deletions
@@ -7,6 +7,8 @@
 import numpy as np
 import pytest
 
+from pandas.compat import pa_version_under1p01
+
 from pandas.core.dtypes.cast import construct_1d_object_array_from_listlike
 
 import pandas as pd
@@ -648,6 +650,28 @@ def test_from_frame():
     tm.assert_index_equal(expected, result)
 
 
+@pytest.mark.skipif(pa_version_under1p01, reason="Import Problem")
+def test_from_frame_missing_values_multiIndex():
+    # GH 39984
+    import pyarrow as pa
+
+    df = pd.DataFrame(
+        {
+            "a": Series([1, 2, None], dtype="Int64"),
+            "b": pd.Float64Dtype().__from_arrow__(pa.array([0.2, np.nan, None])),
+        }
+    )
+    multi_indexed = MultiIndex.from_frame(df)
+    expected = MultiIndex.from_arrays(
+        [
+            Series([1, 2, None]).astype("Int64"),
+            pd.Float64Dtype().__from_arrow__(pa.array([0.2, np.nan, None])),
+        ],
+        names=["a", "b"],
+    )
+    tm.assert_index_equal(multi_indexed, expected)
+
+
 @pytest.mark.parametrize(
     "non_frame",
     [

pandas/tests/scalar/timedelta/test_timedelta.py

Lines changed: 1 addition & 2 deletions
@@ -14,7 +14,6 @@
     iNaT,
 )
 from pandas._libs.tslibs.dtypes import NpyDatetimeUnit
-from pandas.compat import IS64
 from pandas.errors import OutOfBoundsTimedelta
 
 import pandas as pd
@@ -691,7 +690,7 @@ def test_round_implementation_bounds(self):
         with pytest.raises(OverflowError, match=msg):
             Timedelta.max.ceil("s")
 
-    @pytest.mark.xfail(not IS64, reason="Failing on 32 bit build", strict=False)
+    @pytest.mark.xfail(reason="Failing on builds", strict=False)
     @given(val=st.integers(min_value=iNaT + 1, max_value=lib.i8max))
     @pytest.mark.parametrize(
         "method", [Timedelta.round, Timedelta.floor, Timedelta.ceil]

pandas/tests/scalar/timestamp/test_unary_ops.py

Lines changed: 1 addition & 2 deletions
@@ -21,7 +21,6 @@
 )
 from pandas._libs.tslibs.dtypes import NpyDatetimeUnit
 from pandas._libs.tslibs.period import INVALID_FREQ_ERR_MSG
-from pandas.compat import IS64
 import pandas.util._test_decorators as td
 
 import pandas._testing as tm
@@ -298,7 +297,7 @@ def test_round_implementation_bounds(self):
         with pytest.raises(OverflowError, match=msg):
             Timestamp.max.ceil("s")
 
-    @pytest.mark.xfail(not IS64, reason="Failing on 32 bit build", strict=False)
+    @pytest.mark.xfail(reason="Failing on builds", strict=False)
     @given(val=st.integers(iNaT + 1, lib.i8max))
     @pytest.mark.parametrize(
         "method", [Timestamp.round, Timestamp.floor, Timestamp.ceil]

pandas/tests/test_expressions.py

Lines changed: 1 addition & 1 deletion
@@ -173,7 +173,7 @@ def testit():
 
         with warnings.catch_warnings():
             # array has 0s
-            msg = "invalid value encountered in true_divide"
+            msg = "invalid value encountered in divide|true_divide"
             warnings.filterwarnings("ignore", msg, RuntimeWarning)
             result = expr.evaluate(op, left, left, use_numexpr=True)
             expected = expr.evaluate(op, left, left, use_numexpr=False)
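
A note on the new `msg` pattern: `warnings.filterwarnings` treats the message as a regular expression matched against the start of the warning text, and `|` binds more loosely than the surrounding literals. The sketch below, using only the standard library, checks which texts each pattern silences. `GROUPED` and `is_ignored` are illustrative names that do not appear in the commit, and the grouped pattern is one possible alternative rather than what pandas uses.

```python
import warnings

OLD_TEXT = "invalid value encountered in true_divide"
NEW_TEXT = "invalid value encountered in divide"

# Pattern as written in the diff: alternation splits the WHOLE string, so it
# reads as "invalid value encountered in divide" OR "true_divide".
UNGROUPED = "invalid value encountered in divide|true_divide"
# Hypothetical alternative with the optional prefix grouped explicitly.
GROUPED = r"invalid value encountered in (true_)?divide"


def is_ignored(pattern: str, text: str) -> bool:
    """Return True if an 'ignore' filter built from `pattern` swallows a
    RuntimeWarning whose message is `text`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # anything not ignored is recorded
        warnings.filterwarnings("ignore", pattern, RuntimeWarning)
        warnings.warn(text, RuntimeWarning)
    return not caught


ungrouped_new = is_ignored(UNGROUPED, NEW_TEXT)
ungrouped_old = is_ignored(UNGROUPED, OLD_TEXT)
grouped_new = is_ignored(GROUPED, NEW_TEXT)
grouped_old = is_ignored(GROUPED, OLD_TEXT)
```

Under these assumptions the ungrouped pattern silences the `divide` wording but not the older `true_divide` wording, while the grouped form covers both. Whether that matters depends on which NumPy versions the test is expected to run against.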
