
Commit acc0370

Merge branch 'main' into #57512-bad-datetime-str-conversion-in-series-ctor
2 parents 723d09b + 73fd026 commit acc0370


63 files changed: +470 −661 lines

doc/source/whatsnew/v2.2.2.rst

Lines changed: 2 additions & 0 deletions
@@ -15,6 +15,7 @@ Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - :meth:`DataFrame.__dataframe__` was producing incorrect data buffers when the a column's type was a pandas nullable on with missing values (:issue:`56702`)
 - :meth:`DataFrame.__dataframe__` was producing incorrect data buffers when the a column's type was a pyarrow nullable on with missing values (:issue:`57664`)
+- Avoid issuing a spurious ``DeprecationWarning`` when a custom :class:`DataFrame` or :class:`Series` subclass method is called (:issue:`57553`)
 - Fixed regression in precision of :func:`to_datetime` with string and ``unit`` input (:issue:`57051`)

 .. ---------------------------------------------------------------------------
@@ -25,6 +26,7 @@ Bug fixes
 - :meth:`DataFrame.__dataframe__` was producing incorrect data buffers when the column's type was nullable boolean (:issue:`55332`)
 - :meth:`DataFrame.__dataframe__` was showing bytemask instead of bitmask for ``'string[pyarrow]'`` validity buffer (:issue:`57762`)
 - :meth:`DataFrame.__dataframe__` was showing non-null validity buffer (instead of ``None``) ``'string[pyarrow]'`` without missing values (:issue:`57761`)
+- :meth:`DataFrame.to_sql` was failing to find the right table when using the schema argument (:issue:`57539`)

 .. ---------------------------------------------------------------------------
 .. _whatsnew_222.other:

doc/source/whatsnew/v3.0.0.rst

Lines changed: 13 additions & 1 deletion
@@ -200,25 +200,34 @@ Other Deprecations
 Removal of prior version deprecations/changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 - :class:`.DataFrameGroupBy.idxmin`, :class:`.DataFrameGroupBy.idxmax`, :class:`.SeriesGroupBy.idxmin`, and :class:`.SeriesGroupBy.idxmax` will now raise a ``ValueError`` when used with ``skipna=False`` and an NA value is encountered (:issue:`10694`)
+- :func:`concat` no longer ignores empty objects when determining output dtypes (:issue:`39122`)
 - :func:`read_excel`, :func:`read_json`, :func:`read_html`, and :func:`read_xml` no longer accept raw string or byte representation of the data. That type of data must be wrapped in a :py:class:`StringIO` or :py:class:`BytesIO` (:issue:`53767`)
 - :meth:`DataFrame.groupby` with ``as_index=False`` and aggregation methods will no longer exclude from the result the groupings that do not arise from the input (:issue:`49519`)
 - :meth:`Series.dt.to_pydatetime` now returns a :class:`Series` of :py:class:`datetime.datetime` objects (:issue:`52459`)
 - :meth:`SeriesGroupBy.agg` no longer pins the name of the group to the input passed to the provided ``func`` (:issue:`51703`)
 - All arguments except ``name`` in :meth:`Index.rename` are now keyword only (:issue:`56493`)
 - All arguments except the first ``path``-like argument in IO writers are now keyword only (:issue:`54229`)
+- Disallow non-standard (``np.ndarray``, :class:`Index`, :class:`ExtensionArray`, or :class:`Series`) to :func:`isin`, :func:`unique`, :func:`factorize` (:issue:`52986`)
+- Disallow passing a pandas type to :meth:`Index.view` (:issue:`55709`)
+- Disallow units other than "s", "ms", "us", "ns" for datetime64 and timedelta64 dtypes in :func:`array` (:issue:`53817`)
 - Removed "freq" keyword from :class:`PeriodArray` constructor, use "dtype" instead (:issue:`52462`)
+- Removed deprecated "method" and "limit" keywords from :meth:`Series.replace` and :meth:`DataFrame.replace` (:issue:`53492`)
+- Removed extension test classes ``BaseNoReduceTests``, ``BaseNumericReduceTests``, ``BaseBooleanReduceTests`` (:issue:`54663`)
 - Removed the "closed" and "normalize" keywords in :meth:`DatetimeIndex.__new__` (:issue:`52628`)
+- Stopped performing dtype inference with in :meth:`Index.insert` with object-dtype index; this often affects the index/columns that result when setting new entries into an empty :class:`Series` or :class:`DataFrame` (:issue:`51363`)
 - Removed the "closed" and "unit" keywords in :meth:`TimedeltaIndex.__new__` (:issue:`52628`, :issue:`55499`)
 - All arguments in :meth:`Index.sort_values` are now keyword only (:issue:`56493`)
 - All arguments in :meth:`Series.to_dict` are now keyword only (:issue:`56493`)
 - Changed the default value of ``observed`` in :meth:`DataFrame.groupby` and :meth:`Series.groupby` to ``True`` (:issue:`51811`)
 - Enforce deprecation in :func:`testing.assert_series_equal` and :func:`testing.assert_frame_equal` with object dtype and mismatched null-like values, which are now considered not-equal (:issue:`18463`)
+- Enforced deprecation ``all`` and ``any`` reductions with ``datetime64`` and :class:`DatetimeTZDtype` dtypes (:issue:`58029`)
 - Enforced deprecation disallowing parsing datetimes with mixed time zones unless user passes ``utc=True`` to :func:`to_datetime` (:issue:`57275`)
 - Enforced deprecation in :meth:`Series.value_counts` and :meth:`Index.value_counts` with object dtype performing dtype inference on the ``.index`` of the result (:issue:`56161`)
 - Enforced deprecation of :meth:`.DataFrameGroupBy.get_group` and :meth:`.SeriesGroupBy.get_group` allowing the ``name`` argument to be a non-tuple when grouping by a list of length 1 (:issue:`54155`)
 - Enforced deprecation of :meth:`Series.interpolate` and :meth:`DataFrame.interpolate` for object-dtype (:issue:`57820`)
 - Enforced deprecation of :meth:`offsets.Tick.delta`, use ``pd.Timedelta(obj)`` instead (:issue:`55498`)
 - Enforced deprecation of ``axis=None`` acting the same as ``axis=0`` in the DataFrame reductions ``sum``, ``prod``, ``std``, ``var``, and ``sem``, passing ``axis=None`` will now reduce over both axes; this is particularly the case when doing e.g. ``numpy.sum(df)`` (:issue:`21597`)
+- Enforced deprecation of non-standard (``np.ndarray``, :class:`ExtensionArray`, :class:`Index`, or :class:`Series`) argument to :func:`api.extensions.take` (:issue:`52981`)
 - Enforced deprecation of parsing system timezone strings to ``tzlocal``, which depended on system timezone, pass the 'tz' keyword instead (:issue:`50791`)
 - Enforced deprecation of passing a dictionary to :meth:`SeriesGroupBy.agg` (:issue:`52268`)
 - Enforced deprecation of string ``AS`` denoting frequency in :class:`YearBegin` and strings ``AS-DEC``, ``AS-JAN``, etc. denoting annual frequencies with various fiscal year starts (:issue:`57793`)
@@ -298,6 +307,7 @@ Performance improvements
 - Performance improvement in :meth:`DataFrameGroupBy.ffill`, :meth:`DataFrameGroupBy.bfill`, :meth:`SeriesGroupBy.ffill`, and :meth:`SeriesGroupBy.bfill` (:issue:`56902`)
 - Performance improvement in :meth:`Index.join` by propagating cached attributes in cases where the result matches one of the inputs (:issue:`57023`)
 - Performance improvement in :meth:`Index.take` when ``indices`` is a full range indexer from zero to length of index (:issue:`56806`)
+- Performance improvement in :meth:`Index.to_frame` returning a :class:`RangeIndex` columns of a :class:`Index` when possible. (:issue:`58018`)
 - Performance improvement in :meth:`MultiIndex.equals` for equal length indexes (:issue:`56990`)
 - Performance improvement in :meth:`RangeIndex.__getitem__` with a boolean mask or integers returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57588`)
 - Performance improvement in :meth:`RangeIndex.append` when appending the same index (:issue:`57252`)
@@ -320,8 +330,10 @@ Bug fixes
 - Fixed bug in :class:`Series` constructor responsible for bad datetime to str dtype conversions in ``read_csv``. (:issue:`57512`)
 - Fixed bug in :class:`SparseDtype` for equal comparison with na fill value. (:issue:`54770`)
 - Fixed bug in :meth:`.DataFrameGroupBy.median` where nat values gave an incorrect result. (:issue:`57926`)
+- Fixed bug in :meth:`DataFrame.cumsum` which was raising ``IndexError`` if dtype is ``timedelta64[ns]`` (:issue:`57956`)
 - Fixed bug in :meth:`DataFrame.join` inconsistently setting result index name (:issue:`55815`)
 - Fixed bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
+- Fixed bug in :meth:`DataFrame.transform` that was returning the wrong order unless the index was monotonically increasing. (:issue:`57069`)
 - Fixed bug in :meth:`DataFrame.update` bool dtype being converted to object (:issue:`55509`)
 - Fixed bug in :meth:`DataFrameGroupBy.apply` that was returning a completely empty DataFrame when all return values of ``func`` were ``None`` instead of returning an empty DataFrame with the original columns and dtypes. (:issue:`57775`)
 - Fixed bug in :meth:`Series.diff` allowing non-integer values for the ``periods`` argument. (:issue:`56607`)
@@ -398,7 +410,7 @@ Period

 Plotting
 ^^^^^^^^
--
+- Bug in :meth:`.DataFrameGroupBy.boxplot` failed when there were multiple groupings (:issue:`14701`)
 -

 Groupby/resample/rolling
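
Of the enforced deprecations merged in above, the ``axis=None`` entry (:issue:`21597`) is the one most likely to change existing results silently. A minimal sketch of the new behavior, assuming pandas built from this commit; the frame and its values are illustrative only:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

    # With the deprecation enforced, axis=None reduces over both axes,
    # so both calls should yield the scalar 10 instead of a per-column Series.
    print(df.sum(axis=None))
    print(np.sum(df))  # numpy forwards axis=None to DataFrame.sum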

pandas/conftest.py

Lines changed: 0 additions & 1 deletion
@@ -150,7 +150,6 @@ def pytest_collection_modifyitems(items, config) -> None:
         ("is_categorical_dtype", "is_categorical_dtype is deprecated"),
         ("is_sparse", "is_sparse is deprecated"),
         ("DataFrameGroupBy.fillna", "DataFrameGroupBy.fillna is deprecated"),
-        ("NDFrame.replace", "The 'method' keyword"),
         ("NDFrame.replace", "Series.replace without 'value'"),
         ("NDFrame.clip", "Downcasting behavior in Series and DataFrame methods"),
         ("Series.idxmin", "The behavior of Series.idxmin"),

pandas/core/algorithms.py

Lines changed: 16 additions & 18 deletions
@@ -43,7 +43,6 @@
     ensure_float64,
     ensure_object,
     ensure_platform_int,
-    is_array_like,
     is_bool_dtype,
     is_complex_dtype,
     is_dict_like,
@@ -227,12 +226,9 @@ def _ensure_arraylike(values, func_name: str) -> ArrayLike:
         # GH#52986
         if func_name != "isin-targets":
             # Make an exception for the comps argument in isin.
-            warnings.warn(
-                f"{func_name} with argument that is not not a Series, Index, "
-                "ExtensionArray, or np.ndarray is deprecated and will raise in a "
-                "future version.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
+            raise TypeError(
+                f"{func_name} requires a Series, Index, "
+                f"ExtensionArray, or np.ndarray, got {type(values).__name__}."
             )

     inferred = lib.infer_dtype(values, skipna=False)
@@ -1163,28 +1159,30 @@ def take(
     """
     if not isinstance(arr, (np.ndarray, ABCExtensionArray, ABCIndex, ABCSeries)):
         # GH#52981
-        warnings.warn(
-            "pd.api.extensions.take accepting non-standard inputs is deprecated "
-            "and will raise in a future version. Pass either a numpy.ndarray, "
-            "ExtensionArray, Index, or Series instead.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
+        raise TypeError(
+            "pd.api.extensions.take requires a numpy.ndarray, "
+            f"ExtensionArray, Index, or Series, got {type(arr).__name__}."
         )

-    if not is_array_like(arr):
-        arr = np.asarray(arr)
-
     indices = ensure_platform_int(indices)

     if allow_fill:
         # Pandas style, -1 means NA
         validate_indices(indices, arr.shape[axis])
+        # error: Argument 1 to "take_nd" has incompatible type
+        # "ndarray[Any, Any] | ExtensionArray | Index | Series"; expected
+        # "ndarray[Any, Any]"
         result = take_nd(
-            arr, indices, axis=axis, allow_fill=True, fill_value=fill_value
+            arr,  # type: ignore[arg-type]
+            indices,
+            axis=axis,
+            allow_fill=True,
+            fill_value=fill_value,
        )
     else:
         # NumPy style
-        result = arr.take(indices, axis=axis)
+        # error: Unexpected keyword argument "axis" for "take" of "ExtensionArray"
+        result = arr.take(indices, axis=axis)  # type: ignore[call-arg,assignment]
     return result
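
To illustrate the enforced deprecation in ``take`` above: a minimal sketch, assuming pandas built from this commit, of how a plain Python list is now rejected while a NumPy array still works; the data values are illustrative only:

    import numpy as np
    import pandas as pd

    # A plain list is no longer coerced; per the new check it raises TypeError.
    try:
        pd.api.extensions.take([10, 20, 30], [0, 2])
    except TypeError as err:
        print(err)

    # Passing a numpy.ndarray (or Index/Series/ExtensionArray) keeps working;
    # with allow_fill=True, -1 marks positions to fill with the fill value (NaN here).
    print(pd.api.extensions.take(np.array([10, 20, 30]), [0, 2, -1], allow_fill=True))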

pandas/core/apply.py

Lines changed: 4 additions & 16 deletions
@@ -564,22 +564,10 @@ def apply_str(self) -> DataFrame | Series:
                 "axis" not in arg_names or func in ("corrwith", "skew")
             ):
                 raise ValueError(f"Operation {func} does not support axis=1")
-            if "axis" in arg_names:
-                if isinstance(obj, (SeriesGroupBy, DataFrameGroupBy)):
-                    # Try to avoid FutureWarning for deprecated axis keyword;
-                    # If self.axis matches the axis we would get by not passing
-                    # axis, we safely exclude the keyword.
-
-                    default_axis = 0
-                    if func in ["idxmax", "idxmin"]:
-                        # DataFrameGroupBy.idxmax, idxmin axis defaults to self.axis,
-                        # whereas other axis keywords default to 0
-                        default_axis = self.obj.axis
-
-                    if default_axis != self.axis:
-                        self.kwargs["axis"] = self.axis
-                else:
-                    self.kwargs["axis"] = self.axis
+            if "axis" in arg_names and not isinstance(
+                obj, (SeriesGroupBy, DataFrameGroupBy)
+            ):
+                self.kwargs["axis"] = self.axis
         return self._apply_str(obj, func, *self.args, **self.kwargs)

     def apply_list_or_dict_like(self) -> DataFrame | Series:
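
The simplified branch above forwards ``axis`` to the looked-up method only for non-groupby objects, now that the groupby ``axis`` keyword is gone. A small sketch of the user-facing path this code serves (string functions passed to ``DataFrame.apply``), assuming current pandas behavior; the frame is illustrative:

    import pandas as pd

    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

    # "sum" is resolved to the DataFrame method; apply_str injects axis=1,
    # so the reduction runs across columns, producing one value per row.
    print(df.apply("sum", axis=1))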

pandas/core/array_algos/datetimelike_accumulations.py

Lines changed: 2 additions & 1 deletion
@@ -49,7 +49,8 @@ def _cum_func(
     if not skipna:
         mask = np.maximum.accumulate(mask)

-    result = func(y)
+    # GH 57956
+    result = func(y, axis=0)
     result[mask] = iNaT

     if values.dtype.kind in "mM":
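
Passing ``axis=0`` explicitly is what fixes the ``timedelta64[ns]`` cumulative case tracked as GH 57956. A minimal sketch of the call this enables, assuming pandas built from this commit; the column name and values are illustrative:

    import pandas as pd

    df = pd.DataFrame({"td": pd.to_timedelta(["1 day", "2 days", "3 days"])})

    # Previously this raised IndexError for timedelta64[ns] columns;
    # with the fix it should accumulate to 1, 3 and 6 days.
    print(df.cumsum())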

pandas/core/arrays/datetimelike.py

Lines changed: 4 additions & 12 deletions
@@ -1661,16 +1661,8 @@ def _groupby_op(
         dtype = self.dtype
         if dtype.kind == "M":
             # Adding/multiplying datetimes is not valid
-            if how in ["sum", "prod", "cumsum", "cumprod", "var", "skew"]:
-                raise TypeError(f"datetime64 type does not support {how} operations")
-            if how in ["any", "all"]:
-                # GH#34479
-                warnings.warn(
-                    f"'{how}' with datetime64 dtypes is deprecated and will raise in a "
-                    f"future version. Use (obj != pd.Timestamp(0)).{how}() instead.",
-                    FutureWarning,
-                    stacklevel=find_stack_level(),
-                )
+            if how in ["any", "all", "sum", "prod", "cumsum", "cumprod", "var", "skew"]:
+                raise TypeError(f"datetime64 type does not support operation: '{how}'")

         elif isinstance(dtype, PeriodDtype):
             # Adding/multiplying Periods is not valid
@@ -2217,11 +2209,11 @@ def ceil(
     # Reductions

     def any(self, *, axis: AxisInt | None = None, skipna: bool = True) -> bool:
-        # GH#34479 the nanops call will issue a FutureWarning for non-td64 dtype
+        # GH#34479 the nanops call will raise a TypeError for non-td64 dtype
         return nanops.nanany(self._ndarray, axis=axis, skipna=skipna, mask=self.isna())

     def all(self, *, axis: AxisInt | None = None, skipna: bool = True) -> bool:
-        # GH#34479 the nanops call will issue a FutureWarning for non-td64 dtype
+        # GH#34479 the nanops call will raise a TypeError for non-td64 dtype

         return nanops.nanall(self._ndarray, axis=axis, skipna=skipna, mask=self.isna())
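
With the warning branch removed, ``any``/``all`` reductions on datetime64 data raise instead of warning (GH 58029). A minimal sketch of the enforced behavior, assuming pandas built from this commit; the dates are illustrative and the exact error message may differ between the groupby and nanops paths:

    import pandas as pd

    ser = pd.Series(pd.to_datetime(["2024-01-01", "2024-01-02"]))

    # The former FutureWarning is now a TypeError.
    try:
        ser.any()
    except TypeError as err:
        print(err)

    # The comparison suggested by the old warning text expresses the same intent.
    print((ser != pd.Timestamp(0)).any())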

pandas/core/construction.py

Lines changed: 3 additions & 8 deletions
@@ -15,7 +15,6 @@
     cast,
     overload,
 )
-import warnings

 import numpy as np
 from numpy import ma
@@ -35,7 +34,6 @@
     DtypeObj,
     T,
 )
-from pandas.util._exceptions import find_stack_level

 from pandas.core.dtypes.base import ExtensionDtype
 from pandas.core.dtypes.cast import (
@@ -373,13 +371,10 @@ def array(
         return TimedeltaArray._from_sequence(data, dtype=dtype, copy=copy)

     elif lib.is_np_dtype(dtype, "mM"):
-        warnings.warn(
+        raise ValueError(
+            # GH#53817
             r"datetime64 and timedelta64 dtype resolutions other than "
-            r"'s', 'ms', 'us', and 'ns' are deprecated. "
-            r"In future releases passing unsupported resolutions will "
-            r"raise an exception.",
-            FutureWarning,
-            stacklevel=find_stack_level(),
+            r"'s', 'ms', 'us', and 'ns' are no longer supported."
        )

    return NumpyExtensionArray._from_sequence(data, dtype=dtype, copy=copy)
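
The new ``ValueError`` above enforces GH 53817: :func:`array` only accepts the "s", "ms", "us" and "ns" resolutions. A minimal sketch, assuming pandas built from this commit; the date string is illustrative:

    import pandas as pd

    # A supported resolution is constructed normally.
    print(pd.array(["2024-01-01"], dtype="datetime64[s]").dtype)

    # Any other resolution (here day resolution) is rejected outright
    # instead of warning as before.
    try:
        pd.array(["2024-01-01"], dtype="datetime64[D]")
    except ValueError as err:
        print(err)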

pandas/core/dtypes/concat.py

Lines changed: 0 additions & 20 deletions
@@ -8,12 +8,10 @@
     TYPE_CHECKING,
     cast,
 )
-import warnings

 import numpy as np

 from pandas._libs import lib
-from pandas.util._exceptions import find_stack_level

 from pandas.core.dtypes.astype import astype_array
 from pandas.core.dtypes.cast import (
@@ -101,28 +99,10 @@ def concat_compat(
     # Creating an empty array directly is tempting, but the winnings would be
     # marginal given that it would still require shape & dtype calculation and
     # np.concatenate which has them both implemented is compiled.
-    orig = to_concat
     non_empties = [x for x in to_concat if _is_nonempty(x, axis)]
-    if non_empties and axis == 0 and not ea_compat_axis:
-        # ea_compat_axis see GH#39574
-        to_concat = non_empties

     any_ea, kinds, target_dtype = _get_result_dtype(to_concat, non_empties)

-    if len(to_concat) < len(orig):
-        _, _, alt_dtype = _get_result_dtype(orig, non_empties)
-        if alt_dtype != target_dtype:
-            # GH#39122
-            warnings.warn(
-                "The behavior of array concatenation with empty entries is "
-                "deprecated. In a future version, this will no longer exclude "
-                "empty items when determining the result dtype. "
-                "To retain the old behavior, exclude the empty entries before "
-                "the concat operation.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
-            )
-
     if target_dtype is not None:
         to_concat = [astype_array(arr, target_dtype, copy=False) for arr in to_concat]
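
Removing the fallback above means empty inputs now count toward the result dtype (GH 39122). A minimal sketch of the user-visible difference via :func:`concat`, assuming pandas built from this commit; the series names and dtypes are illustrative:

    import pandas as pd

    ints = pd.Series([1, 2, 3], dtype="int64")
    empty = pd.Series([], dtype="float64")

    # The empty float64 series is no longer dropped when computing the result
    # dtype, so the concatenation should come back as float64 rather than int64.
    print(pd.concat([ints, empty]).dtype)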

pandas/core/frame.py

Lines changed: 2 additions & 2 deletions
@@ -6011,8 +6011,8 @@ def reset_index(

         names : int, str or 1-dimensional list, default None
             Using the given string, rename the DataFrame column which contains the
-            index data. If the DataFrame has a MultiIndex, this has to be a list or
-            tuple with length equal to the number of levels.
+            index data. If the DataFrame has a MultiIndex, this has to be a list
+            with length equal to the number of levels.

             .. versionadded:: 1.5.0
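
A short usage sketch of the clarified ``names`` parameter, assuming current pandas behavior; the level and column names ("letter", "number", "L", "N") are illustrative only. With a MultiIndex, ``names`` must be a list with one entry per level:

    import pandas as pd

    idx = pd.MultiIndex.from_tuples([("a", 1), ("b", 2)], names=["letter", "number"])
    df = pd.DataFrame({"val": [10, 20]}, index=idx)

    # One new column name per index level; the former index levels become
    # ordinary columns named "L" and "N".
    print(df.reset_index(names=["L", "N"]))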
