Commit c973bee

Merge remote-tracking branch 'upstream/master' into timestamp
2 parents 0090a8e + 14a2da1 commit c973bee

50 files changed, +879 -419 lines changed

Makefile

-1
@@ -23,4 +23,3 @@ doc:
 	cd doc; \
 	python make.py clean; \
 	python make.py html
-	python make.py spellcheck

ci/code_checks.sh

+1-1
@@ -206,7 +206,7 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then

     MSG='Doctests frame.py' ; echo $MSG
     pytest -q --doctest-modules pandas/core/frame.py \
-        -k"-axes -combine -itertuples -join -pivot_table -query -reindex -reindex_axis -round"
+        -k" -itertuples -join -reindex -reindex_axis -round"
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Doctests series.py' ; echo $MSG

doc/source/whatsnew/v0.24.1.rst

+36-47
@@ -13,65 +13,56 @@ Whats New in 0.24.1 (February XX, 2019)
 {{ header }}

 These are the changes in pandas 0.24.1. See :ref:`release` for a full changelog
-including other versions of pandas.
+including other versions of pandas. See :ref:`whatsnew_0240` for the 0.24.0 changelog.

-.. _whatsnew_0241.regressions:
-
-Fixed Regressions
-^^^^^^^^^^^^^^^^^
+.. _whatsnew_0241.api:

-- Bug in :meth:`DataFrame.itertuples` with ``records`` orient raising an ``AttributeError`` when the ``DataFrame`` contained more than 255 columns (:issue:`24939`)
-- Bug in :meth:`DataFrame.itertuples` orient converting integer column names to strings prepended with an underscore (:issue:`24940`)
-- Fixed regression in :func:`read_sql` when passing certain queries with MySQL/pymysql (:issue:`24988`).
-- Fixed regression in :class:`Index.intersection` incorrectly sorting the values by default (:issue:`24959`).
-- Fixed regression in :func:`merge` when merging an empty ``DataFrame`` with multiple timezone-aware columns on one of the timezone-aware columns (:issue:`25014`).
+API Changes
+~~~~~~~~~~~

-.. _whatsnew_0241.enhancements:
+Changing the ``sort`` parameter for :class:`Index` set operations
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Enhancements
-^^^^^^^^^^^^
+The default ``sort`` value for :meth:`Index.union` has changed from ``True`` to ``None`` (:issue:`24959`).
+The default *behavior*, however, remains the same: the result is sorted, unless

+1. ``self`` and ``other`` are identical
+2. ``self`` or ``other`` is empty
+3. ``self`` or ``other`` contain values that can not be compared (a ``RuntimeWarning`` is raised).

-.. _whatsnew_0241.bug_fixes:
-
-Bug Fixes
-~~~~~~~~~
-- Bug in :meth: `Timestamp` supporting %z (:issue:`21257`).
-**Conversion**
-
--
--
--
+This change will allow ``sort=True`` to mean "always sort" in a future release.

-**Indexing**
+The same change applies to :meth:`Index.difference` and :meth:`Index.symmetric_difference`, which
+would not sort the result when the values could not be compared.

--
--
--
+The `sort` option for :meth:`Index.intersection` has changed in three ways.

-**I/O**
+1. The default has changed from ``True`` to ``False``, to restore the
+   pandas 0.23.4 and earlier behavior of not sorting by default.
+2. The behavior of ``sort=True`` can now be obtained with ``sort=None``.
+   This will sort the result only if the values in ``self`` and ``other``
+   are not identical.
+3. The value ``sort=True`` is no longer allowed. A future version of pandas
+   will properly support ``sort=True`` meaning "always sort".

--
--
--
-
-**Categorical**
+.. _whatsnew_0241.regressions:

--
--
--
+Fixed Regressions
+~~~~~~~~~~~~~~~~~

-**Timezones**
+- Fixed regression in :meth:`DataFrame.to_dict` with ``records`` orient raising an
+  ``AttributeError`` when the ``DataFrame`` contained more than 255 columns, or
+  wrongly converting column names that were not valid python identifiers (:issue:`24939`, :issue:`24940`).
+- Fixed regression in :func:`read_sql` when passing certain queries with MySQL/pymysql (:issue:`24988`).
+- Fixed regression in :class:`Index.intersection` incorrectly sorting the values by default (:issue:`24959`).
+- Fixed regression in :func:`merge` when merging an empty ``DataFrame`` with multiple timezone-aware columns on one of the timezone-aware columns (:issue:`25014`).
+- Fixed regression in :meth:`Series.rename_axis` and :meth:`DataFrame.rename_axis` where passing ``None`` failed to remove the axis name (:issue:`25034`)
+- Fixed regression in :func:`to_timedelta` with `box=False` incorrectly returning a ``datetime64`` object instead of a ``timedelta64`` object (:issue:`24961`)

--
--
--
+.. _whatsnew_0241.bug_fixes:

-**Timedelta**
-- Bug in :func:`to_timedelta` with `box=False` incorrectly returning a ``datetime64`` object instead of a ``timedelta64`` object (:issue:`24961`)
--
--
--
+Bug Fixes
+~~~~~~~~~

 **Reshaping**

@@ -81,11 +72,9 @@ Bug Fixes

 - Fixed the warning for implicitly registered matplotlib converters not showing. See :ref:`whatsnew_0211.converters` for more (:issue:`24963`).

-
 **Other**

 - Fixed AttributeError when printing a DataFrame's HTML repr after accessing the IPython config object (:issue:`25036`)
--

 .. _whatsnew_0.241.contributors:
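As a reading aid for the "API Changes" notes in this file, here is a minimal pure-Python sketch of the `sort=None` rule described for `Index.union`. The function name `union_values`, the use of plain lists, and the rejection of any value other than `None` or `False` are assumptions made for illustration; this is not the pandas implementation.

```python
import warnings


def union_values(left, right, sort=None):
    """Hypothetical stand-in for the documented ``sort`` rules of a union."""
    if sort is not None and sort is not False:
        # Assumption of this sketch: only None and False are accepted.
        raise ValueError("sort must be None or False in this sketch")

    # Drop duplicates while keeping first-seen order.
    result = list(dict.fromkeys(list(left) + list(right)))
    if sort is False:
        return result

    # sort=None: leave the result unsorted for identical or empty operands.
    if list(left) == list(right) or not left or not right:
        return result
    try:
        return sorted(result)
    except TypeError:
        # Values that cannot be compared: warn and keep first-seen order.
        warnings.warn("values could not be compared; result left unsorted",
                      RuntimeWarning)
        return result
```

With these assumptions, `union_values([3, 1], [2, 1])` is sorted, while identical operands, an empty operand, or mixed `int`/`str` values come back in first-seen order.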

doc/source/whatsnew/v0.25.0.rst

+4-11
@@ -19,17 +19,9 @@ including other versions of pandas.
 Other Enhancements
 ^^^^^^^^^^^^^^^^^^

+- :meth:`Timestamp.replace` now supports the ``fold`` argument to disambiguate DST transition times (:issue:`25017`)
 -
 -
--
-
-.. _whatsnew_0250.performance:
-
-Performance Improvements
-~~~~~~~~~~~~~~~~~~~~~~~~
-- Significant speedup in `SparseArray` initialization that benefits most operations, fixing performance regression introduced in v0.20.0 (:issue:`24985`)
-
-

 .. _whatsnew_0250.api_breaking:

@@ -69,8 +61,8 @@ Removal of prior version deprecations/changes
 Performance Improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~

--
--
+- Significant speedup in `SparseArray` initialization that benefits most operations, fixing performance regression introduced in v0.20.0 (:issue:`24985`)
+- `DataFrame.to_stata()` is now faster when outputting data with any string or non-native endian columns (:issue:`25045`)
 -


@@ -165,6 +157,7 @@ MultiIndex
 I/O
 ^^^

+- Fixed bug in missing text when using :meth:`to_clipboard` if copying utf-16 characters in Python 3 on Windows (:issue:`25040`)
 -
 -
 -

pandas/_libs/lib.pyx

+2-1
@@ -233,10 +233,11 @@ def fast_unique_multiple(list arrays, sort: bool=True):
             if val not in table:
                 table[val] = stub
                 uniques.append(val)
-    if sort:
+    if sort is None:
         try:
             uniques.sort()
         except Exception:
+            # TODO: RuntimeWarning?
             pass

     return uniques
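To illustrate the changed condition in this hunk, here is a rough pure-Python sketch of "collect unique values across several lists, then sort only when `sort is None`". The name `unique_multiple` and the `sort=None` default are assumptions of the sketch (the hunk header still shows `sort: bool=True`), and `sorted()` is used instead of the in-place `uniques.sort()` so that a failed comparison cannot leave the list partially reordered.

```python
def unique_multiple(arrays, sort=None):
    """Hypothetical pure-Python analogue of the Cython helper in this hunk."""
    seen = set()
    uniques = []
    for arr in arrays:
        for val in arr:
            if val not in seen:
                seen.add(val)
                uniques.append(val)

    if sort is None:
        try:
            # sorted() builds a new list, so a failure leaves `uniques` intact.
            uniques = sorted(uniques)
        except TypeError:
            # Values that cannot be ordered: keep first-seen order.
            pass

    return uniques
```

Under these assumptions, `sort=False` always returns first-seen order, and the default sorts only when the values can be compared.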

pandas/_libs/tslibs/nattype.pyx

-1
@@ -669,7 +669,6 @@ class NaTType(_NaT):
         nanosecond : int, optional
         tzinfo : tz-convertible, optional
         fold : int, optional, default is 0
-            added in 3.6, NotImplemented

         Returns
         -------

pandas/_libs/tslibs/offsets.pyx

+16
@@ -18,6 +18,7 @@ from numpy cimport int64_t
 cnp.import_array()


+from pandas._libs.tslibs cimport util
 from pandas._libs.tslibs.util cimport is_string_object, is_integer_object

 from pandas._libs.tslibs.ccalendar import MONTHS, DAYS
@@ -408,6 +409,10 @@ class _BaseOffset(object):
         return self.apply(other)

     def __mul__(self, other):
+        if hasattr(other, "_typ"):
+            return NotImplemented
+        if util.is_array(other):
+            return np.array([self * x for x in other])
         return type(self)(n=other * self.n, normalize=self.normalize,
                           **self.kwds)

@@ -458,6 +463,9 @@ class _BaseOffset(object):
         TypeError if `int(n)` raises
         ValueError if n != int(n)
         """
+        if util.is_timedelta64_object(n):
+            raise TypeError('`n` argument must be an integer, '
+                            'got {ntype}'.format(ntype=type(n)))
         try:
             nint = int(n)
         except (ValueError, TypeError):
@@ -533,12 +541,20 @@ class _Tick(object):
     can do isinstance checks on _Tick and avoid importing tseries.offsets
     """

+    # ensure that reversed-ops with numpy scalars return NotImplemented
+    __array_priority__ = 1000
+
     def __truediv__(self, other):
         result = self.delta.__truediv__(other)
         return _wrap_timedelta_result(result)

+    def __rtruediv__(self, other):
+        result = self.delta.__rtruediv__(other)
+        return _wrap_timedelta_result(result)
+
     if PY2:
         __div__ = __truediv__
+        __rdiv__ = __rtruediv__


 # ----------------------------------------------------------------------
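The `__rtruediv__` addition in the last hunk relies on Python's reflected-operand protocol: when `x.__truediv__(tick)` returns `NotImplemented`, Python tries `tick.__rtruediv__(x)`. Below is a small standard-library sketch of that delegation pattern. `Tick` and `_wrap` here are toy stand-ins invented for illustration (wrapping `datetime.timedelta`), not the pandas classes or the real `_wrap_timedelta_result`.

```python
from datetime import timedelta


def _wrap(result):
    """Re-wrap timedelta results; pass floats and NotImplemented through."""
    if isinstance(result, timedelta):
        return Tick(result)
    return result


class Tick:
    """Toy fixed-duration object that delegates division to a timedelta."""

    def __init__(self, delta):
        self.delta = delta

    def __truediv__(self, other):
        # Tick / x: forward to the wrapped timedelta.
        return _wrap(self.delta.__truediv__(other))

    def __rtruediv__(self, other):
        # x / Tick: reached only after x.__truediv__ returned NotImplemented.
        return _wrap(self.delta.__rtruediv__(other))
```

In this sketch, `timedelta(hours=2) / Tick(timedelta(hours=1))` reaches `__rtruediv__` and yields a plain ratio, whereas `2 / Tick(...)` still raises `TypeError` because the wrapped `timedelta` also returns `NotImplemented`.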

pandas/_libs/tslibs/timestamps.pyx

+11-6
@@ -1,4 +1,5 @@
 # -*- coding: utf-8 -*-
+import sys
 import warnings

 from cpython cimport (PyObject_RichCompareBool, PyObject_RichCompare,
@@ -44,10 +45,11 @@ from pandas._libs.tslibs.timezones import UTC
 # Constants
 _zero_time = datetime_time(0, 0)
 _no_input = object()
-
+PY36 = sys.version_info >= (3, 6)

 # ----------------------------------------------------------------------

+
 def maybe_integer_op_deprecated(obj):
     # GH#22535 add/sub of integers and int-arrays is deprecated
     if obj.freq is not None:
@@ -1203,7 +1205,6 @@ class Timestamp(_Timestamp):
         nanosecond : int, optional
         tzinfo : tz-convertible, optional
         fold : int, optional, default is 0
-            added in 3.6, NotImplemented

         Returns
         -------
@@ -1260,12 +1261,16 @@ class Timestamp(_Timestamp):
             # see GH#18319
             ts_input = _tzinfo.localize(datetime(dts.year, dts.month, dts.day,
                                                  dts.hour, dts.min, dts.sec,
-                                                 dts.us))
+                                                 dts.us),
+                                        is_dst=not bool(fold))
             _tzinfo = ts_input.tzinfo
         else:
-            ts_input = datetime(dts.year, dts.month, dts.day,
-                                dts.hour, dts.min, dts.sec, dts.us,
-                                tzinfo=_tzinfo)
+            kwargs = {'year': dts.year, 'month': dts.month, 'day': dts.day,
+                      'hour': dts.hour, 'minute': dts.min, 'second': dts.sec,
+                      'microsecond': dts.us, 'tzinfo': _tzinfo}
+            if PY36:
+                kwargs['fold'] = fold
+            ts_input = datetime(**kwargs)

         ts = convert_datetime_to_tsobject(ts_input, _tzinfo)
         value = ts.value + (dts.ps // 1000)
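The last hunk builds the `datetime` arguments as a dict so that `fold` is passed only on interpreters that accept it (`datetime` gained `fold` in Python 3.6). A standalone sketch of that pattern using only the standard library follows; the helper name `build_datetime` and its signature are assumptions for illustration and do not appear in the commit.

```python
import sys
from datetime import datetime

# Same version gate as the constant added in the diff.
PY36 = sys.version_info >= (3, 6)


def build_datetime(year, month, day, hour=0, minute=0, second=0,
                   microsecond=0, tzinfo=None, fold=0):
    """Hypothetical helper: pass ``fold`` only where datetime supports it."""
    kwargs = {'year': year, 'month': month, 'day': day,
              'hour': hour, 'minute': minute, 'second': second,
              'microsecond': microsecond, 'tzinfo': tzinfo}
    if PY36:
        # Older interpreters would raise TypeError on an unknown keyword.
        kwargs['fold'] = fold
    return datetime(**kwargs)
```

On Python 3.6 or later, `build_datetime(2019, 10, 27, 1, 30, fold=1).fold` carries the requested value; on older interpreters the argument is silently dropped, which matches the conditional in the hunk.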

pandas/compat/pickle_compat.py

+1-2
@@ -201,7 +201,7 @@ def load_newobj_ex(self):
         pass


-def load(fh, encoding=None, compat=False, is_verbose=False):
+def load(fh, encoding=None, is_verbose=False):
     """load a pickle, with a provided encoding

     if compat is True:
@@ -212,7 +212,6 @@ def load(fh, encoding=None, compat=False, is_verbose=False):
     ----------
     fh : a filelike object
     encoding : an optional encoding
-    compat : provide Series compatibility mode, boolean, default False
     is_verbose : show exception output
     """

pandas/core/accessor.py

+11-7
@@ -16,11 +16,15 @@ class DirNamesMixin(object):
         ['asobject', 'base', 'data', 'flags', 'itemsize', 'strides'])

     def _dir_deletions(self):
-        """ delete unwanted __dir__ for this object """
+        """
+        Delete unwanted __dir__ for this object.
+        """
         return self._accessors | self._deprecations

     def _dir_additions(self):
-        """ add additional __dir__ for this object """
+        """
+        Add additional __dir__ for this object.
+        """
         rv = set()
         for accessor in self._accessors:
             try:
@@ -33,7 +37,7 @@ def _dir_additions(self):
     def __dir__(self):
         """
         Provide method name lookup and completion
-        Only provide 'public' methods
+        Only provide 'public' methods.
         """
         rv = set(dir(type(self)))
         rv = (rv - self._dir_deletions()) | self._dir_additions()
@@ -42,7 +46,7 @@ def __dir__(self):

 class PandasDelegate(object):
     """
-    an abstract base class for delegating methods/properties
+    An abstract base class for delegating methods/properties.
     """

     def _delegate_property_get(self, name, *args, **kwargs):
@@ -65,10 +69,10 @@ def _add_delegate_accessors(cls, delegate, accessors, typ,
         ----------
         cls : the class to add the methods/properties to
         delegate : the class to get methods/properties & doc-strings
-        acccessors : string list of accessors to add
+        accessors : string list of accessors to add
         typ : 'property' or 'method'
         overwrite : boolean, default False
-           overwrite the method/property in the target class if it exists
+           overwrite the method/property in the target class if it exists.
         """

         def _create_delegator_property(name):
@@ -117,7 +121,7 @@ def delegate_names(delegate, accessors, typ, overwrite=False):
     ----------
     delegate : object
         the class to get methods/properties & doc-strings
-    acccessors : Sequence[str]
+    accessors : Sequence[str]
         List of accessor to add
     typ : {'property', 'method'}
     overwrite : boolean, default False
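For context on the `__dir__` docstrings touched in this file: the method computes the advertised names as "type names minus deletions, plus additions". A self-contained sketch of that set arithmetic is below; the class `DirNamesSketch` and the attribute names `_hidden`, `_extras`, `legacy` and `plot` are invented for illustration and are not part of pandas.

```python
class DirNamesSketch:
    """Toy class showing how __dir__ can hide and add completion names."""

    # Hypothetical name sets standing in for deletions and additions.
    _hidden = frozenset({"legacy"})
    _extras = frozenset({"plot"})

    def legacy(self):
        # Still callable; only hidden from dir() and tab completion.
        return "still works"

    def __dir__(self):
        names = set(dir(type(self)))
        names = (names - self._hidden) | self._extras
        return sorted(names)
```

Hiding a name from `dir()` affects only discovery: `DirNamesSketch().legacy()` remains callable even though `"legacy"` no longer appears in the listing.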

pandas/core/arrays/categorical.py

+2-5
@@ -2321,8 +2321,7 @@ def _values_for_factorize(self):
     @classmethod
     def _from_factorized(cls, uniques, original):
         return original._constructor(original.categories.take(uniques),
-                                     categories=original.categories,
-                                     ordered=original.ordered)
+                                     dtype=original.dtype)

     def equals(self, other):
         """
@@ -2674,9 +2673,7 @@ def _factorize_from_iterable(values):
     if is_categorical(values):
         if isinstance(values, (ABCCategoricalIndex, ABCSeries)):
             values = values._values
-        categories = CategoricalIndex(values.categories,
-                                      categories=values.categories,
-                                      ordered=values.ordered)
+        categories = CategoricalIndex(values.categories, dtype=values.dtype)
         codes = values.codes
     else:
         # The value of ordered is irrelevant since we don't use cat as such,

pandas/core/base.py

+1-1
@@ -1234,7 +1234,7 @@ def value_counts(self, normalize=False, sort=True, ascending=False,
             If True then the object returned will contain the relative
             frequencies of the unique values.
         sort : boolean, default True
-            Sort by values.
+            Sort by frequencies.
         ascending : boolean, default False
             Sort in ascending order.
         bins : integer, optional
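The docstring correction above ("Sort by frequencies" rather than "Sort by values") can be illustrated with a standard-library sketch. The function below is a hypothetical stand-in built on `collections.Counter`, returning `(value, count)` pairs; it only mirrors the documented meaning of `sort` and `ascending`, not the pandas implementation.

```python
from collections import Counter


def value_counts(values, sort=True, ascending=False):
    """Hypothetical sketch: count values, optionally ordering by count."""
    # Counter keeps first-seen order of the distinct values.
    items = list(Counter(values).items())
    if sort:
        # Order by the count (the frequency), not by the value itself.
        items.sort(key=lambda pair: pair[1], reverse=not ascending)
    return items
```

For example, `value_counts(["a", "b", "b"])` puts `"b"` first because it is more frequent, even though `"a"` sorts first as a value.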
