
Commit 2f28638

Merge remote-tracking branch 'upstream/master' into ea-unstack
2 parents: ca286f7 + bd98841


42 files changed: +419 -216 lines

ci/code_checks.sh

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ if [[ -z "$CHECK" || "$CHECK" == "lint" ]]; then
     # Note: this grep pattern is (intended to be) equivalent to the python
     # regex r'(?<![ ->])> '
     MSG='Linting .pyx code for spacing conventions in casting' ; echo $MSG
-    ! grep -r -E --include '*.pyx' --include '*.pxi.in' '> ' pandas/_libs | grep -v '[ ->]> '
+    ! grep -r -E --include '*.pyx' --include '*.pxi.in' '[a-zA-Z0-9*]> ' pandas/_libs
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     # readability/casting: Warnings about C casting instead of C++ casting
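A minimal sketch (not part of the commit) of what the tightened pattern catches, approximating the new grep expression with Python's re module; the sample lines are hypothetical:

    import re

    # '[a-zA-Z0-9*]> ' flags a '>' that closes a Cython cast and is followed by a
    # space, e.g. '<int64_t*> malloc(...)'; comparisons like 'x > y' are not
    # flagged because the character before '>' is a space.
    BAD_CAST_SPACING = re.compile(r"[a-zA-Z0-9*]> ")

    samples = [
        "vecs[i] = <int64_t*> cnp.PyArray_DATA(arr)",   # flagged (space after cast)
        "vecs[i] = <int64_t*>cnp.PyArray_DATA(arr)",    # clean
        "if left > right:",                             # ignored
    ]
    for line in samples:
        print(bool(BAD_CAST_SPACING.search(line)), line)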

ci/travis-37-numpydev.yaml

Lines changed: 1 addition & 1 deletion
@@ -13,5 +13,5 @@ dependencies:
   - "git+git://github.com/dateutil/dateutil.git"
   - "-f https://7933911d6844c6c53a7d-47bd50c35cd79bd838daf386af554a83.ssl.cf2.rackcdn.com"
   - "--pre"
-  - "numpy<=1.16.0.dev0+20181015190246"
+  - "numpy"
   - "scipy"

doc/source/whatsnew/v0.24.0.txt

Lines changed: 8 additions & 2 deletions
@@ -1047,7 +1047,7 @@ Removal of prior version deprecations/changes
 Performance Improvements
 ~~~~~~~~~~~~~~~~~~~~~~~~

-- Slicing Series and Dataframes with an monotonically increasing :class:`CategoricalIndex`
+- Slicing Series and DataFrames with an monotonically increasing :class:`CategoricalIndex`
   is now very fast and has speed comparable to slicing with an ``Int64Index``.
   The speed increase is both when indexing by label (using .loc) and position(.iloc) (:issue:`20395`)
   Slicing a monotonically increasing :class:`CategoricalIndex` itself (i.e. ``ci[1000:2000]``)
@@ -1119,6 +1119,7 @@ Datetimelike
 - Bug in :func:`DataFrame.combine` with datetimelike values raising a TypeError (:issue:`23079`)
 - Bug in :func:`date_range` with frequency of ``Day`` or higher where dates sufficiently far in the future could wrap around to the past instead of raising ``OutOfBoundsDatetime`` (:issue:`14187`)
 - Bug in :class:`PeriodIndex` with attribute ``freq.n`` greater than 1 where adding a :class:`DateOffset` object would return incorrect results (:issue:`23215`)
+- Bug in :class:`Series` that interpreted string indices as lists of characters when setting datetimelike values (:issue:`23451`)

 Timedelta
 ^^^^^^^^^
@@ -1132,6 +1133,8 @@ Timedelta
 - Fixed bug in adding a :class:`DataFrame` with all-`timedelta64[ns]` dtypes to a :class:`DataFrame` with all-integer dtypes returning incorrect results instead of raising ``TypeError`` (:issue:`22696`)
 - Bug in :class:`TimedeltaIndex` where adding a timezone-aware datetime scalar incorrectly returned a timezone-naive :class:`DatetimeIndex` (:issue:`23215`)
 - Bug in :class:`TimedeltaIndex` where adding ``np.timedelta64('NaT')`` incorrectly returned an all-`NaT` :class:`DatetimeIndex` instead of an all-`NaT` :class:`TimedeltaIndex` (:issue:`23215`)
+- Bug in :class:`Timedelta` and :func:`to_timedelta()` have inconsistencies in supported unit string (:issue:`21762`)
+

 Timezones
 ^^^^^^^^^
@@ -1149,7 +1152,7 @@ Timezones
 - Fixed bug where :meth:`DataFrame.describe` and :meth:`Series.describe` on tz-aware datetimes did not show `first` and `last` result (:issue:`21328`)
 - Bug in :class:`DatetimeIndex` comparisons failing to raise ``TypeError`` when comparing timezone-aware ``DatetimeIndex`` against ``np.datetime64`` (:issue:`22074`)
 - Bug in ``DataFrame`` assignment with a timezone-aware scalar (:issue:`19843`)
-- Bug in :func:`Dataframe.asof` that raised a ``TypeError`` when attempting to compare tz-naive and tz-aware timestamps (:issue:`21194`)
+- Bug in :func:`DataFrame.asof` that raised a ``TypeError`` when attempting to compare tz-naive and tz-aware timestamps (:issue:`21194`)
 - Bug when constructing a :class:`DatetimeIndex` with :class:`Timestamp`s constructed with the ``replace`` method across DST (:issue:`18785`)
 - Bug when setting a new value with :meth:`DataFrame.loc` with a :class:`DatetimeIndex` with a DST transition (:issue:`18308`, :issue:`20724`)
 - Bug in :meth:`DatetimeIndex.unique` that did not re-localize tz-aware dates correctly (:issue:`21737`)
@@ -1282,6 +1285,7 @@ Notice how we now instead output ``np.nan`` itself instead of a stringified form
 - Bug in :func:`to_string()` that broke column alignment when ``index=False`` and width of first column's values is greater than the width of first column's header (:issue:`16839`, :issue:`13032`)
 - Bug in :func:`DataFrame.to_csv` where a single level MultiIndex incorrectly wrote a tuple. Now just the value of the index is written (:issue:`19589`).
 - Bug in :meth:`HDFStore.append` when appending a :class:`DataFrame` with an empty string column and ``min_itemsize`` < 8 (:issue:`12242`)
+- Bug in :meth:`read_csv()` in which :class:`MultiIndex` index names were being improperly handled in the cases when they were not provided (:issue:`23484`)

 Plotting
 ^^^^^^^^
@@ -1305,13 +1309,15 @@ Groupby/Resample/Rolling
 - :func:`RollingGroupby.agg` and :func:`ExpandingGroupby.agg` now support multiple aggregation functions as parameters (:issue:`15072`)
 - Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` when resampling by a weekly offset (``'W'``) across a DST transition (:issue:`9119`, :issue:`21459`)
 - Bug in :meth:`DataFrame.expanding` in which the ``axis`` argument was not being respected during aggregations (:issue:`23372`)
+- Bug in :meth:`pandas.core.groupby.DataFrameGroupBy.transform` which caused missing values when the input function can accept a :class:`DataFrame` but renames it (:issue:`23455`).

 Reshaping
 ^^^^^^^^^

 - Bug in :func:`pandas.concat` when joining resampled DataFrames with timezone aware index (:issue:`13783`)
 - Bug in :meth:`Series.combine_first` with ``datetime64[ns, tz]`` dtype which would return tz-naive result (:issue:`21469`)
 - Bug in :meth:`Series.where` and :meth:`DataFrame.where` with ``datetime64[ns, tz]`` dtype (:issue:`21546`)
+- Bug in :meth:`DataFrame.where` with an empty DataFrame and empty ``cond`` having non-bool dtype (:issue:`21947`)
 - Bug in :meth:`Series.mask` and :meth:`DataFrame.mask` with ``list`` conditionals (:issue:`21891`)
 - Bug in :meth:`DataFrame.replace` raises RecursionError when converting OutOfBounds ``datetime64[ns, tz]`` (:issue:`20380`)
 - :func:`pandas.core.groupby.GroupBy.rank` now raises a ``ValueError`` when an invalid value is passed for argument ``na_option`` (:issue:`22124`)
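For the :class:`Timedelta` / :func:`to_timedelta` entry above (:issue:`21762`), a hedged illustration of the kind of consistency the note describes, with the scalar constructor and the top-level parser agreeing on the same unit alias:

    import pandas as pd

    # With consistent unit-string handling, both accept the alias "d" (days)
    # and produce the same value.
    assert pd.Timedelta(1, unit="d") == pd.to_timedelta(1, unit="d")
    print(pd.Timedelta(1, unit="d"))   # 1 days 00:00:00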

pandas/_libs/algos.pyx

Lines changed: 4 additions & 4 deletions
@@ -128,11 +128,11 @@ def is_lexsorted(list_of_arrays: list) -> bint:
     nlevels = len(list_of_arrays)
     n = len(list_of_arrays[0])

-    cdef int64_t **vecs = <int64_t**> malloc(nlevels * sizeof(int64_t*))
+    cdef int64_t **vecs = <int64_t**>malloc(nlevels * sizeof(int64_t*))
     for i in range(nlevels):
         arr = list_of_arrays[i]
         assert arr.dtype.name == 'int64'
-        vecs[i] = <int64_t*> cnp.PyArray_DATA(arr)
+        vecs[i] = <int64_t*>cnp.PyArray_DATA(arr)

     # Assume uniqueness??
     with nogil:
@@ -409,7 +409,7 @@ def pad(ndarray[algos_t] old, ndarray[algos_t] new, limit=None):
     nleft = len(old)
     nright = len(new)
     indexer = np.empty(nright, dtype=np.int64)
-    indexer.fill(-1)
+    indexer[:] = -1

     if limit is None:
         lim = nright
@@ -607,7 +607,7 @@ def backfill(ndarray[algos_t] old, ndarray[algos_t] new, limit=None):
     nleft = len(old)
     nright = len(new)
     indexer = np.empty(nright, dtype=np.int64)
-    indexer.fill(-1)
+    indexer[:] = -1

     if limit is None:
         lim = nright
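The indexer changes above swap ndarray.fill for slice assignment; for a plain NumPy array the two idioms produce the same result, as this small sketch (not from the commit) shows:

    import numpy as np

    # Initialize a -1 "not found" indexer both ways and compare.
    a = np.empty(5, dtype=np.int64)
    a[:] = -1            # style used after this commit
    b = np.empty(5, dtype=np.int64)
    b.fill(-1)           # style used before this commit
    assert (a == b).all()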

pandas/_libs/algos_rank_helper.pxi.in

Lines changed: 2 additions & 2 deletions
@@ -4,9 +4,9 @@ Template for each `dtype` helper function for rank
 WARNING: DO NOT edit .pxi FILE directly, .pxi is generated from .pxi.in
 """

-#----------------------------------------------------------------------
+# ----------------------------------------------------------------------
 # rank_1d, rank_2d
-#----------------------------------------------------------------------
+# ----------------------------------------------------------------------

 {{py:


pandas/_libs/groupby.pyx

Lines changed: 3 additions & 3 deletions
@@ -44,7 +44,7 @@ cdef inline float64_t median_linear(float64_t* a, int n) nogil:
     if na_count == n:
         return NaN

-    tmp = <float64_t*> malloc((n - na_count) * sizeof(float64_t))
+    tmp = <float64_t*>malloc((n - na_count) * sizeof(float64_t))

     j = 0
     for i in range(n):
@@ -121,7 +121,7 @@ def group_median_float64(ndarray[float64_t, ndim=2] out,
     counts[:] = _counts[1:]

     data = np.empty((K, N), dtype=np.float64)
-    ptr = <float64_t*> cnp.PyArray_DATA(data)
+    ptr = <float64_t*>cnp.PyArray_DATA(data)

     take_2d_axis1_float64_float64(values.T, indexer, out=data)

@@ -370,7 +370,7 @@ def group_any_all(ndarray[uint8_t] out,
     else:
         raise ValueError("'bool_func' must be either 'any' or 'all'!")

-    out.fill(1 - flag_val)
+    out[:] = 1 - flag_val

     with nogil:
         for i in range(N):

pandas/_libs/groupby_helper.pxi.in

Lines changed: 11 additions & 11 deletions
@@ -8,9 +8,9 @@ cdef extern from "numpy/npy_math.h":
     double NAN "NPY_NAN"
 _int64_max = np.iinfo(np.int64).max

-#----------------------------------------------------------------------
+# ----------------------------------------------------------------------
 # group_add, group_prod, group_var, group_mean, group_ohlc
-#----------------------------------------------------------------------
+# ----------------------------------------------------------------------

 {{py:

@@ -246,7 +246,7 @@ def group_ohlc_{{name}}(ndarray[{{c_type}}, ndim=2] out,
     if K > 1:
         raise NotImplementedError("Argument 'values' must have only "
                                   "one dimension")
-    out.fill(np.nan)
+    out[:] = np.nan

     with nogil:
         for i in range(N):
@@ -629,10 +629,10 @@ def group_max(ndarray[groupby_t, ndim=2] out,
     maxx = np.empty_like(out)
     if groupby_t is int64_t:
         # Note: evaluated at compile-time
-        maxx.fill(-_int64_max)
+        maxx[:] = -_int64_max
         nan_val = iNaT
     else:
-        maxx.fill(-np.inf)
+        maxx[:] = -np.inf
         nan_val = NAN

     N, K = (<object>values).shape
@@ -691,10 +691,10 @@ def group_min(ndarray[groupby_t, ndim=2] out,

     minx = np.empty_like(out)
     if groupby_t is int64_t:
-        minx.fill(_int64_max)
+        minx[:] = _int64_max
         nan_val = iNaT
     else:
-        minx.fill(np.inf)
+        minx[:] = np.inf
         nan_val = NAN

     N, K = (<object>values).shape
@@ -747,9 +747,9 @@ def group_cummin(ndarray[groupby_t, ndim=2] out,
     N, K = (<object>values).shape
     accum = np.empty_like(values)
     if groupby_t is int64_t:
-        accum.fill(_int64_max)
+        accum[:] = _int64_max
     else:
-        accum.fill(np.inf)
+        accum[:] = np.inf

     with nogil:
         for i in range(N):
@@ -795,9 +795,9 @@ def group_cummax(ndarray[groupby_t, ndim=2] out,
     N, K = (<object>values).shape
     accum = np.empty_like(values)
     if groupby_t is int64_t:
-        accum.fill(-_int64_max)
+        accum[:] = -_int64_max
     else:
-        accum.fill(-np.inf)
+        accum[:] = -np.inf

     with nogil:
         for i in range(N):
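A short sketch (not part of the commit) of the accumulator initialization pattern visible in group_max/group_min above: integer outputs start from the extreme int64 value and float outputs from +/-inf, so any non-missing observation replaces the starting value:

    import numpy as np

    _int64_max = np.iinfo(np.int64).max

    # group_max-style start values: real observations are larger than these.
    maxx_int = np.empty((2, 3), dtype=np.int64)
    maxx_int[:] = -_int64_max
    maxx_float = np.empty((2, 3), dtype=np.float64)
    maxx_float[:] = -np.inf

    # group_min-style start values: real observations are smaller than these.
    minx_int = np.empty((2, 3), dtype=np.int64)
    minx_int[:] = _int64_max
    minx_float = np.empty((2, 3), dtype=np.float64)
    minx_float[:] = np.inf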

pandas/_libs/hashing.pyx

Lines changed: 2 additions & 2 deletions
@@ -54,8 +54,8 @@ def hash_object_array(object[:] arr, object key, object encoding='utf8'):
     n = len(arr)

     # create an array of bytes
-    vecs = <char **> malloc(n * sizeof(char *))
-    lens = <uint64_t*> malloc(n * sizeof(uint64_t))
+    vecs = <char **>malloc(n * sizeof(char *))
+    lens = <uint64_t*>malloc(n * sizeof(uint64_t))

     for i in range(n):
         val = arr[i]

pandas/_libs/hashtable_class_helper.pxi.in

Lines changed: 5 additions & 5 deletions
@@ -590,13 +590,13 @@ cdef class StringHashTable(HashTable):
        cdef:
            Py_ssize_t i, n = len(values)
            ndarray[int64_t] labels = np.empty(n, dtype=np.int64)
-           int64_t *resbuf = <int64_t*> labels.data
+           int64_t *resbuf = <int64_t*>labels.data
            khiter_t k
            kh_str_t *table = self.table
            const char *v
            const char **vecs

-       vecs = <const char **> malloc(n * sizeof(char *))
+       vecs = <const char **>malloc(n * sizeof(char *))
        for i in range(n):
            val = values[i]
            v = util.get_c_string(val)
@@ -639,7 +639,7 @@ cdef class StringHashTable(HashTable):
            const char *v
            const char **vecs

-       vecs = <const char **> malloc(n * sizeof(char *))
+       vecs = <const char **>malloc(n * sizeof(char *))
        uindexer = np.empty(n, dtype=np.int64)
        for i in range(n):
            val = values[i]
@@ -674,7 +674,7 @@ cdef class StringHashTable(HashTable):
            int64_t[:] locs = np.empty(n, dtype=np.int64)

        # these by-definition *must* be strings
-       vecs = <char **> malloc(n * sizeof(char *))
+       vecs = <char **>malloc(n * sizeof(char *))
        for i in range(n):
            val = values[i]

@@ -707,7 +707,7 @@ cdef class StringHashTable(HashTable):
            khiter_t k

        # these by-definition *must* be strings
-       vecs = <const char **> malloc(n * sizeof(char *))
+       vecs = <const char **>malloc(n * sizeof(char *))
        for i in range(n):
            val = values[i]


pandas/_libs/join.pyx

Lines changed: 1 addition & 1 deletion
@@ -212,7 +212,7 @@ def _get_result_indexer(sorter, indexer):
     else:
         # length-0 case
         res = np.empty(len(indexer), dtype=np.int64)
-        res.fill(-1)
+        res[:] = -1

     return res


pandas/_libs/lib.pyx

Lines changed: 3 additions & 3 deletions
@@ -347,7 +347,7 @@ def get_reverse_indexer(ndarray[int64_t] indexer, Py_ssize_t length):
         int64_t idx

     rev_indexer = np.empty(length, dtype=np.int64)
-    rev_indexer.fill(-1)
+    rev_indexer[:] = -1
     for i in range(n):
         idx = indexer[i]
         if idx != -1:
@@ -1670,7 +1670,7 @@ cdef class TimedeltaValidator(TemporalValidator):


 # TODO: Not used outside of tests; remove?
-def is_timedelta_array(values: ndarray) -> bint:
+def is_timedelta_array(values: ndarray) -> bool:
     cdef:
         TimedeltaValidator validator = TimedeltaValidator(len(values),
                                                           skipna=True)
@@ -1683,7 +1683,7 @@ cdef class Timedelta64Validator(TimedeltaValidator):


 # TODO: Not used outside of tests; remove?
-def is_timedelta64_array(values: ndarray) -> bint:
+def is_timedelta64_array(values: ndarray) -> bool:
     cdef:
         Timedelta64Validator validator = Timedelta64Validator(len(values),
                                                               skipna=True)

pandas/_libs/missing.pyx

Lines changed: 2 additions & 2 deletions
@@ -278,14 +278,14 @@ def isnaobj2d_old(ndarray arr):
     return result.view(np.bool_)


-cpdef bint isposinf_scalar(object val):
+def isposinf_scalar(val: object) -> bool:
     if util.is_float_object(val) and val == INF:
         return True
     else:
         return False


-cpdef bint isneginf_scalar(object val):
+def isneginf_scalar(val: object) -> bool:
     if util.is_float_object(val) and val == NEGINF:
         return True
     else:
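The conversions above, like the bint-to-bool annotation changes in lib.pyx, expose a Python-level bool return type on plain def functions rather than the C-level bint used in cdef/cpdef signatures. A hedged pure-Python stand-in for the rewritten helpers (not the Cython implementation from the commit; isinstance replaces util.is_float_object):

    # Hypothetical equivalent for illustration only.
    INF = float("inf")
    NEGINF = -INF

    def isposinf_scalar(val: object) -> bool:
        return isinstance(val, float) and val == INF

    def isneginf_scalar(val: object) -> bool:
        return isinstance(val, float) and val == NEGINF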
