Commit ca286f7
Merge remote-tracking branch 'upstream/master' into ea-unstack
2 parents: a9e6263 + 1651a10

19 files changed: +93 −149 lines

ci/requirements-optional-conda.txt
Lines changed: 2 additions & 2 deletions

@@ -1,7 +1,7 @@
 beautifulsoup4>=4.2.1
 blosc
 bottleneck>=1.2.0
-fastparquet
+fastparquet>=0.1.2
 gcsfs
 html5lib
 ipython>=5.6.0
@@ -12,7 +12,7 @@ matplotlib>=2.0.0
 nbsphinx
 numexpr>=2.6.1
 openpyxl
-pyarrow>=0.4.1
+pyarrow>=0.7.0
 pymysql
 pytables>=3.4.2
 pytest-cov

ci/requirements-optional-pip.txt
Lines changed: 4 additions & 4 deletions

@@ -3,7 +3,7 @@
 beautifulsoup4>=4.2.1
 blosc
 bottleneck>=1.2.0
-fastparquet
+fastparquet>=0.1.2
 gcsfs
 html5lib
 ipython>=5.6.0
@@ -14,9 +14,9 @@ matplotlib>=2.0.0
 nbsphinx
 numexpr>=2.6.1
 openpyxl
-pyarrow>=0.4.1
+pyarrow>=0.7.0
 pymysql
-tables
+pytables>=3.4.2
 pytest-cov
 pytest-xdist
 s3fs
@@ -27,4 +27,4 @@ statsmodels
 xarray
 xlrd
 xlsxwriter
-xlwt
+xlwt

ci/travis-27.yaml
Lines changed: 1 addition & 1 deletion

@@ -22,7 +22,7 @@ dependencies:
   - patsy
   - psycopg2
   - py
-  - pyarrow=0.4.1
+  - pyarrow=0.7.0
   - PyCrypto
   - pymysql=0.6.3
   - pytables

doc/source/install.rst
Lines changed: 2 additions & 2 deletions

@@ -258,8 +258,8 @@ Optional Dependencies
 * `SciPy <http://www.scipy.org>`__: miscellaneous statistical functions, Version 0.18.1 or higher
 * `xarray <http://xarray.pydata.org>`__: pandas like handling for > 2 dims, needed for converting Panels to xarray objects. Version 0.7.0 or higher is recommended.
 * `PyTables <http://www.pytables.org>`__: necessary for HDF5-based storage, Version 3.4.2 or higher
-* `pyarrow <http://arrow.apache.org/docs/python/>`__ (>= 0.4.1): necessary for feather-based storage.
-* `Apache Parquet <https://parquet.apache.org/>`__, either `pyarrow <http://arrow.apache.org/docs/python/>`__ (>= 0.4.1) or `fastparquet <https://fastparquet.readthedocs.io/en/latest>`__ (>= 0.0.6) for parquet-based storage. The `snappy <https://pypi.org/project/python-snappy>`__ and `brotli <https://pypi.org/project/brotlipy>`__ are available for compression support.
+* `pyarrow <http://arrow.apache.org/docs/python/>`__ (>= 0.7.0): necessary for feather-based storage.
+* `Apache Parquet <https://parquet.apache.org/>`__, either `pyarrow <http://arrow.apache.org/docs/python/>`__ (>= 0.7.0) or `fastparquet <https://fastparquet.readthedocs.io/en/latest>`__ (>= 0.1.2) for parquet-based storage. The `snappy <https://pypi.org/project/python-snappy>`__ and `brotli <https://pypi.org/project/brotlipy>`__ are available for compression support.
 * `SQLAlchemy <http://www.sqlalchemy.org>`__: for SQL database support. Version 0.8.1 or higher recommended. Besides SQLAlchemy, you also need a database specific driver. You can find an overview of supported drivers for each SQL dialect in the `SQLAlchemy docs <http://docs.sqlalchemy.org/en/latest/dialects/index.html>`__. Some common drivers are:
 
 * `psycopg2 <http://initd.org/psycopg/>`__: for PostgreSQL

doc/source/whatsnew/v0.24.0.txt
Lines changed: 6 additions & 1 deletion

@@ -250,7 +250,7 @@ Backwards incompatible API changes
 Dependencies have increased minimum versions
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-We have updated our minimum supported versions of dependencies (:issue:`21242`).
+We have updated our minimum supported versions of dependencies (:issue:`21242`, `18742`).
 If installed, we now require:
 
 +-----------------+-----------------+----------+
@@ -268,6 +268,10 @@ If installed, we now require:
 +-----------------+-----------------+----------+
 | scipy           | 0.18.1          |          |
 +-----------------+-----------------+----------+
+| pyarrow         | 0.7.0           |          |
++-----------------+-----------------+----------+
+| fastparquet     | 0.1.2           |          |
++-----------------+-----------------+----------+
 
 Additionally we no longer depend on `feather-format` for feather based storage
 and replaced it with references to `pyarrow` (:issue:`21639` and :issue:`23053`).
@@ -1211,6 +1215,7 @@ Indexing
 - :class:`Index` no longer mangles ``None``, ``NaN`` and ``NaT``, i.e. they are treated as three different keys. However, for numeric Index all three are still coerced to a ``NaN`` (:issue:`22332`)
 - Bug in `scalar in Index` if scalar is a float while the ``Index`` is of integer dtype (:issue:`22085`)
 - Bug in `MultiIndex.set_levels` when levels value is not subscriptable (:issue:`23273`)
+- Bug where setting a timedelta column by ``Index`` causes it to be casted to double, and therefore lose precision (:issue:`23511`)
 
 Missing
 ^^^^^^^

pandas/core/internals/blocks.py
Lines changed: 2 additions & 2 deletions

@@ -2173,9 +2173,9 @@ def _box_func(self):
     def _can_hold_element(self, element):
         tipo = maybe_infer_dtype_type(element)
         if tipo is not None:
-            return issubclass(tipo.type, np.timedelta64)
+            return issubclass(tipo.type, (np.timedelta64, np.int64))
         return is_integer(element) or isinstance(
-            element, (timedelta, np.timedelta64))
+            element, (timedelta, np.timedelta64, np.int64))
 
     def fillna(self, value, **kwargs):
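The intent of that two-line change can be sketched outside pandas internals: the timedelta block's element check now accepts int64 values (interpreted as nanosecond counts) alongside timedelta64, so setting a timedelta column no longer falls back to a lossy cast to float64. The helper below is a simplified, hypothetical stand-in for ``_can_hold_element``, not the real method:

```python
from datetime import timedelta

import numpy as np


def can_hold_timedelta_element(element):
    # Simplified stand-in for TimeDeltaBlock._can_hold_element after this
    # change: np.int64 data (nanosecond counts) is accepted alongside
    # timedelta64, avoiding an upcast to double that loses precision.
    if isinstance(element, np.ndarray):
        return issubclass(element.dtype.type, (np.timedelta64, np.int64))
    return isinstance(element, (int, timedelta, np.timedelta64))


assert can_hold_timedelta_element(np.array([1, 2], dtype=np.int64))
assert can_hold_timedelta_element(timedelta(seconds=1))
assert not can_hold_timedelta_element(np.array([1.5], dtype=np.float64))
```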

pandas/io/parquet.py
Lines changed: 13 additions & 65 deletions

@@ -5,7 +5,7 @@
 
 from pandas.compat import string_types
 
-from pandas import DataFrame, Int64Index, RangeIndex, get_option
+from pandas import DataFrame, get_option
 import pandas.core.common as com
 
 from pandas.io.common import get_filepath_or_buffer, is_s3_url
@@ -89,57 +89,38 @@ def __init__(self):
             "\nor via pip\n"
             "pip install -U pyarrow\n"
         )
-        if LooseVersion(pyarrow.__version__) < '0.4.1':
+        if LooseVersion(pyarrow.__version__) < '0.7.0':
             raise ImportError(
-                "pyarrow >= 0.4.1 is required for parquet support\n\n"
+                "pyarrow >= 0.7.0 is required for parquet support\n\n"
                 "you can install via conda\n"
                 "conda install pyarrow -c conda-forge\n"
                 "\nor via pip\n"
                 "pip install -U pyarrow\n"
             )
 
-        self._pyarrow_lt_060 = (
-            LooseVersion(pyarrow.__version__) < LooseVersion('0.6.0'))
-        self._pyarrow_lt_070 = (
-            LooseVersion(pyarrow.__version__) < LooseVersion('0.7.0'))
-
         self.api = pyarrow
 
     def write(self, df, path, compression='snappy',
               coerce_timestamps='ms', index=None, **kwargs):
         self.validate_dataframe(df)
-
-        # Only validate the index if we're writing it.
-        if self._pyarrow_lt_070 and index is not False:
-            self._validate_write_lt_070(df)
         path, _, _, _ = get_filepath_or_buffer(path, mode='wb')
 
         if index is None:
             from_pandas_kwargs = {}
         else:
             from_pandas_kwargs = {'preserve_index': index}
 
-        if self._pyarrow_lt_060:
-            table = self.api.Table.from_pandas(df, timestamps_to_ms=True,
-                                               **from_pandas_kwargs)
-            self.api.parquet.write_table(
-                table, path, compression=compression, **kwargs)
-
-        else:
-            table = self.api.Table.from_pandas(df, **from_pandas_kwargs)
-            self.api.parquet.write_table(
-                table, path, compression=compression,
-                coerce_timestamps=coerce_timestamps, **kwargs)
+        table = self.api.Table.from_pandas(df, **from_pandas_kwargs)
+        self.api.parquet.write_table(
+            table, path, compression=compression,
+            coerce_timestamps=coerce_timestamps, **kwargs)
 
     def read(self, path, columns=None, **kwargs):
         path, _, _, should_close = get_filepath_or_buffer(path)
-        if self._pyarrow_lt_070:
-            result = self.api.parquet.read_pandas(path, columns=columns,
-                                                  **kwargs).to_pandas()
-        else:
-            kwargs['use_pandas_metadata'] = True
-            result = self.api.parquet.read_table(path, columns=columns,
-                                                 **kwargs).to_pandas()
+
+        kwargs['use_pandas_metadata'] = True
+        result = self.api.parquet.read_table(path, columns=columns,
+                                             **kwargs).to_pandas()
         if should_close:
             try:
                 path.close()
@@ -148,39 +129,6 @@ def read(self, path, columns=None, **kwargs):
 
         return result
 
-    def _validate_write_lt_070(self, df):
-        # Compatibility shim for pyarrow < 0.7.0
-        # TODO: Remove in pandas 0.23.0
-        from pandas.core.indexes.multi import MultiIndex
-        if isinstance(df.index, MultiIndex):
-            msg = (
-                "Multi-index DataFrames are only supported "
-                "with pyarrow >= 0.7.0"
-            )
-            raise ValueError(msg)
-        # Validate index
-        if not isinstance(df.index, Int64Index):
-            msg = (
-                "pyarrow < 0.7.0 does not support serializing {} for the "
-                "index; you can .reset_index() to make the index into "
-                "column(s), or install the latest version of pyarrow or "
-                "fastparquet."
-            )
-            raise ValueError(msg.format(type(df.index)))
-        if not df.index.equals(RangeIndex(len(df))):
-            raise ValueError(
-                "pyarrow < 0.7.0 does not support serializing a non-default "
-                "index; you can .reset_index() to make the index into "
-                "column(s), or install the latest version of pyarrow or "
-                "fastparquet."
-            )
-        if df.index.name is not None:
-            raise ValueError(
-                "pyarrow < 0.7.0 does not serialize indexes with a name; you "
-                "can set the index.name to None or install the latest version "
-                "of pyarrow or fastparquet."
-            )
-
 
 class FastParquetImpl(BaseImpl):
 
@@ -197,9 +145,9 @@ def __init__(self):
             "\nor via pip\n"
             "pip install -U fastparquet"
         )
-        if LooseVersion(fastparquet.__version__) < '0.1.0':
+        if LooseVersion(fastparquet.__version__) < '0.1.2':
             raise ImportError(
-                "fastparquet >= 0.1.0 is required for parquet "
+                "fastparquet >= 0.1.2 is required for parquet "
                 "support\n\n"
                 "you can install via conda\n"
                 "conda install fastparquet -c conda-forge\n"

pandas/tests/arrays/categorical/test_missing.py
Lines changed: 4 additions & 2 deletions

@@ -4,11 +4,13 @@
 import numpy as np
 import pytest
 
-import pandas.util.testing as tm
-from pandas import Categorical, Index, isna
 from pandas.compat import lrange
+
 from pandas.core.dtypes.dtypes import CategoricalDtype
 
+from pandas import Categorical, Index, isna
+import pandas.util.testing as tm
+
 
 class TestCategoricalMissing(object):

pandas/tests/arrays/categorical/test_sorting.py
Lines changed: 1 addition & 1 deletion

@@ -2,8 +2,8 @@
 
 import numpy as np
 
-import pandas.util.testing as tm
 from pandas import Categorical, Index
+import pandas.util.testing as tm
 
 
 class TestCategoricalSort(object):

pandas/tests/arrays/test_datetimelike.py
Lines changed: 2 additions & 3 deletions

@@ -3,10 +3,9 @@
 import pytest
 
 import pandas as pd
-import pandas.util.testing as tm
 from pandas.core.arrays import (
-    DatetimeArrayMixin, PeriodArray, TimedeltaArrayMixin
-)
+    DatetimeArrayMixin, PeriodArray, TimedeltaArrayMixin)
+import pandas.util.testing as tm
 
 
 # TODO: more freq variants

pandas/tests/arrays/test_integer.py
Lines changed: 4 additions & 4 deletions

@@ -2,16 +2,16 @@
 import numpy as np
 import pytest
 
+from pandas.core.dtypes.generic import ABCIndexClass
+
 import pandas as pd
-import pandas.util.testing as tm
 from pandas.api.types import is_float, is_float_dtype, is_integer, is_scalar
 from pandas.core.arrays import IntegerArray, integer_array
 from pandas.core.arrays.integer import (
     Int8Dtype, Int16Dtype, Int32Dtype, Int64Dtype, UInt8Dtype, UInt16Dtype,
-    UInt32Dtype, UInt64Dtype
-)
-from pandas.core.dtypes.generic import ABCIndexClass
+    UInt32Dtype, UInt64Dtype)
 from pandas.tests.extension.base import BaseOpsUtil
+import pandas.util.testing as tm
 
 
 def make_data():

pandas/tests/indexing/test_chaining_and_caching.py
Lines changed: 17 additions & 6 deletions

@@ -337,13 +337,24 @@ def f():
         df2['y'] = ['g', 'h', 'i']
 
     def test_detect_chained_assignment_warnings(self):
+        with option_context("chained_assignment", "warn"):
+            df = DataFrame({"A": ["aaa", "bbb", "ccc"], "B": [1, 2, 3]})
 
-        # warnings
-        with option_context('chained_assignment', 'warn'):
-            df = DataFrame({'A': ['aaa', 'bbb', 'ccc'], 'B': [1, 2, 3]})
-            with tm.assert_produces_warning(
-                    expected_warning=com.SettingWithCopyWarning):
-                df.loc[0]['A'] = 111
+            with tm.assert_produces_warning(com.SettingWithCopyWarning):
+                df.loc[0]["A"] = 111
+
+    def test_detect_chained_assignment_warnings_filter_and_dupe_cols(self):
+        # xref gh-13017.
+        with option_context("chained_assignment", "warn"):
+            df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, -9]],
+                              columns=["a", "a", "c"])
+
+            with tm.assert_produces_warning(com.SettingWithCopyWarning):
+                df.c.loc[df.c > 0] = None
+
+            expected = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, -9]],
+                                    columns=["a", "a", "c"])
+            tm.assert_frame_equal(df, expected)
 
     def test_chained_getitem_with_lists(self):
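The behavior these tests exercise is easy to reproduce standalone: a chained ``df.loc[0]['A'] = ...`` writes through an intermediate object that is typically a copy, so the assignment never reaches the original frame, which is why pandas warns. A minimal illustration (warnings are suppressed here so the snippet runs quietly on any pandas version):

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": ["aaa", "bbb", "ccc"], "B": [1, 2, 3]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning
    # Chained assignment: df.loc[0] materializes an intermediate Series
    # (a copy for this mixed-dtype frame), so the write lands on the copy.
    df.loc[0]["A"] = 111

assert df.loc[0, "A"] == "aaa"  # the original frame is unchanged
```

The single-step form ``df.loc[0, "A"] = 111`` writes directly into the frame and is the recommended spelling.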

pandas/tests/indexing/test_timedelta.py
Lines changed: 15 additions & 0 deletions

@@ -80,3 +80,18 @@ def test_numpy_timedelta_scalar_indexing(self, start, stop,
         result = s.loc[slice(start, stop)]
         expected = s.iloc[expected_slice]
         tm.assert_series_equal(result, expected)
+
+    def test_roundtrip_thru_setitem(self):
+        # PR 23462
+        dt1 = pd.Timedelta(0)
+        dt2 = pd.Timedelta(28767471428571405)
+        df = pd.DataFrame({'dt': pd.Series([dt1, dt2])})
+        df_copy = df.copy()
+        s = pd.Series([dt1])
+
+        expected = df['dt'].iloc[1].value
+        df.loc[[True, False]] = s
+        result = df['dt'].iloc[1].value
+
+        assert expected == result
+        tm.assert_frame_equal(df, df_copy)
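The magic constant in that test is chosen so that float64 cannot hold it: 28767471428571405 nanoseconds exceeds 2**53, the end of the range in which a double represents every integer exactly. Before this fix, setting the timedelta column routed the data through a cast to double and silently rounded the value. The mechanism can be checked with plain numpy, no pandas internals:

```python
import numpy as np

ns = 28767471428571405        # nanosecond count from the test, > 2**53
as_double = np.float64(ns)    # the lossy cast the bug used to perform

assert ns > 2**53
assert int(as_double) != ns   # the double round-trip loses the low bits

# Keeping the value as int64 and viewing it as timedelta64[ns] is exact:
assert np.int64(ns).astype("timedelta64[ns]") == np.timedelta64(ns, "ns")
```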
