Commit 68115de

Author: MomIsBestFriend

Merge remote-tracking branch 'upstream/master' into CLN-annonate-eq

2 parents: e69277e + f855025

117 files changed: +2920 -1477 lines changed

README.md

Lines changed: 4 additions & 5 deletions
@@ -164,12 +164,11 @@ pip install pandas
 ```
 
 ## Dependencies
-- [NumPy](https://www.numpy.org): 1.13.3 or higher
-- [python-dateutil](https://labix.org/python-dateutil): 2.5.0 or higher
-- [pytz](https://pythonhosted.org/pytz): 2015.4 or higher
+- [NumPy](https://www.numpy.org)
+- [python-dateutil](https://labix.org/python-dateutil)
+- [pytz](https://pythonhosted.org/pytz)
 
-See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies)
-for recommended and optional dependencies.
+See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies) for minimum supported versions of required, recommended and optional dependencies.
 
 ## Installation from sources
 To install pandas from source you need Cython in addition to the normal

asv_bench/benchmarks/categoricals.py

Lines changed: 27 additions & 15 deletions
@@ -14,21 +14,6 @@
         pass
 
 
-class Concat:
-    def setup(self):
-        N = 10 ** 5
-        self.s = pd.Series(list("aabbcd") * N).astype("category")
-
-        self.a = pd.Categorical(list("aabbcd") * N)
-        self.b = pd.Categorical(list("bbcdjk") * N)
-
-    def time_concat(self):
-        pd.concat([self.s, self.s])
-
-    def time_union(self):
-        union_categoricals([self.a, self.b])
-
-
 class Constructor:
     def setup(self):
         N = 10 ** 5
@@ -77,6 +62,33 @@ def time_existing_series(self):
         pd.Categorical(self.series)
 
 
+class CategoricalOps:
+    params = ["__lt__", "__le__", "__eq__", "__ne__", "__ge__", "__gt__"]
+    param_names = ["op"]
+
+    def setup(self, op):
+        N = 10 ** 5
+        self.cat = pd.Categorical(list("aabbcd") * N, ordered=True)
+
+    def time_categorical_op(self, op):
+        getattr(self.cat, op)("b")
+
+
+class Concat:
+    def setup(self):
+        N = 10 ** 5
+        self.s = pd.Series(list("aabbcd") * N).astype("category")
+
+        self.a = pd.Categorical(list("aabbcd") * N)
+        self.b = pd.Categorical(list("bbcdjk") * N)
+
+    def time_concat(self):
+        pd.concat([self.s, self.s])
+
+    def time_union(self):
+        union_categoricals([self.a, self.b])
+
+
 class ValueCounts:
 
     params = [True, False]
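
Note on the new `CategoricalOps` benchmark: it times each rich comparison of an ordered `Categorical` against the scalar `"b"`. The snippet below is a small sketch of what those calls return, using a six-element array instead of the benchmark's `N = 10 ** 5` repetitions. The variable names `ops` and `results` are chosen here for illustration and are not part of the commit.

```python
import pandas as pd

# Ordered categorical; categories are inferred in sorted order: a < b < c < d.
cat = pd.Categorical(list("aabbcd"), ordered=True)

# The same dunder names the benchmark passes through `params`.
ops = ["__lt__", "__le__", "__eq__", "__ne__", "__ge__", "__gt__"]

# Each call compares every element with the scalar "b" and yields a boolean array.
results = {op: getattr(cat, op)("b").tolist() for op in ops}

for op, values in results.items():
    print(op, values)
```

Because the categorical is ordered, the inequality operators are defined; on an unordered categorical only `__eq__` and `__ne__` would be meaningful.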

azure-pipelines.yml

Lines changed: 0 additions & 89 deletions
@@ -16,95 +16,6 @@ jobs:
     name: Windows
     vmImage: vs2017-win2016
 
-- job: 'Checks'
-  pool:
-    vmImage: ubuntu-16.04
-  timeoutInMinutes: 90
-  steps:
-  - script: |
-      echo '##vso[task.prependpath]$(HOME)/miniconda3/bin'
-      echo '##vso[task.setvariable variable=ENV_FILE]environment.yml'
-      echo '##vso[task.setvariable variable=AZURE]true'
-    displayName: 'Setting environment variables'
-
-  # Do not require a conda environment
-  - script: ci/code_checks.sh patterns
-    displayName: 'Looking for unwanted patterns'
-    condition: true
-
-  - script: |
-      sudo apt-get update
-      sudo apt-get install -y libc6-dev-i386
-      ci/setup_env.sh
-    displayName: 'Setup environment and build pandas'
-    condition: true
-
-  # Do not require pandas
-  - script: |
-      source activate pandas-dev
-      ci/code_checks.sh lint
-    displayName: 'Linting'
-    condition: true
-
-  - script: |
-      source activate pandas-dev
-      ci/code_checks.sh dependencies
-    displayName: 'Dependencies consistency'
-    condition: true
-
-  # Require pandas
-  - script: |
-      source activate pandas-dev
-      ci/code_checks.sh code
-    displayName: 'Checks on imported code'
-    condition: true
-
-  - script: |
-      source activate pandas-dev
-      ci/code_checks.sh doctests
-    displayName: 'Running doctests'
-    condition: true
-
-  - script: |
-      source activate pandas-dev
-      ci/code_checks.sh docstrings
-    displayName: 'Docstring validation'
-    condition: true
-
-  - script: |
-      source activate pandas-dev
-      ci/code_checks.sh typing
-    displayName: 'Typing validation'
-    condition: true
-
-  - script: |
-      source activate pandas-dev
-      pytest --capture=no --strict scripts
-    displayName: 'Testing docstring validation script'
-    condition: true
-
-  - script: |
-      source activate pandas-dev
-      cd asv_bench
-      asv check -E existing
-      git remote add upstream https://github.com/pandas-dev/pandas.git
-      git fetch upstream
-      if git diff upstream/master --name-only | grep -q "^asv_bench/"; then
-          asv machine --yes
-          ASV_OUTPUT="$(asv dev)"
-          if [[ $(echo "$ASV_OUTPUT" | grep "failed") ]]; then
-              echo "##vso[task.logissue type=error]Benchmarks run with errors"
-              echo "$ASV_OUTPUT"
-              exit 1
-          else
-              echo "Benchmarks run without errors"
-          fi
-      else
-          echo "Benchmarks did not run, no changes detected"
-      fi
-    displayName: 'Running benchmarks'
-    condition: true
-
 - job: 'Web_and_Docs'
   pool:
     vmImage: ubuntu-16.04

ci/azure/posix.yml

Lines changed: 7 additions & 10 deletions
@@ -44,16 +44,13 @@ jobs:
           PATTERN: "not slow and not network"
           LOCALE_OVERRIDE: "zh_CN.UTF-8"
 
-        # https://github.com/pandas-dev/pandas/issues/29432
-        # py37_np_dev:
-        # ENV_FILE: ci/deps/azure-37-numpydev.yaml
-        # CONDA_PY: "37"
-        # PATTERN: "not slow and not network"
-        # TEST_ARGS: "-W error"
-        # PANDAS_TESTING_MODE: "deprecate"
-        # EXTRA_APT: "xsel"
-        # # TODO:
-        # continueOnError: true
+        py37_np_dev:
+          ENV_FILE: ci/deps/azure-37-numpydev.yaml
+          CONDA_PY: "37"
+          PATTERN: "not slow and not network"
+          TEST_ARGS: "-W error"
+          PANDAS_TESTING_MODE: "deprecate"
+          EXTRA_APT: "xsel"
 
   steps:
     - script: |

ci/deps/azure-macos-36.yaml

Lines changed: 2 additions & 2 deletions
@@ -20,9 +20,9 @@ dependencies:
   - matplotlib=2.2.3
   - nomkl
   - numexpr
-  - numpy=1.13.3
+  - numpy=1.14
   - openpyxl
-  - pyarrow
+  - pyarrow>=0.12.0
   - pytables
   - python-dateutil==2.6.1
   - pytz

ci/deps/azure-windows-36.yaml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ dependencies:
   - numexpr
   - numpy=1.15.*
   - openpyxl
-  - pyarrow
+  - pyarrow>=0.12.0
   - pytables
   - python-dateutil
   - pytz

doc/redirects.csv

Lines changed: 0 additions & 1 deletion
@@ -828,7 +828,6 @@ generated/pandas.MultiIndex.sortlevel,../reference/api/pandas.MultiIndex.sortlev
 generated/pandas.MultiIndex.swaplevel,../reference/api/pandas.MultiIndex.swaplevel
 generated/pandas.MultiIndex.to_flat_index,../reference/api/pandas.MultiIndex.to_flat_index
 generated/pandas.MultiIndex.to_frame,../reference/api/pandas.MultiIndex.to_frame
-generated/pandas.MultiIndex.to_hierarchical,../reference/api/pandas.MultiIndex.to_hierarchical
 generated/pandas.notna,../reference/api/pandas.notna
 generated/pandas.notnull,../reference/api/pandas.notnull
 generated/pandas.option_context,../reference/api/pandas.option_context

doc/source/development/index.rst

Lines changed: 1 addition & 0 deletions
@@ -19,3 +19,4 @@ Development
     developer
     policies
     roadmap
+    meeting

doc/source/development/meeting.rst

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+.. _meeting:
+
+==================
+Developer Meetings
+==================
+
+We hold regular developer meetings on the second Wednesday
+of each month at 18:00 UTC. These meetings and their minutes are open to
+the public. All are welcome to join.
+
+Minutes
+-------
+
+The minutes of past meetings are available in `this Google Document <https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit?usp=sharing>`__.
+
+Calendar
+--------
+
+This calendar shows all the developer meetings.
+
+.. raw:: html
+
+    <iframe src="https://calendar.google.com/calendar/embed?src=pgbn14p6poja8a1cf2dv2jhrmg%40group.calendar.google.com" style="border: 0" width="800" height="600" frameborder="0" scrolling="no"></iframe>
+
+You can subscribe to this calendar with the following links:
+
+* `iCal <https://calendar.google.com/calendar/ical/pgbn14p6poja8a1cf2dv2jhrmg%40group.calendar.google.com/public/basic.ics>`__
+* `Google calendar <https://calendar.google.com/calendar/embed?src=pgbn14p6poja8a1cf2dv2jhrmg%40group.calendar.google.com>`__
+
+Additionally, we'll sometimes have one-off meetings on specific topics.
+These will be published on the same calendar.
+

doc/source/getting_started/basics.rst

Lines changed: 1 addition & 0 deletions
@@ -1950,6 +1950,7 @@ sparse :class:`SparseDtype` (none) :class:`arrays.
 intervals           :class:`IntervalDtype`    :class:`Interval`  :class:`arrays.IntervalArray` :ref:`advanced.intervalindex`
 nullable integer    :class:`Int64Dtype`, ...  (none)             :class:`arrays.IntegerArray`  :ref:`integer_na`
 Strings             :class:`StringDtype`      :class:`str`       :class:`arrays.StringArray`   :ref:`text`
+Boolean (with NA)   :class:`BooleanDtype`     :class:`bool`      :class:`arrays.BooleanArray`  :ref:`api.arrays.bool`
 =================== ========================= ================== ============================= =============================
 
 Pandas has two ways to store strings.

doc/source/getting_started/install.rst

Lines changed: 1 addition & 1 deletion
@@ -258,7 +258,7 @@ matplotlib 2.2.2 Visualization
 openpyxl                  2.4.8              Reading / writing for xlsx files
 pandas-gbq                0.8.0              Google Big Query access
 psycopg2                                     PostgreSQL engine for sqlalchemy
-pyarrow                   0.9.0              Parquet and feather reading / writing
+pyarrow                   0.12.0             Parquet and feather reading / writing
 pymysql                   0.7.11             MySQL engine for sqlalchemy
 pyreadstat                                   SPSS files (.sav) reading
 pytables                  3.4.2              HDF5 reading / writing
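
Note on the raised pyarrow minimum: the sketch below shows one way a user could check an installed package against a minimum such as the `0.12.0` listed above. It is illustrative only and is not how pandas itself validates optional dependencies. The helpers `parse_version` and `meets_minimum` are hypothetical names introduced here, and the parsing deliberately ignores pre-release suffixes.

```python
from importlib import metadata


def parse_version(text):
    """Hypothetical helper: turn '0.12.0' into (0, 12, 0).

    Parsing stops at the first component that does not start with a digit,
    and any non-digit suffix (such as 'rc1') is dropped.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(package, minimum):
    """Hypothetical helper: True/False, or None if the package is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


# Numeric comparison matters: as plain strings, "0.9.0" would sort after "0.12.0".
print(parse_version("0.9.0") < parse_version("0.12.0"))
print(meets_minimum("pyarrow", "0.12.0"))
```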

doc/source/reference/arrays.rst

Lines changed: 23 additions & 0 deletions
@@ -25,6 +25,7 @@ Nullable Integer :class:`Int64Dtype`, ... (none) :ref:`api.array
 Categorical         :class:`CategoricalDtype` (none)             :ref:`api.arrays.categorical`
 Sparse              :class:`SparseDtype`      (none)             :ref:`api.arrays.sparse`
 Strings             :class:`StringDtype`      :class:`str`       :ref:`api.arrays.string`
+Boolean (with NA)   :class:`BooleanDtype`     :class:`bool`      :ref:`api.arrays.bool`
 =================== ========================= ================== =============================
 
 Pandas and third-party libraries can extend NumPy's type system (see :ref:`extending.extension-types`).
@@ -485,6 +486,28 @@ The ``Series.str`` accessor is available for ``Series`` backed by a :class:`arra
 See :ref:`api.series.str` for more.
 
 
+.. _api.arrays.bool:
+
+Boolean data with missing values
+--------------------------------
+
+The boolean dtype (with the alias ``"boolean"``) provides support for storing
+boolean data (True, False values) with missing values, which is not possible
+with a bool :class:`numpy.ndarray`.
+
+.. autosummary::
+   :toctree: api/
+   :template: autosummary/class_without_autosummary.rst
+
+   arrays.BooleanArray
+
+.. autosummary::
+   :toctree: api/
+   :template: autosummary/class_without_autosummary.rst
+
+   BooleanDtype
+
+
 .. Dtype attributes which are manually listed in their docstrings: including
 .. it here to make sure a docstring page is built for them
 
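
Note on the new "Boolean data with missing values" section: a minimal sketch of the behaviour it describes, assuming a pandas version that ships the `"boolean"` dtype. The variable names `arr` and `plain` are illustrative.

```python
import numpy as np
import pandas as pd

# With the "boolean" alias, the missing entry is kept as a missing value.
arr = pd.array([True, False, None], dtype="boolean")
print(arr.dtype)
print(arr.isna().tolist())

# NumPy has no missing-value slot in its bool dtype, so a plain array
# built from the same data falls back to the generic object dtype.
plain = np.array([True, False, None])
print(plain.dtype)
```

The object-dtype fallback is the limitation the documentation text refers to: the values are stored, but no longer as a compact boolean array.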

doc/source/reference/indexing.rst

Lines changed: 0 additions & 1 deletion
@@ -305,7 +305,6 @@ MultiIndex components
 
    MultiIndex.set_levels
    MultiIndex.set_codes
-   MultiIndex.to_hierarchical
    MultiIndex.to_flat_index
    MultiIndex.to_frame
    MultiIndex.is_lexsorted

doc/source/reference/style.rst

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ Style application
    Styler.set_caption
    Styler.set_properties
    Styler.set_uuid
+   Styler.set_na_rep
    Styler.clear
    Styler.pipe
 

doc/source/user_guide/scale.rst

Lines changed: 3 additions & 3 deletions
@@ -93,9 +93,9 @@ Use efficient datatypes
 -----------------------
 
 The default pandas data types are not the most memory efficient. This is
-especially true for high-cardinality text data (columns with relatively few
-unique values). By using more efficient data types you can store larger datasets
-in memory.
+especially true for text data columns with relatively few unique values (commonly
+referred to as "low-cardinality" data). By using more efficient data types, you
+can store larger datasets in memory.
 
 .. ipython:: python
 
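
Note on the corrected "low-cardinality" wording: the sketch below illustrates the claim with made-up data, a text column holding only three distinct values, converted to the `category` dtype. The column contents and variable names are assumptions for illustration, and exact byte counts vary by pandas version and platform.

```python
import pandas as pd

# 30,000 rows but only three unique strings: a low-cardinality text column.
names = pd.Series(["alice", "bob", "carol"] * 10_000)

# Categorical storage keeps each unique string once plus a small integer code per row.
as_category = names.astype("category")

default_bytes = names.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(default_bytes, category_bytes)
```

The saving shrinks as the number of unique values approaches the number of rows, which is why the original "high-cardinality" wording was misleading.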
