Commit b0f5baa

Merge remote-tracking branch 'upstream/master' into series_rolling_count_ignores_min_periods
2 parents c4878b0 + 7d28040 commit b0f5baa

12 files changed, +131 -55 lines changed

.devcontainer.json

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+// For format details, see https://aka.ms/vscode-remote/devcontainer.json or the definition README at
+// https://github.com/microsoft/vscode-dev-containers/tree/master/containers/python-3-miniconda
+{
+    "name": "pandas",
+    "context": ".",
+    "dockerFile": "Dockerfile",
+
+    // Use 'settings' to set *default* container specific settings.json values on container create.
+    // You can edit these settings after create using File > Preferences > Settings > Remote.
+    "settings": {
+        "terminal.integrated.shell.linux": "/bin/bash",
+        "python.condaPath": "/opt/conda/bin/conda",
+        "python.pythonPath": "/opt/conda/bin/python",
+        "python.formatting.provider": "black",
+        "python.linting.enabled": true,
+        "python.linting.flake8Enabled": true,
+        "python.linting.pylintEnabled": false,
+        "python.linting.mypyEnabled": true,
+        "python.testing.pytestEnabled": true,
+        "python.testing.cwd": "pandas/tests"
+    },
+
+    // Add the IDs of extensions you want installed when the container is created in the array below.
+    "extensions": [
+        "ms-python.python",
+        "ms-vscode.cpptools"
+    ]
+}

Dockerfile

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+FROM continuumio/miniconda3
+
+# if you forked pandas, you can pass in your own GitHub username to use your fork
+# i.e. gh_username=myname
+ARG gh_username=pandas-dev
+ARG pandas_home="/home/pandas"
+
+# Avoid warnings by switching to noninteractive
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Configure apt and install packages
+RUN apt-get update \
+    && apt-get -y install --no-install-recommends apt-utils dialog 2>&1 \
+    #
+    # Verify git, process tools, lsb-release (common in install instructions for CLIs) installed
+    && apt-get -y install git iproute2 procps iproute2 lsb-release \
+    #
+    # Install C compilers (gcc not enough, so just went with build-essential which admittedly might be overkill),
+    # needed to build pandas C extensions
+    && apt-get -y install build-essential \
+    #
+    # cleanup
+    && apt-get autoremove -y \
+    && apt-get clean -y \
+    && rm -rf /var/lib/apt/lists/*
+
+# Switch back to dialog for any ad-hoc use of apt-get
+ENV DEBIAN_FRONTEND=dialog
+
+# Clone pandas repo
+RUN mkdir "$pandas_home" \
+    && git clone "https://github.com/$gh_username/pandas.git" "$pandas_home" \
+    && cd "$pandas_home" \
+    && git remote add upstream "https://github.com/pandas-dev/pandas.git" \
+    && git pull upstream master
+
+# Because it is surprisingly difficult to activate a conda environment inside a DockerFile
+# (from personal experience and per https://github.com/ContinuumIO/docker-images/issues/89),
+# we just update the base/root one from the 'environment.yml' file instead of creating a new one.
+#
+# Set up environment
+RUN conda env update -n base -f "$pandas_home/environment.yml"
+
+# Build C extensions and pandas
+RUN cd "$pandas_home" \
+    && python setup.py build_ext --inplace -j 4 \
+    && python -m pip install -e .

doc/source/development/contributing.rst

Lines changed: 11 additions & 0 deletions
@@ -146,6 +146,17 @@ requires a C compiler and Python environment. If you're making documentation
 changes, you can skip to :ref:`contributing.documentation` but you won't be able
 to build the documentation locally before pushing your changes.

+Using a Docker Container
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Instead of manually setting up a development environment, you can use Docker to
+automatically create the environment with just several commands. Pandas provides a `DockerFile`
+in the root directory to build a Docker image with a full pandas development environment.
+
+Even easier, you can use the DockerFile to launch a remote session with Visual Studio Code,
+a popular free IDE, using the `.devcontainer.json` file.
+See https://code.visualstudio.com/docs/remote/containers for details.
+
 .. _contributing.dev_c:

 Installing a C compiler

doc/source/ecosystem.rst

Lines changed: 4 additions & 6 deletions
@@ -122,16 +122,14 @@ also goes beyond matplotlib and pandas with the option to perform statistical
 estimation while plotting, aggregating across observations and visualizing the
 fit of statistical models to emphasize patterns in a dataset.

-`yhat/ggpy <https://github.com/yhat/ggpy>`__
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+`plotnine <https://github.com/has2k1/plotnine/>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Hadley Wickham's `ggplot2 <https://ggplot2.tidyverse.org/>`__ is a foundational exploratory visualization package for the R language.
 Based on `"The Grammar of Graphics" <https://www.cs.uic.edu/~wilkinson/TheGrammarOfGraphics/GOG.html>`__ it
 provides a powerful, declarative and extremely general way to generate bespoke plots of any kind of data.
-It's really quite incredible. Various implementations to other languages are available,
-but a faithful implementation for Python users has long been missing. Although still young
-(as of Jan-2014), the `yhat/ggpy <https://github.com/yhat/ggpy>`__ project has been
-progressing quickly in that direction.
+Various implementations to other languages are available.
+A good implementation for Python users is `has2k1/plotnine <https://github.com/has2k1/plotnine/>`__.

 `IPython Vega <https://github.com/vega/ipyvega>`__
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

pandas/core/indexes/base.py

Lines changed: 9 additions & 10 deletions
@@ -4639,7 +4639,8 @@ def get_value(self, series, key):

         k = self._convert_scalar_indexer(key, kind="getitem")
         try:
-            return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
+            loc = self._engine.get_loc(k)
+
         except KeyError as e1:
             if len(self) > 0 and (self.holds_integer() or self.is_boolean()):
                 raise
@@ -4648,19 +4649,17 @@ def get_value(self, series, key):
                 return libindex.get_value_at(s, key)
             except IndexError:
                 raise
-            except TypeError:
-                # generator/iterator-like
-                if is_iterator(key):
-                    raise InvalidIndexError(key)
-                else:
-                    raise e1
             except Exception:
                 raise e1
         except TypeError:
             # e.g. "[False] is an invalid key"
-            if is_scalar(key):
-                raise IndexError(key)
-            raise InvalidIndexError(key)
+            raise IndexError(key)
+
+        else:
+            if is_scalar(loc):
+                tz = getattr(series.dtype, "tz", None)
+                return libindex.get_value_at(s, loc, tz=tz)
+            return series.iloc[loc]

     def set_value(self, arr, key, value):
         """

pandas/core/indexes/numeric.py

Lines changed: 12 additions & 11 deletions
@@ -488,17 +488,18 @@ def __contains__(self, other) -> bool:

     @Appender(_index_shared_docs["get_loc"])
     def get_loc(self, key, method=None, tolerance=None):
-        try:
-            if np.all(np.isnan(key)) or is_bool(key):
-                nan_idxs = self._nan_idxs
-                try:
-                    return nan_idxs.item()
-                except ValueError:
-                    if not len(nan_idxs):
-                        raise KeyError(key)
-                    return nan_idxs
-        except (TypeError, NotImplementedError):
-            pass
+        if is_bool(key):
+            # Catch this to avoid accidentally casting to 1.0
+            raise KeyError(key)
+
+        if is_float(key) and np.isnan(key):
+            nan_idxs = self._nan_idxs
+            if not len(nan_idxs):
+                raise KeyError(key)
+            elif len(nan_idxs) == 1:
+                return nan_idxs[0]
+            return nan_idxs
+
         return super().get_loc(key, method=method, tolerance=tolerance)

     @cache_readonly
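
The branch order of the rewritten `get_loc` can be illustrated outside pandas. The sketch below is a simplified stand-in, not the pandas implementation: `nan_aware_get_loc` and its `values` argument are hypothetical names, and a linear scan takes the place of the index engine and of `super().get_loc`.

```python
import math


def nan_aware_get_loc(values, key):
    """Hypothetical stand-in for the branch order of the rewritten get_loc."""
    # bool is a subclass of int, so True would compare equal to 1.0.
    # Reject it before any lookup happens.
    if isinstance(key, bool):
        raise KeyError(key)

    # Only a float scalar is tested for NaN; a list never reaches math.isnan.
    if isinstance(key, float) and math.isnan(key):
        nan_positions = [
            i for i, v in enumerate(values) if isinstance(v, float) and math.isnan(v)
        ]
        if not nan_positions:
            raise KeyError(key)
        if len(nan_positions) == 1:
            return nan_positions[0]
        return nan_positions

    # Stand-in for super().get_loc(...). Hashing first makes an unhashable
    # key such as [nan] fail with TypeError, as a hash-based lookup would.
    hash(key)
    for i, v in enumerate(values):
        if v == key:
            return i
    raise KeyError(key)
```

Under these assumptions, a boolean key and a missing NaN both raise `KeyError`, while a list-like key raises `TypeError`, which is the distinction the updated tests check.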

pandas/core/series.py

Lines changed: 0 additions & 12 deletions
@@ -815,18 +815,6 @@ def __getitem__(self, key):
         try:
             result = self.index.get_value(self, key)

-            if not is_scalar(result):
-                if is_list_like(result) and not isinstance(result, Series):
-
-                    # we need to box if loc of the key isn't scalar here
-                    # otherwise have inline ndarray/lists
-                    try:
-                        if not is_scalar(self.index.get_loc(key)):
-                            result = self._constructor(
-                                result, index=[key] * len(result), dtype=self.dtype
-                            ).__finalize__(self)
-                    except KeyError:
-                        pass
             return result
         except InvalidIndexError:
             pass
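
The boxing removed here appears to be redundant because the `Index.get_value` change in this commit resolves a location first and returns `series.iloc[loc]` when that location is not a scalar. A minimal pure-Python sketch of that try/except/else shape follows. The name `get_value_sketch` and the `positions` dictionary (key to integer position, or list of positions) are hypothetical stand-ins for the index engine, not pandas API.

```python
def get_value_sketch(values, positions, key):
    """Hypothetical sketch: resolve a location first, then fetch by position."""
    try:
        loc = positions[key]  # stands in for self._engine.get_loc(k)
    except TypeError:
        # Unhashable keys such as [False] are not valid lookups.
        raise IndexError(key)
    else:
        # Reached only when the lookup succeeded.
        if isinstance(loc, int):
            return values[loc]
        # Several matches: return all of them, as series.iloc[loc] would.
        return [values[i] for i in loc]
```

A missing key still surfaces as `KeyError` from the lookup itself, so the caller sees one exception type per failure mode.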

pandas/io/html.py

Lines changed: 8 additions & 3 deletions
@@ -591,9 +591,14 @@ def _setup_build_doc(self):
     def _build_doc(self):
         from bs4 import BeautifulSoup

-        return BeautifulSoup(
-            self._setup_build_doc(), features="html5lib", from_encoding=self.encoding
-        )
+        bdoc = self._setup_build_doc()
+        if isinstance(bdoc, bytes) and self.encoding is not None:
+            udoc = bdoc.decode(self.encoding)
+            from_encoding = None
+        else:
+            udoc = bdoc
+            from_encoding = self.encoding
+        return BeautifulSoup(udoc, features="html5lib", from_encoding=from_encoding)


 def _build_xpath_expr(attrs) -> str:
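
The decode-first branch added to `_build_doc` can be isolated as a small function. This is a sketch only: `prepare_document` is a hypothetical helper name, and it returns the pair that the diff passes on to `BeautifulSoup` without importing bs4.

```python
def prepare_document(doc, encoding):
    """Hypothetical helper mirroring the decode-first branch in _build_doc."""
    if isinstance(doc, bytes) and encoding is not None:
        # Decode up front, so the parser receives text and needs no encoding hint.
        return doc.decode(encoding), None
    # Text input, or bytes with no declared encoding: pass both through unchanged.
    return doc, encoding
```

In the diff, the two returned values correspond to `udoc` and `from_encoding`.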

pandas/tests/indexes/multi/test_indexing.py

Lines changed: 2 additions & 1 deletion
@@ -396,7 +396,8 @@ def test_get_loc_missing_nan():
         idx.get_loc(3)
     with pytest.raises(KeyError, match=r"^nan$"):
         idx.get_loc(np.nan)
-    with pytest.raises(KeyError, match=r"^\[nan\]$"):
+    with pytest.raises(TypeError, match=r"'\[nan\]' is an invalid key"):
+        # listlike/non-hashable raises TypeError
         idx.get_loc([np.nan])



pandas/tests/indexes/test_numeric.py

Lines changed: 2 additions & 1 deletion
@@ -389,7 +389,8 @@ def test_get_loc_missing_nan(self):
             idx.get_loc(3)
         with pytest.raises(KeyError, match="^nan$"):
             idx.get_loc(np.nan)
-        with pytest.raises(KeyError, match=r"^\[nan\]$"):
+        with pytest.raises(TypeError, match=r"'\[nan\]' is an invalid key"):
+            # listlike/non-hashable raises TypeError
             idx.get_loc([np.nan])

     def test_contains_nans(self):
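
Both test updates change the `match` pattern along with the exception type. `pytest.raises(match=...)` checks the pattern against the exception text with `re.search`, so the effect can be checked with the standard library alone. The `message` string below is an assumption derived from the new pattern in the tests, not captured pandas output.

```python
import re

# Assumed message text, inferred from the pattern used in the updated tests.
message = "'[nan]' is an invalid key"

new_pattern = r"'\[nan\]' is an invalid key"
old_pattern = r"^\[nan\]$"

# re.search scans the whole string, so the new pattern needs no anchors.
new_matches = re.search(new_pattern, message) is not None
# The old pattern is anchored to a bare "[nan]" and cannot match the longer text.
old_matches = re.search(old_pattern, message) is not None
```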

pandas/tests/io/test_html.py

Lines changed: 4 additions & 4 deletions
@@ -1158,9 +1158,9 @@ def test_displayed_only(self, displayed_only, exp0, exp1):
             assert len(dfs) == 1  # Should not parse hidden table

     def test_encode(self, html_encoding_file):
-        _, encoding = os.path.splitext(os.path.basename(html_encoding_file))[0].split(
-            "_"
-        )
+        base_path = os.path.basename(html_encoding_file)
+        root = os.path.splitext(base_path)[0]
+        _, encoding = root.split("_")

         try:
             with open(html_encoding_file, "rb") as fobj:
@@ -1183,7 +1183,7 @@ def test_encode(self, html_encoding_file):
             if is_platform_windows():
                 if "16" in encoding or "32" in encoding:
                     pytest.skip()
-                raise
+            raise

     def test_parse_failure_unseekable(self):
         # Issue #17975
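
The three-step filename parsing introduced in `test_encode` can be tried on its own. This is a minimal sketch: `encoding_from_path` is a hypothetical helper, and the example path in the usage note is made up for illustration.

```python
import os


def encoding_from_path(path):
    """Hypothetical helper: read the encoding from a '<name>_<encoding>.<ext>' file name."""
    base_path = os.path.basename(path)     # drop the directories
    root = os.path.splitext(base_path)[0]  # drop the extension
    _, encoding = root.split("_")          # fails unless there is exactly one underscore
    return encoding
```

For a made-up path such as `data/chinese_utf-16.html`, the helper yields the part after the underscore. A name with no underscore, or more than one, fails the tuple unpacking with `ValueError`.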

web/pandas/community/ecosystem.md

Lines changed: 4 additions & 7 deletions
@@ -84,19 +84,16 @@ pandas with the option to perform statistical estimation while plotting,
 aggregating across observations and visualizing the fit of statistical
 models to emphasize patterns in a dataset.

-### [yhat/ggpy](https://github.com/yhat/ggpy)
+### [plotnine](https://github.com/has2k1/plotnine/)

 Hadley Wickham's [ggplot2](https://ggplot2.tidyverse.org/) is a
 foundational exploratory visualization package for the R language. Based
 on ["The Grammar of
 Graphics"](https://www.cs.uic.edu/~wilkinson/TheGrammarOfGraphics/GOG.html)
 it provides a powerful, declarative and extremely general way to
-generate bespoke plots of any kind of data. It's really quite
-incredible. Various implementations to other languages are available,
-but a faithful implementation for Python users has long been missing.
-Although still young (as of Jan-2014), the
-[yhat/ggpy](https://github.com/yhat/ggpy) project has been progressing
-quickly in that direction.
+generate bespoke plots of any kind of data.
+Various implementations to other languages are available.
+A good implementation for Python users is [has2k1/plotnine](https://github.com/has2k1/plotnine/).

 ### [IPython Vega](https://github.com/vega/ipyvega)
