Description
tl;dr
Fix the following doctest error:
_________________________________________________________ [doctest] pandas.io.formats.format.EngFormatter.__call__ __________________________________________________________
1956
1957 Formats a number in engineering notation, appending a letter
1958 representing the power of 1000 of the original number. Some examples:
1959
1960 >>> format_eng(0) # for self.accuracy = 0
UNEXPECTED EXCEPTION: NameError("name 'format_eng' is not defined")
Traceback (most recent call last):
File "/home/mgarcia/miniconda3/envs/pandas-dev/lib/python3.8/doctest.py", line 1336, in __run
exec(compile(example.source, filename, "single",
File "<doctest pandas.io.formats.format.EngFormatter.__call__[0]>", line 1, in <module>
NameError: name 'format_eng' is not defined
/home/mgarcia/src/pandas/pandas/io/formats/format.py:1960: UnexpectedException
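The NameError in the log means the example calls format_eng without ever creating it. The sketch below reproduces that failure mode and the usual kind of fix using only the standard-library doctest module. ToyFormatter and run_examples are hypothetical stand-ins for illustration, not pandas code, and the actual fix for EngFormatter may look different:

```python
import doctest


class ToyFormatter:
    """Hypothetical stand-in for a callable formatter (not pandas code)."""

    def __init__(self, accuracy):
        self.accuracy = accuracy

    def __call__(self, num):
        # Format with a fixed number of decimal places.
        return f"{num:.{self.accuracy}f}"


def run_examples(docstring):
    """Run the examples in `docstring` and return (failed, attempted)."""
    globs = {"ToyFormatter": ToyFormatter}
    test = doctest.DocTestParser().get_doctest(
        docstring, globs, "sketch", "<sketch>", 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts are of interest here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


# Broken: `fmt` is used but never defined, so the example raises NameError.
BROKEN = """
>>> fmt(0)
'0'
"""

# Fixed: the example first creates the object it then calls.
FIXED = """
>>> fmt = ToyFormatter(accuracy=0)
>>> fmt(0)
'0'
"""

print(run_examples(BROKEN))  # (1, 1)
print(run_examples(FIXED))   # (0, 2)
```

The same idea applies to a real docstring: every name an example uses has to be created by an earlier example line or be available in the doctest namespace.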
Detailed instructions
Python allows example code to be included in the documentation, for example:
def add(num1, num2):
    """
    Computes the sum of the two numbers.

    Examples
    --------
    >>> add(2, 2)
    4
    """
    return num1 + num2
In pandas, we use this to document most elements. There are tools, like pytest,
that can run the examples and check that everything is correct.
For historical reasons, we have many examples where the code fails to run, or the
actual output differs from the expected output. For example, consider the following
incorrect examples:
def add(num1, num2):
    """
    Computes the sum of the two numbers.

    Examples
    --------
    >>> add(2, 2)
    5
    >>> add(2, 2
    4
    >>> add(2, number)
    4
    ...
    """
    return num1 + num2
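To see how a doctest runner reports examples like these, here is a minimal sketch using the standard-library doctest module. RecordingRunner is a hypothetical helper written for this illustration; it only records the kind of problem found for each example instead of printing the full report:

```python
import doctest


def add(num1, num2):
    return num1 + num2


# The three broken examples, kept exactly as they would appear in a docstring.
BROKEN_EXAMPLES = """
>>> add(2, 2)
5
>>> add(2, 2
4
>>> add(2, number)
4
"""


class RecordingRunner(doctest.DocTestRunner):
    """Hypothetical runner that records why each example failed."""

    def __init__(self):
        super().__init__(verbose=False)
        self.reasons = []

    def report_failure(self, out, test, example, got):
        # The code ran, but its output did not match the expected output.
        self.reasons.append("wrong output")

    def report_unexpected_exception(self, out, test, example, exc_info):
        # The code raised an exception; record the exception class name.
        self.reasons.append(exc_info[0].__name__)


test = doctest.DocTestParser().get_doctest(
    BROKEN_EXAMPLES, {"add": add}, "broken_add", "<sketch>", 0
)
runner = RecordingRunner()
runner.run(test)

# One recorded reason per example, in the order the examples appear.
for example, reason in zip(test.examples, runner.reasons):
    print(example.source.strip(), "->", reason)
```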
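All of them will fail, for different reasons: the first has the wrong expected
output, the second is missing a closing parenthesis, and the third uses an
undefined variable. To test the docstring of an object, the following command
can be run: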
python -m pytest --doctest-modules pandas/core/frame.py::pandas.core.frame.DataFrame.info
Where pandas/core/frame.py is the file where the docstring is defined, and
pandas.core.frame.DataFrame.info is the object. A whole file can also be tested
by removing the :: and the object from the command above.
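As a small illustration of how the target in that command is put together, the sketch below builds the argument list for either a single object or a whole file. doctest_command is a hypothetical helper, not part of pandas or pytest, and it only constructs the command without running it:

```python
import sys


def doctest_command(path, obj=None):
    """Build the pytest command line for one file, or for one object in it."""
    # With an object, the target is "<file>::<dotted object name>";
    # without one, the target is just the file.
    target = path if obj is None else f"{path}::{obj}"
    return [sys.executable, "-m", "pytest", "--doctest-modules", target]


one_object = doctest_command(
    "pandas/core/frame.py", "pandas.core.frame.DataFrame.info"
)
whole_file = doctest_command("pandas/core/frame.py")

# Show the arguments that follow the Python executable.
print(" ".join(one_object[1:]))
print(" ".join(whole_file[1:]))

# To actually run it from a pandas checkout (assumes pytest is installed):
#   import subprocess
#   subprocess.run(one_object, check=False)
```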
In general, the errors in the examples can be fixed with things like:
- Fixing a typo (a missing comma, a misspelled variable name...)
- Adding an object that hasn't been defined (for example, if df is used, but
no sample dataset df has been defined first)
- Fixing the expected output, when it's wrong
- In exceptional cases, examples shouldn't run, since they can't work.
For example, a function that connects to a private webservice. In
such cases, we can add # doctest: +SKIP at the end of the lines
that should not run
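The effect of the +SKIP directive can be checked with the standard-library doctest module. In the sketch below, connect_to_private_service and fetch_report are hypothetical names that are deliberately left undefined, standing in for code that cannot run in a test environment:

```python
import doctest

# The first two examples could not run here, so they are marked with +SKIP.
# Without the directive, the first one would fail with a NameError.
DOCSTRING = """
>>> client = connect_to_private_service()  # doctest: +SKIP
>>> client.fetch_report()  # doctest: +SKIP
'confidential'
>>> 1 + 1
2
"""

test = doctest.DocTestParser().get_doctest(
    DOCSTRING, {}, "skip_sketch", "<sketch>", 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test, out=lambda text: None)

# Skipped examples are not executed, so only the last example is attempted.
print(results.failed, results.attempted)  # 0 1
```

Since a skipped example is never checked, the directive is best kept for lines that truly cannot run.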
To properly fix an example for the first time, the following steps
are needed:
- Install a pandas development environment on your computer. There are
simplified instructions on this page, and more detailed information on
the official pandas contributing page.
- Run the doctests for the object of interest (the one in this issue),
and make sure the examples are still broken in the master branch of
pandas
- Fix the file locally, and run the doctests again, to make sure the
fix is working as expected
- Optionally, have a look and make sure that the code in the examples
follows PEP-8, and fix the style if it doesn't
- Commit your changes, push your branch to a fork, and open a pull
request. Make sure you edit the line Closes #XXXX with the issue
number you are addressing, so the issue is automatically closed
when the pull request is merged
- Make sure the continuous integration of your pull request finishes
in green. If it doesn't, check whether the problem is in your changes
(sometimes things break in master because of technical problems, and in
that case you just need to wait for a core developer to fix the
problem)
- Address any comments from the reviewers (just make changes locally,
commit, and push to your branch; no need to open new pull requests)