DOC: update the docstring of pandas.DataFrame.to_dict #20162

Merged
merged 5 commits into from
Mar 13, 2018

Conversation

@Cheukting Cheukting commented Mar 10, 2018

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • PR title is "DOC: update the docstring"
  • The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single <your-function-or-method>
  • It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

################################################################################

##################### Docstring (pandas.DataFrame.to_dict) #####################

################################################################################


Convert DataFrame to dictionary.

Method converting the DataFrame to a Python dictionary,
the type of the key-value pairs can be customized with
the parameters (see below).

Parameters
----------
orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
    Determines the type of the values of the dictionary.

    - dict (default) : dict like {column -> {index -> value}}
    - list : dict like {column -> [values]}
    - series : dict like {column -> Series(values)}
    - split : dict like
      {index -> [index], columns -> [columns], data -> [values]}
    - records : list like
      [{column -> value}, ... , {column -> value}]
    - index : dict like {index -> {column -> value}}

    Abbreviations are allowed. `s` indicates `series` and `sp`
    indicates `split`.

into : class, default dict
    The collections.Mapping subclass used for all Mappings
    in the return value.  Can be the actual class or an empty
    instance of the mapping type you want.  If you want a
    collections.defaultdict, you must pass it initialized.

    .. versionadded:: 0.21.0.

Returns
-------
result : collections.Mapping like {column -> {index -> value}}

See Also
--------
from_dict: create a DataFrame from a dictionary

Examples
--------
>>> df = pd.DataFrame({'col1': [1, 2],
...                    'col2': [0.5, 0.75]},
...                   index=['a', 'b'])
>>> df
   col1  col2
a     1   0.50
b     2   0.75
>>> df.to_dict()
{'col1': {'a': 1, 'b': 2}, 'col2': {'a': 0.5, 'b': 0.75}}

You can specify the return orientation.

>>> df.to_dict('series')
{'col1': a    1
b    2
Name: col1, dtype: int64, 'col2': a    0.50
b    0.75
Name: col2, dtype: float64}
>>> df.to_dict('split')
{'index': ['a', 'b'], 'columns': ['col1', 'col2'],
'data': [[1.0, 0.5], [2.0, 0.75]]}
>>> df.to_dict('records')
[{'col1': 1.0, 'col2': 0.5}, {'col1': 2.0, 'col2': 0.75}]
>>> df.to_dict('index')
{'a': {'col1': 1.0, 'col2': 0.5}, 'b': {'col1': 2.0, 'col2': 0.75}}

You can also specify the mapping type.

>>> from collections import OrderedDict, defaultdict
>>> df.to_dict(into=OrderedDict)
OrderedDict([('col1', OrderedDict([('a', 1), ('b', 2)])),
           ('col2', OrderedDict([('a', 0.5), ('b', 0.75)]))])

If you want a `defaultdict`, you need to initialize it:

>>> dd = defaultdict(list)
>>> df.to_dict('records', into=dd)
[defaultdict(<class 'list'>, {'col1': 1.0, 'col2': 0.5}),
defaultdict(<class 'list'>, {'col1': 2.0, 'col2': 0.75})]

################################################################################

################################## Validation ##################################

################################################################################


Docstring for "pandas.DataFrame.to_dict" correct. :)

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.
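
The docstring above says abbreviations are allowed, with `s` meaning `series` and `sp` meaning `split`. A minimal sketch of how such prefix matching could be resolved is shown below. `resolve_orient` is a hypothetical helper written for illustration, not a pandas function, and it makes no claim about how pandas implements this internally.

```python
def resolve_orient(orient):
    """Hypothetical helper: map a possibly abbreviated orient to its full name."""
    key = orient.lower()
    # 'sp' is listed before 's' so that 'sp' resolves to 'split'
    # instead of being captured by the shorter 's' prefix.
    prefixes = [
        ('sp', 'split'),
        ('s', 'series'),
        ('d', 'dict'),
        ('l', 'list'),
        ('r', 'records'),
        ('i', 'index'),
    ]
    for prefix, name in prefixes:
        if key.startswith(prefix):
            return name
    raise ValueError("orient '{}' not understood".format(orient))
```

The ordering of the prefix list is the only subtle part: any scheme that accepts both `s` and `sp` has to test the longer prefix first.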

@pep8speaks
pep8speaks commented Mar 10, 2018

Hello @Cheukting! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 13, 2018 at 19:21 Hours UTC

 ('col2', OrderedDict([('a', 0.5), ('b', 0.75)]))])

 If you want a `defaultdict`, you need to initialize it:

 >>> dd = defaultdict(list)
 >>> df.to_dict('records', into=dd)
-[defaultdict(<type 'list'>, {'col2': 0.5, 'col1': 1.0}),
-defaultdict(<type 'list'>, {'col2': 0.75, 'col1': 2.0})]
+[defaultdict(<class 'list'>, {'col1': 1.0, 'col2': 0.5}),

I think that having the output directly below the first > is correct. Not a big deal though.
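
For context on the example in this hunk: the docstring requires that a `defaultdict` be passed initialized. One plausible reason is that a bare `defaultdict` class carries no `default_factory` to reuse for the mappings in the result. The sketch below illustrates that idea in plain Python. `mapping_factory` is a hypothetical helper, not pandas' actual code, and the error message is made up for the illustration.

```python
from collections import OrderedDict, defaultdict


def mapping_factory(into):
    """Hypothetical helper: turn `into` into a zero-argument constructor."""
    if isinstance(into, type):
        if issubclass(into, defaultdict):
            # A bare defaultdict class has no default_factory to copy.
            raise TypeError(
                "pass an initialized defaultdict, e.g. defaultdict(list)")
        return into
    if isinstance(into, defaultdict):
        # Reuse the instance's default_factory for every mapping built.
        factory = into.default_factory
        return lambda: defaultdict(factory)
    # Any other empty instance: fall back to its class.
    return type(into)


make = mapping_factory(defaultdict(list))
record = make()
record.update({'col1': 1.0, 'col2': 0.5})
```

Here `record` is a fresh `defaultdict` whose `default_factory` is `list`, matching the shape of each element in the `'records'` output above.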

@jbigatti

Please change PR's title to: "DOC: update the pandas.DataFrame.to_dict docstring"

 Examples
 --------
->>> df = pd.DataFrame(
-{'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['a', 'b'])
+>>> df = pd.DataFrame({'col1':[1, 2],'col2':[0.5, 0.75]},index=['a', 'b'])

,'col2' -> , 'col2'

@Cheukting Cheukting changed the title DOC: Improved the docstring of pandas.DataFrame.to_dict DOC: update the docstring of pandas.DataFrame.to_dict Mar 10, 2018

@jorisvandenbossche jorisvandenbossche left a comment

Small comments. Looks good

In the list of 'orient' options, can you also quote the options, like 'dict' (default), to make clear each one is a string?


Returns
-------
result : collections.Mapping like {column -> {index -> value}}

See Also
--------
from_dict: create a DataFrame from a dictionary

from_dict -> DataFrame.from_dict

@datapythonista datapythonista left a comment

Looks great, couple of minor things.

@@ -949,16 +954,21 @@ def to_dict(self, orient='dict', into=dict):
 instance of the mapping type you want. If you want a
 collections.defaultdict, you must pass it initialized.

-.. versionadded:: 0.21.0
+.. versionadded:: 0.21.0.

The validation script is wrong, you don't need the final dot here.

See Also
--------
DataFrame.from_dict: create a DataFrame from a dictionary

I'm wondering whether it would make sense to also add to_json here. I think users checking this method may be looking for that.


Method converting the DataFrame to a Python dictionary,
the type of the key-value pairs can be customized with
the parameters (see below).

A bit repetitive. I wouldn't repeat that it converts to a dictionary; I'd go directly to explaining what you can customize.

- 'list' : dict like {column -> [values]}
- 'series' : dict like {column -> Series(values)}
- 'split' : dict like
{index' -> [index], columns -> [columns], data -> [values]}

On this line, can you write {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}? In this case those keys are actual strings, not placeholders for what is in the data.
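
The distinction in this comment can be sketched in plain Python: in a 'split'-style result the three keys are fixed strings, while in a 'dict'-style result the keys come from the data. The variables below are hand-written stand-ins for the example DataFrame's parts, not the output of a pandas call.

```python
# Illustrative stand-ins for the index, columns and values of the example frame.
index = ['a', 'b']
columns = ['col1', 'col2']
data = [[1.0, 0.5], [2.0, 0.75]]

# 'split'-style shape: 'index', 'columns' and 'data' are literal keys,
# present whatever the frame contains.
split = {'index': index, 'columns': columns, 'data': data}

# 'dict'-style shape: the outer keys are the column labels themselves,
# so they change with the data.
by_column = {
    col: {idx: row[j] for idx, row in zip(index, data)}
    for j, col in enumerate(columns)
}
```

Code consuming the split form can always look up `split['data']`, which is why quoting those three keys in the docstring marks them as literals.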

@TomAugspurger TomAugspurger added this to the 0.23.0 milestone Mar 13, 2018
@TomAugspurger

Cleaned up the example output formatting a tad.

Thanks @Cheukting !

@TomAugspurger TomAugspurger merged commit 79200d3 into pandas-dev:master Mar 13, 2018