v0.16.1 docs #10101

Merged
merged 4 commits into from
May 14, 2015
2 changes: 1 addition & 1 deletion doc/source/basics.rst
@@ -236,7 +236,7 @@ see :ref:`here<indexing.boolean>`
Boolean Reductions
~~~~~~~~~~~~~~~~~~

You can apply the reductions: :attr:`~DataFrame.empty`, :meth:`~DataFrame.any`,
You can apply the reductions: :attr:`~DataFrame.empty`, :meth:`~DataFrame.any`,
:meth:`~DataFrame.all`, and :meth:`~DataFrame.bool` to provide a
way to summarize a boolean result.
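
A minimal sketch of these reductions on a small, made-up frame (the column names and values are illustrative only). ``bool`` is left out because it only applies to single-element objects:

```python
import pandas as pd

# Illustrative data; not taken from the pandas docs.
df = pd.DataFrame({"one": [1.0, -2.0, 3.0], "two": [0.5, 1.5, 2.5]})

positive = df > 0                     # element-wise boolean result

col_all = positive.all()              # per column: are all values True?
col_any = positive.any()              # per column: is any value True?
any_overall = positive.any().any()    # reduce twice to get a single answer
is_empty = df.empty                   # True only if the frame has no elements
```

Chaining ``any().any()`` (or ``all().all()``) collapses the per-column result into one scalar.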

23 changes: 13 additions & 10 deletions doc/source/categorical.rst
@@ -813,12 +813,16 @@ basic type) and applying along columns will also convert to object.
df.apply(lambda row: type(row["cats"]), axis=1)
df.apply(lambda col: col.dtype, axis=0)

No Categorical Index
~~~~~~~~~~~~~~~~~~~~
Categorical Index
~~~~~~~~~~~~~~~~~

.. versionadded:: 0.16.1

A new ``CategoricalIndex`` index type is introduced in version 0.16.1. See the
:ref:`advanced indexing docs <indexing.categoricalindex>` for a more detailed
explanation.

There is currently no index of type ``category``, so setting the index to categorical column will
convert the categorical data to a "normal" dtype first and therefore remove any custom
ordering of the categories:
Setting the index will create a ``CategoricalIndex``:

.. ipython:: python

@@ -827,13 +831,12 @@ ordering of the categories:
values = [4,2,3,1]
df = DataFrame({"strings":strings, "values":values}, index=cats)
df.index
# This should sort by categories but does not as there is no CategoricalIndex!
# This now sorts by the categories order
df.sort_index()

.. note::
This could change if a `CategoricalIndex` is implemented (see
https://github.com/pydata/pandas/issues/7629)

In previous versions (<0.16.1) there was no index of type ``category``, so
setting the index to a categorical column converted the categorical data to a
"normal" dtype first and therefore removed any custom ordering of the categories.
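
As a rough illustration of the new behavior, the sketch below builds a frame indexed by an ordered categorical and sorts it. The category order ``["c", "b", "a"]`` and the column contents are assumptions chosen to make the effect visible:

```python
import pandas as pd

# Custom category order (c < b < a); chosen only for illustration.
cats = pd.Categorical(["a", "b", "c", "a"],
                      categories=["c", "b", "a"], ordered=True)

df = pd.DataFrame({"strings": ["w", "x", "y", "z"],
                   "values": [4, 2, 3, 1]}, index=cats)

index_type = type(df.index).__name__         # the index keeps the category dtype
sorted_labels = list(df.sort_index().index)  # sorted by category order, not lexically
```

With a ``CategoricalIndex``, ``sort_index`` follows the declared category order rather than alphabetical order.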

Side Effects
~~~~~~~~~~~~
10 changes: 5 additions & 5 deletions doc/source/contributing.rst
@@ -113,10 +113,10 @@ This creates the directory `pandas-yourname` and connects your repository to
the upstream (main project) *pandas* repository.

The testing suite will run automatically on Travis-CI once your Pull Request is
submitted. However, if you wish to run the test suite on a branch prior to
submitted. However, if you wish to run the test suite on a branch prior to
submitting the Pull Request, then Travis-CI needs to be hooked up to your
GitHub repository. Instructions for doing so are `here
<http://about.travis-ci.org/docs/user/getting-started/>`_.
<http://about.travis-ci.org/docs/user/getting-started/>`__.

Creating a Branch
-----------------
@@ -219,7 +219,7 @@ To return to you home root environment:
deactivate

See the full ``conda`` docs `here
<http://conda.pydata.org/docs>`_.
<http://conda.pydata.org/docs>`__.

At this point you can easily do an *in-place* install, as detailed in the next section.

@@ -372,7 +372,7 @@ If you want to do a full clean build, do::
Starting with 0.13.1 you can tell ``make.py`` to compile only a single section
of the docs, greatly reducing the turn-around time for checking your changes.
You will be prompted to delete `.rst` files that aren't required. This is okay
since the prior version can be checked out from git, but make sure to
since the prior version can be checked out from git, but make sure to
not commit the file deletions.

::
@@ -401,7 +401,7 @@ Built Master Branch Documentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When pull-requests are merged into the pandas *master* branch, the main parts of the documentation are
also built by Travis-CI. These docs are then hosted `here <http://pandas-docs.github.io/pandas-docs-travis>`_.
also built by Travis-CI. These docs are then hosted `here <http://pandas-docs.github.io/pandas-docs-travis>`__.

Contributing to the code base
=============================
4 changes: 2 additions & 2 deletions doc/source/install.rst
@@ -35,7 +35,7 @@ pandas at all.
Simply create an account, and have access to pandas from within your browser via
an `IPython Notebook <http://ipython.org/notebook.html>`__ in a few minutes.

.. _install.anaconda
.. _install.anaconda:

Installing pandas with Anaconda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -68,7 +68,7 @@ admin rights to install it, it will install in the user's home directory, and
this also makes it trivial to delete Anaconda at a later date (just delete
that folder).

.. _install.miniconda
.. _install.miniconda:

Installing pandas with Miniconda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4 changes: 2 additions & 2 deletions doc/source/text.rst
@@ -82,11 +82,11 @@ Elements in the split lists can be accessed using ``get`` or ``[]`` notation:
s2.str.split('_').str.get(1)
s2.str.split('_').str[1]

Easy to expand this to return a DataFrame using ``return_type``.
Easy to expand this to return a DataFrame using ``expand``.

.. ipython:: python

s2.str.split('_', return_type='frame')
s2.str.split('_', expand=True)
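
For comparison, a small self-contained sketch of both return shapes. The contents of ``s2`` are assumed here, since its definition lies outside this hunk:

```python
import numpy as np
import pandas as pd

# Assumed sample data; the real ``s2`` is defined earlier in text.rst.
s2 = pd.Series(["a_b_c", "c_d_e", np.nan, "f_g_h"])

lists = s2.str.split("_")               # Series whose elements are lists
frame = s2.str.split("_", expand=True)  # DataFrame with one column per piece
```

Missing values stay missing in both forms; with ``expand=True`` the whole row is filled with missing values.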

Methods like ``replace`` and ``findall`` take `regular expressions
<https://docs.python.org/2/library/re.html>`__, too:
4 changes: 2 additions & 2 deletions doc/source/visualization.rst
@@ -220,8 +220,8 @@ Histogram can be drawn specifying ``kind='hist'``.

.. ipython:: python

df4 = pd.DataFrame({'a': randn(1000) + 1, 'b': randn(1000),
'c': randn(1000) - 1}, columns=['a', 'b', 'c'])
df4 = pd.DataFrame({'a': np.random.randn(1000) + 1, 'b': np.random.randn(1000),
'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])

plt.figure();

169 changes: 74 additions & 95 deletions doc/source/whatsnew/v0.16.1.txt
@@ -31,44 +31,6 @@ Highlights include:
Enhancements
~~~~~~~~~~~~

- ``BusinessHour`` offset is now supported, which represents business hours starting from 09:00 - 17:00 on ``BusinessDay`` by default. See :ref:`Here <timeseries.businesshour>` for details. (:issue:`7905`)

.. ipython:: python

Timestamp('2014-08-01 09:00') + BusinessHour()
Timestamp('2014-08-01 07:00') + BusinessHour()
Timestamp('2014-08-01 16:30') + BusinessHour()

- ``DataFrame.diff`` now takes an ``axis`` parameter that determines the direction of differencing (:issue:`9727`)

- Allow ``clip``, ``clip_lower``, and ``clip_upper`` to accept array-like arguments as thresholds (This is a regression from 0.11.0). These methods now have an ``axis`` parameter which determines how the Series or DataFrame will be aligned with the threshold(s). (:issue:`6966`)

- ``DataFrame.mask()`` and ``Series.mask()`` now support same keywords as ``where`` (:issue:`8801`)

- ``drop`` function can now accept ``errors`` keyword to suppress ``ValueError`` raised when any of the labels does not exist in the target data. (:issue:`6736`)

.. ipython:: python

df = DataFrame(np.random.randn(3, 3), columns=['A', 'B', 'C'])
df.drop(['A', 'X'], axis=1, errors='ignore')

- Allow conversion of values with dtype ``datetime64`` or ``timedelta64`` to strings using ``astype(str)`` (:issue:`9757`)
- ``get_dummies`` function now accepts ``sparse`` keyword. If set to ``True``, the return ``DataFrame`` is sparse, e.g. ``SparseDataFrame``. (:issue:`8823`)
- ``Period`` now accepts ``datetime64`` as value input. (:issue:`9054`)

- Allow timedelta string conversion when leading zero is missing from time definition, i.e. `0:00:00` vs `00:00:00`. (:issue:`9570`)
- Allow ``Panel.shift`` with ``axis='items'`` (:issue:`9890`)

- Trying to write an excel file now raises ``NotImplementedError`` if the ``DataFrame`` has a ``MultiIndex`` instead of writing a broken Excel file. (:issue:`9794`)
- Allow ``Categorical.add_categories`` to accept ``Series`` or ``np.array``. (:issue:`9927`)

- Add/delete ``str/dt/cat`` accessors dynamically from ``__dir__``. (:issue:`9910`)
- Add ``normalize`` as a ``dt`` accessor method. (:issue:`10047`)

- ``DataFrame`` and ``Series`` now have a ``_constructor_expanddim`` property as an overridable constructor for data of one higher dimension. This should be used only when it is really needed; see :ref:`here <ref-subclassing-pandas>`.

- ``pd.lib.infer_dtype`` now returns ``'bytes'`` in Python 3 where appropriate. (:issue:`10032`)

.. _whatsnew_0161.enhancements.categoricalindex:

CategoricalIndex
@@ -188,16 +150,6 @@ String Methods Enhancements
:ref:`Continuing from v0.16.0 <whatsnew_0160.enhancements.string>`, the following
enhancements make string operations easier and more consistent with standard python string operations.

- The following new methods are accessible via the ``.str`` accessor to apply the function to each value. (:issue:`9766`, :issue:`9773`, :issue:`10031`, :issue:`10045`, :issue:`10052`)

================ =============== =============== =============== ================
.. .. Methods .. ..
================ =============== =============== =============== ================
``capitalize()`` ``swapcase()`` ``normalize()`` ``partition()`` ``rpartition()``
``index()`` ``rindex()`` ``translate()``
================ =============== =============== =============== ================



- Added ``StringMethods`` (``.str`` accessor) to ``Index`` (:issue:`9068`)

@@ -220,6 +172,14 @@ enhancements make string operations easier and more consistent with standard pyt
idx.str.startswith('a')
s[s.index.str.startswith('a')]

- The following new methods are accessible via the ``.str`` accessor to apply the function to each value. (:issue:`9766`, :issue:`9773`, :issue:`10031`, :issue:`10045`, :issue:`10052`)

================ =============== =============== =============== ================
.. .. Methods .. ..
================ =============== =============== =============== ================
``capitalize()`` ``swapcase()`` ``normalize()`` ``partition()`` ``rpartition()``
``index()`` ``rindex()`` ``translate()``
================ =============== =============== =============== ================

- ``split`` now takes ``expand`` keyword to specify whether to expand dimensionality. ``return_type`` is deprecated. (:issue:`9847`)

@@ -244,14 +204,59 @@ enhancements make string operations easier and more consistent with standard pyt

- Improved ``extract`` and ``get_dummies`` methods for ``Index.str`` (:issue:`9980`)

.. _whatsnew_0161.api:

API changes
~~~~~~~~~~~
.. _whatsnew_0161.enhancements.other:

Other Enhancements
^^^^^^^^^^^^^^^^^^

- ``BusinessHour`` offset is now supported, which represents business hours starting from 09:00 - 17:00 on ``BusinessDay`` by default. See :ref:`Here <timeseries.businesshour>` for details. (:issue:`7905`)

.. ipython:: python

from pandas.tseries.offsets import BusinessHour
Timestamp('2014-08-01 09:00') + BusinessHour()
Timestamp('2014-08-01 07:00') + BusinessHour()
Timestamp('2014-08-01 16:30') + BusinessHour()

- ``DataFrame.diff`` now takes an ``axis`` parameter that determines the direction of differencing (:issue:`9727`)

- Allow ``clip``, ``clip_lower``, and ``clip_upper`` to accept array-like arguments as thresholds (This is a regression from 0.11.0). These methods now have an ``axis`` parameter which determines how the Series or DataFrame will be aligned with the threshold(s). (:issue:`6966`)

- ``DataFrame.mask()`` and ``Series.mask()`` now support same keywords as ``where`` (:issue:`8801`)

- ``drop`` function can now accept ``errors`` keyword to suppress ``ValueError`` raised when any of the labels does not exist in the target data. (:issue:`6736`)

.. ipython:: python

df = DataFrame(np.random.randn(3, 3), columns=['A', 'B', 'C'])
df.drop(['A', 'X'], axis=1, errors='ignore')

- Add support for separating years and quarters using dashes, for
example 2014-Q1. (:issue:`9688`)

- Allow conversion of values with dtype ``datetime64`` or ``timedelta64`` to strings using ``astype(str)`` (:issue:`9757`)
- ``get_dummies`` function now accepts ``sparse`` keyword. If set to ``True``, the return ``DataFrame`` is sparse, e.g. ``SparseDataFrame``. (:issue:`8823`)
- ``Period`` now accepts ``datetime64`` as value input. (:issue:`9054`)

- Allow timedelta string conversion when leading zero is missing from time definition, i.e. `0:00:00` vs `00:00:00`. (:issue:`9570`)
- Allow ``Panel.shift`` with ``axis='items'`` (:issue:`9890`)

- Trying to write an excel file now raises ``NotImplementedError`` if the ``DataFrame`` has a ``MultiIndex`` instead of writing a broken Excel file. (:issue:`9794`)
- Allow ``Categorical.add_categories`` to accept ``Series`` or ``np.array``. (:issue:`9927`)

- Add/delete ``str/dt/cat`` accessors dynamically from ``__dir__``. (:issue:`9910`)
- Add ``normalize`` as a ``dt`` accessor method. (:issue:`10047`)

- ``DataFrame`` and ``Series`` now have a ``_constructor_expanddim`` property as an overridable constructor for data of one higher dimension. This should be used only when it is really needed; see :ref:`here <ref-subclassing-pandas>`.

- ``pd.lib.infer_dtype`` now returns ``'bytes'`` in Python 3 where appropriate. (:issue:`10032`)
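
Two of the enhancements above that have no example in the list can be sketched as follows. The frame contents are made up for illustration, and the snippet assumes a pandas version that includes these changes:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"A": [1, 2, 4], "B": [3, 6, 12]})

row_diff = df.diff()         # default axis=0: difference with the previous row
col_diff = df.diff(axis=1)   # axis=1: difference with the previous column

# Year and quarter separated by a dash.
quarter = pd.Period("2014-Q1")
```

With ``axis=1`` the first column has nothing to its left, so it comes back as all missing values.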


.. _whatsnew_0161.api:

API changes
~~~~~~~~~~~

- When passing in an ax to ``df.plot( ..., ax=ax)``, the `sharex` kwarg will now default to `False`.
The result is that the visibility of xlabels and xticklabels will no longer be changed. You
@@ -260,16 +265,19 @@ API changes
If pandas creates the subplots itself (e.g. no passed in `ax` kwarg), then the
default is still ``sharex=True`` and the visibility changes are applied.



- Add support for separating years and quarters using dashes, for
example 2014-Q1. (:issue:`9688`)

- :meth:`~pandas.DataFrame.assign` now inserts new columns in alphabetical order. Previously
the order was arbitrary. (:issue:`9777`)

- By default, ``read_csv`` and ``read_table`` will now try to infer the compression type based on the file extension. Set ``compression=None`` to restore the previous behavior (no decompression). (:issue:`9770`)
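
A hedged sketch of the compression inference described above. The file name ``example.csv.gz`` and the data are placeholders, and the snippet assumes a pandas version in which ``to_csv`` accepts ``compression``:

```python
import os
import tempfile

import pandas as pd

# Placeholder data for the round trip.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv.gz")  # hypothetical file name
    df.to_csv(path, index=False, compression="gzip")

    inferred = pd.read_csv(path)                       # compression inferred from ".gz"
    explicit = pd.read_csv(path, compression="gzip")   # same result, stated explicitly
```

Passing ``compression=None`` on the same file would skip decompression, which is the previous behavior mentioned in the entry.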

.. _whatsnew_0161.deprecations:

Deprecations
^^^^^^^^^^^^

- ``Series.str.split``'s ``return_type`` keyword was deprecated in favor of ``expand`` (:issue:`9847`)


.. _whatsnew_0161.index_repr:

Index Representation
@@ -303,25 +311,17 @@ New Behavior

.. ipython:: python

pd.set_option('display.width',100)
pd.Index(range(4),name='foo')
pd.Index(range(25),name='foo')
pd.Index(range(104),name='foo')
pd.Index(['datetime', 'sA', 'sB', 'sC', 'flow', 'error', 'temp', 'ref', 'a_bit_a_longer_one']*2)
pd.CategoricalIndex(['a','bb','ccc','dddd'],ordered=True,name='foobar')
pd.CategoricalIndex(['a','bb','ccc','dddd']*10,ordered=True,name='foobar')
pd.CategoricalIndex(['a','bb','ccc','dddd']*100,ordered=True,name='foobar')
pd.CategoricalIndex(np.arange(1000),ordered=True,name='foobar')
pd.date_range('20130101',periods=4,name='foo',tz='US/Eastern')
pd.date_range('20130101',periods=25,name='foo',tz='US/Eastern')
pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')

.. _whatsnew_0161.deprecations:
pd.set_option('display.width', 80)
pd.Index(range(4), name='foo')
pd.Index(range(30), name='foo')
pd.Index(range(104), name='foo')
pd.CategoricalIndex(['a','bb','ccc','dddd'], ordered=True, name='foobar')
pd.CategoricalIndex(['a','bb','ccc','dddd']*10, ordered=True, name='foobar')
pd.CategoricalIndex(['a','bb','ccc','dddd']*100, ordered=True, name='foobar')
pd.date_range('20130101',periods=4, name='foo', tz='US/Eastern')
pd.date_range('20130101',periods=25, freq='D')
pd.date_range('20130101',periods=104, name='foo', tz='US/Eastern')

Deprecations
^^^^^^^^^^^^

- ``Series.str.split``'s ``return_type`` keyword was deprecated in favor of ``expand`` (:issue:`9847`)

.. _whatsnew_0161.performance:

@@ -333,7 +333,6 @@ Performance Improvements
- Improved the performance of ``pd.lib.max_len_string_array`` by 5-7x (:issue:`10024`)



.. _whatsnew_0161.bug_fixes:

Bug Fixes
@@ -361,7 +360,6 @@ Bug Fixes
- Bug where repeated plotting of ``DataFrame`` with a ``DatetimeIndex`` may raise ``TypeError`` (:issue:`9852`)
- Bug in ``setup.py`` that would allow an incompatible Cython version to build (:issue:`9827`)
- Bug in plotting ``secondary_y`` incorrectly attaches ``right_ax`` property to secondary axes specifying itself recursively. (:issue:`9861`)

- Bug in ``Series.quantile`` on empty Series of type ``Datetime`` or ``Timedelta`` (:issue:`9675`)
- Bug in ``where`` causing incorrect results when upcasting was required (:issue:`9731`)
- Bug in ``FloatArrayFormatter`` where decision boundary for displaying "small" floats in decimal format is off by one order of magnitude for a given display.precision (:issue:`9764`)
@@ -372,20 +370,13 @@ Bug Fixes
- Bug in index equality comparisons using ``==`` failing on Index/MultiIndex type incompatibility (:issue:`9785`)
- Bug in which ``SparseDataFrame`` could not take `nan` as a column name (:issue:`8822`)
- Bug in ``to_msgpack`` and ``read_msgpack`` zlib and blosc compression support (:issue:`9783`)

- Bug where ``GroupBy.size`` doesn't attach the index name properly if grouped by ``TimeGrouper`` (:issue:`9925`)
- Bug causing an exception in slice assignments because ``length_of_indexer`` returns wrong results (:issue:`9995`)
- Bug in csv parser causing lines with initial whitespace plus one non-space character to be skipped. (:issue:`9710`)
- Bug in C csv parser causing spurious NaNs when data started with newline followed by whitespace. (:issue:`10022`)

- Bug causing elements with a null group to spill into the final group when grouping by a ``Categorical`` (:issue:`9603`)
- Bug where .iloc and .loc behavior is not consistent on empty dataframes (:issue:`9964`)

- Bug in invalid attribute access on a ``TimedeltaIndex`` incorrectly raised ``ValueError`` instead of ``AttributeError`` (:issue:`9680`)




- Bug in unequal comparisons between categorical data and a scalar, which was not in the categories (e.g. ``Series(Categorical(list("abc"), ordered=True)) > "d"``. This returned ``False`` for all elements, but now raises a ``TypeError``. Equality comparisons also now return ``False`` for ``==`` and ``True`` for ``!=``. (:issue:`9848`)
- Bug in DataFrame ``__setitem__`` when right hand side is a dictionary (:issue:`9874`)
- Bug in ``where`` when dtype is ``datetime64/timedelta64``, but dtype of other is not (:issue:`9804`)
@@ -394,25 +385,13 @@ Bug Fixes
- Bug in ``DataFrame`` constructor when ``columns`` parameter is set, and ``data`` is an empty list (:issue:`9939`)
- Bug in bar plot with ``log=True`` raises ``TypeError`` if all values are less than 1 (:issue:`9905`)
- Bug in horizontal bar plot ignores ``log=True`` (:issue:`9905`)



- Bug in PyTables queries that did not return proper results using the index (:issue:`8265`, :issue:`9676`)




- Bug where dividing a dataframe containing values of type ``Decimal`` by another ``Decimal`` would raise. (:issue:`9787`)
- Bug where using DataFrames asfreq would remove the name of the index. (:issue:`9885`)
- Bug causing extra index point when resample BM/BQ (:issue:`9756`)
- Changed caching in ``AbstractHolidayCalendar`` to be at the instance level rather than at the class level as the latter can result in unexpected behaviour. (:issue:`9552`)

- Fixed latex output for multi-indexed dataframes (:issue:`9778`)
- Bug causing an exception when setting an empty range using ``DataFrame.loc`` (:issue:`9596`)




- Bug in hiding ticklabels with subplots and shared axes when adding a new plot to an existing grid of axes (:issue:`9158`)
- Bug in ``transform`` and ``filter`` when grouping on a categorical variable (:issue:`9921`)
- Bug in ``transform`` when groups are equal in number and dtype to the input index (:issue:`9700`)