DOC: clean-up v0.15.1 whatsnew file #8755

Merged
185 changes: 86 additions & 99 deletions doc/source/whatsnew/v0.15.1.txt
@@ -16,29 +16,32 @@ users upgrade to this version.
API changes
~~~~~~~~~~~

- Represent ``MultiIndex`` labels with a dtype that utilizes memory based on the level size. In prior versions, the memory usage was a constant 8 bytes per element in each level. In addition, in prior versions, the *reported* memory usage was incorrect as it didn't show the usage for the memory occupied by the underlying data array. (:issue:`8456`)
- ``s.dt.hour`` and other ``.dt`` accessors will now return ``np.nan`` for missing values (rather than -1, as previously) (:issue:`8689`):

.. ipython:: python

dfi = DataFrame(1,index=pd.MultiIndex.from_product([['a'],range(1000)]),columns=['A'])
s = Series(date_range('20130101',periods=5,freq='D'))
s.iloc[2] = np.nan
s

previous behavior:

.. code-block:: python

# this was underreported in prior versions
In [1]: dfi.memory_usage(index=True)
Out[1]:
Index 8000 # took about 24008 bytes in < 0.15.1
A 8000
In [6]: s.dt.hour
Out[6]:
0 0
1 0
2 -1
3 0
4 0
dtype: int64


current behavior:

.. ipython:: python

dfi.memory_usage(index=True)
s.dt.hour
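
The ``.dt`` change above can be sketched as follows. This is a minimal illustration that assumes a recent pandas, where a missing datetime is stored as ``NaT``; the names ``hours`` and ``missing_mask`` are illustrative only and do not come from the whatsnew file.

```python
import pandas as pd

# Five consecutive days, with the third entry set to missing (NaT).
s = pd.Series(pd.date_range("20130101", periods=5, freq="D"))
s.iloc[2] = pd.NaT

# Extract the hour component through the .dt accessor.
hours = s.dt.hour

# The missing entry surfaces as NaN rather than a -1 sentinel,
# so the usual missing-data tooling can find it.
missing_mask = hours.isna()

print(hours)
print(missing_mask.tolist())
```

Because the missing entry is now a real missing value, it is skipped by reductions such as ``hours.mean()`` instead of being counted as ``-1``.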

- ``groupby`` with ``as_index=False`` will not add erroneous extra columns to
result (:issue:`8582`):
@@ -95,56 +98,7 @@ API changes

gr.apply(sum)

- ``concat`` permits a wider variety of iterables of pandas objects to be
passed as the first parameter (:issue:`8645`):

.. ipython:: python

from collections import deque
df1 = pd.DataFrame([1, 2, 3])
df2 = pd.DataFrame([4, 5, 6])

previous behavior:

.. code-block:: python

In [7]: pd.concat(deque((df1, df2)))
TypeError: first argument must be a list-like of pandas objects, you passed an object of type "deque"

current behavior:

.. ipython:: python

pd.concat(deque((df1, df2)))

- ``s.dt.hour`` and other ``.dt`` accessors will now return ``np.nan`` for missing values (rather than -1, as previously) (:issue:`8689`):

.. ipython:: python

s = Series(date_range('20130101',periods=5,freq='D'))
s.iloc[2] = np.nan
s

previous behavior:

.. code-block:: python

In [6]: s.dt.hour
Out[6]:
0 0
1 0
2 -1
3 0
4 0
dtype: int64

current behavior:

.. ipython:: python

s.dt.hour

- support for slicing with monotonic decreasing indexes, even if ``start`` or ``stop`` is
- Support for slicing with monotonic decreasing indexes, even if ``start`` or ``stop`` is
not found in the index (:issue:`7860`):

.. ipython:: python
@@ -165,14 +119,14 @@ API changes

s.loc[3.5:1.5]
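
A short sketch of slicing on a monotonic decreasing index, together with the monotonicity properties. The data are hypothetical, and a float index is an assumption made here to keep the label types consistent with the float slice bounds; the collapsed part of this diff may use a different setup.

```python
import pandas as pd

# Hypothetical data: labels sorted in decreasing order.
s = pd.Series(["a", "b", "c", "d"], index=[4.0, 3.0, 2.0, 1.0])

# The index can report its own ordering.
print(s.index.is_monotonic_decreasing)
print(s.index.is_monotonic_increasing)

# Neither 3.5 nor 1.5 is a label in the index; the slice selects
# the labels that fall between the two bounds.
subset = s.loc[3.5:1.5]
print(subset)
```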

- added Index properties `is_monotonic_increasing` and `is_monotonic_decreasing` (:issue:`8680`).

- ``io.data.Options`` has been fixed for a change in the format of the Yahoo Options page (:issue:`8612`), (:issue:`8741`)

.. note:: io.data.Options has been fixed for a change in the format of the Yahoo Options page (:issue:`8612`), (:issue:`8741`)
.. note::

As a result of a change in Yahoo's option page layout, when an expiry date is given,
``Options`` methods now return data for a single expiry date. Previously, methods returned all
data for the selected month.
As a result of a change in Yahoo's option page layout, when an expiry date is given,
``Options`` methods now return data for a single expiry date. Previously, methods returned all
data for the selected month.

The ``month`` and ``year`` parameters have been undeprecated and can be used to get all
options data for a given month.
@@ -185,11 +139,11 @@ API changes

New features:

The expiry parameter can now be a single date or a list-like object containing dates.
- The expiry parameter can now be a single date or a list-like object containing dates.

A new property ``expiry_dates`` was added, which returns all available expiry dates.
- A new property ``expiry_dates`` was added, which returns all available expiry dates.

current behavior:
Current behavior:

.. ipython:: python

@@ -215,16 +169,78 @@ API changes
Enhancements
~~~~~~~~~~~~

- ``concat`` permits a wider variety of iterables of pandas objects to be
passed as the first parameter (:issue:`8645`):

.. ipython:: python

from collections import deque
df1 = pd.DataFrame([1, 2, 3])
df2 = pd.DataFrame([4, 5, 6])

previous behavior:

.. code-block:: python

In [7]: pd.concat(deque((df1, df2)))
TypeError: first argument must be a list-like of pandas objects, you passed an object of type "deque"

current behavior:

.. ipython:: python

pd.concat(deque((df1, df2)))
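
As a rough check of the ``concat`` behavior described above, the sketch below compares a ``deque`` input with an equivalent tuple input. The variable names ``from_deque`` and ``from_tuple`` are illustrative and not part of the whatsnew file.

```python
from collections import deque

import pandas as pd

df1 = pd.DataFrame([1, 2, 3])
df2 = pd.DataFrame([4, 5, 6])

# A deque is an iterable of pandas objects, so it is accepted
# in the same way as a list or tuple.
from_deque = pd.concat(deque((df1, df2)))
from_tuple = pd.concat((df1, df2))

print(from_deque)
print(from_deque.equals(from_tuple))
```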

- Represent ``MultiIndex`` labels with a dtype that utilizes memory based on the level size. In prior versions, the memory usage was a constant 8 bytes per element in each level. In addition, in prior versions, the *reported* memory usage was incorrect as it didn't show the usage for the memory occupied by the underlying data array. (:issue:`8456`)

.. ipython:: python

dfi = DataFrame(1,index=pd.MultiIndex.from_product([['a'],range(1000)]),columns=['A'])

previous behavior:

.. code-block:: python

# this was underreported in prior versions
In [1]: dfi.memory_usage(index=True)
Out[1]:
Index 8000 # took about 24008 bytes in < 0.15.1
A 8000
dtype: int64


current behavior:

.. ipython:: python

dfi.memory_usage(index=True)
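
One way to inspect the size-dependent label dtype is sketched below. It assumes a recent pandas, where the integer labels are exposed as ``MultiIndex.codes``; that attribute name is an assumption about newer releases and is not the 0.15.1 spelling.

```python
import pandas as pd

mi = pd.MultiIndex.from_product([["a"], range(1000)])
dfi = pd.DataFrame(1, index=mi, columns=["A"])

# Assumption: recent pandas exposes the integer labels as `codes`.
# Each level's codes use an integer dtype sized to the level length.
for level, codes in zip(mi.levels, mi.codes):
    print(len(level), codes.dtype, codes.nbytes)

# Reported memory usage, including the index.
print(dfi.memory_usage(index=True))
```

A level with one entry and a level with 1000 entries both fit in integer types narrower than 8 bytes, which is where the memory saving comes from.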

- Added Index properties ``is_monotonic_increasing`` and ``is_monotonic_decreasing`` (:issue:`8680`).

- Added option to select columns when importing Stata files (:issue:`7935`)

- Qualify memory usage in ``DataFrame.info()`` by adding ``+`` if it is a lower bound (:issue:`8578`)

- Raise errors in certain aggregation cases where an argument such as ``numeric_only`` is not handled (:issue:`8592`).

- Added support for 3-character ISO and non-standard country codes in :func:`io.wb.download()` (:issue:`8482`)

- :ref:`World Bank data requests <remote_data.wb>` now will warn/raise based
on an ``errors`` argument, as well as a list of hard-coded country codes and
the World Bank's JSON response. In prior versions, the error messages
didn't look at the World Bank's JSON response. Problem-inducing inputs were
simply dropped prior to the request. The issue was that many valid countries
were dropped by the hard-coded approach. All countries will work now, but
some invalid countries will raise exceptions because some edge cases break the
entire response. (:issue:`8482`)

- Added support for 3-character ISO and non-standard country codes in :func:``io.wb.download()`` (:issue:`8482`)
- :ref:`World Bank data requests <remote_data.wb>` now will warn/raise based on an ``errors`` argument, as well as a list of hard-coded country codes and the World Bank's JSON response. In prior versions, the error messages didn't look at the World Bank's JSON response. Problem-inducing input were simply dropped prior to the request. The issue was that many good countries were cropped in the hard-coded approach. All countries will work now, but some bad countries will raise exceptions because some edge cases break the entire response. (:issue:`8482`)
- Added option to ``Series.str.split()`` to return a ``DataFrame`` rather than a ``Series`` (:issue:`8428`)

- Added option to ``df.info(null_counts=None|True|False)`` to override the default display options and force showing of the null-counts (:issue:`8701`)


.. _whatsnew_0151.bug_fixes:

Bug Fixes
~~~~~~~~~

@@ -243,48 +259,19 @@ Bug Fixes
- Compat issue in ``DataFrame.dtypes`` when ``options.mode.use_inf_as_null`` is True (:issue:`8722`)
- Bug in ``read_csv``, ``dialect`` parameter would not take a string (:issue:`8703`)
- Bug in slicing a multi-index level with an empty-list (:issue:`8737`)





- Bug in numeric index operations of add/sub with Float/Index Index with numpy arrays (:issue:`8608`)
- Bug in setitem with empty indexer and unwanted coercion of dtypes (:issue:`8669`)







- Bug in ix/loc block splitting on setitem (manifests with integer-like dtypes, e.g. datetime64) (:issue:`8607`)


- Bug when doing label based indexing with integers not found in the index for
non-unique but monotonic indexes (:issue:`8680`).
- Bug when indexing a Float64Index with ``np.nan`` on numpy 1.7 (:issue:`8980`).










- Fix ``shape`` attribute for ``MultiIndex`` (:issue:`8609`)
- Bug in ``GroupBy`` where a name conflict between the grouper and columns
would break ``groupby`` operations (:issue:`7115`, :issue:`8112`)



- Fixed a bug where plotting a column ``y`` and specifying a label would mutate the index name of the original DataFrame (:issue:`8494`)
- Fix regression in plotting of a DatetimeIndex directly with matplotlib (:issue:`8614`).

- Bug in ``date_range`` where partially-specified dates would incorporate current date (:issue:`6961`)

- Bug where setting by indexer to a scalar value with a mixed-dtype ``Panel4d`` was failing (:issue:`8702`)

- Bug where ``DataReader`` would fail if one of the symbols passed was invalid. Now returns data for valid symbols and ``np.nan`` for invalid ones (:issue:`8494`)
- Bug in ``get_quote_yahoo`` that wouldn't allow non-float return values (:issue:`5229`).