Commit 4d4a2e3

Merge pull request #11059 from jorisvandenbossche/whatsnew017
DOC: clean up 0.17 whatsnew
2 parents 52f4b75 + 7d35d97 commit 4d4a2e3

File tree

1 file changed

+106
-95
lines changed

doc/source/whatsnew/v0.17.0.txt

Lines changed: 106 additions & 95 deletions
Original file line numberDiff line numberDiff line change
@@ -21,9 +21,14 @@ users upgrade to this version.
2121

2222
After installing pandas-datareader, you can easily change your imports:
2323

24-
.. code-block:: Python
24+
.. code-block:: python
25+
26+
from pandas.io import data, wb
27+
28+
becomes
29+
30+
.. code-block:: python
2531

26-
from pandas.io import data, wb # becomes
2732
from pandas_datareader import data, wb
2833

2934
Highlights include:
@@ -53,44 +58,60 @@ Check the :ref:`API Changes <whatsnew_0170.api>` and :ref:`deprecations <whatsne
5358
New features
5459
~~~~~~~~~~~~
5560

56-
- ``merge`` now accepts the argument ``indicator`` which adds a Categorical-type column (by default called ``_merge``) to the output object that takes on the values (:issue:`8790`)
61+
.. _whatsnew_0170.tz:
5762

58-
=================================== ================
59-
Observation Origin ``_merge`` value
60-
=================================== ================
61-
Merge key only in ``'left'`` frame ``left_only``
62-
Merge key only in ``'right'`` frame ``right_only``
63-
Merge key in both frames ``both``
64-
=================================== ================
63+
Datetime with TZ
64+
^^^^^^^^^^^^^^^^
6565

66-
.. ipython:: python
66+
We are adding an implementation that natively supports datetime with timezones. A ``Series`` or a ``DataFrame`` column previously
67+
*could* be assigned a datetime with timezones, and would work as an ``object`` dtype. This had performance issues with a large
68+
number of rows. See the :ref:`docs <timeseries.timezone_series>` for more details. (:issue:`8260`, :issue:`10763`, :issue:`11034`).
6769

68-
df1 = pd.DataFrame({'col1':[0,1], 'col_left':['a','b']})
69-
df2 = pd.DataFrame({'col1':[1,2,2],'col_right':[2,2,2]})
70-
pd.merge(df1, df2, on='col1', how='outer', indicator=True)
70+
The new implementation allows for having a single-timezone across all rows, with operations in a performant manner.
7171

72-
For more, see the :ref:`updated docs <merging.indicator>`
72+
.. ipython:: python
7373

74-
- ``DataFrame`` has gained the ``nlargest`` and ``nsmallest`` methods (:issue:`10393`)
75-
- SQL io functions now accept a SQLAlchemy connectable. (:issue:`7877`)
76-
- Enable writing complex values to HDF stores when using table format (:issue:`10447`)
77-
- Enable reading gzip compressed files via URL, either by explicitly setting the compression parameter or by inferring from the presence of the HTTP Content-Encoding header in the response (:issue:`8685`)
78-
- Add a ``limit_direction`` keyword argument that works with ``limit`` to enable ``interpolate`` to fill ``NaN`` values forward, backward, or both (:issue:`9218` and :issue:`10420`)
74+
df = DataFrame({'A' : date_range('20130101',periods=3),
75+
'B' : date_range('20130101',periods=3,tz='US/Eastern'),
76+
'C' : date_range('20130101',periods=3,tz='CET')})
77+
df
78+
df.dtypes
7979

80-
.. ipython:: python
80+
.. ipython:: python
8181

82-
ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13])
83-
ser.interpolate(limit=1, limit_direction='both')
82+
df.B
83+
df.B.dt.tz_localize(None)
8484

85-
- Round DataFrame to variable number of decimal places (:issue:`10568`).
85+
This uses a new-dtype representation as well, that is very similar in look-and-feel to its numpy cousin ``datetime64[ns]``
8686

87-
.. ipython :: python
87+
.. ipython:: python
8888

89-
df = pd.DataFrame(np.random.random([3, 3]), columns=['A', 'B', 'C'],
90-
index=['first', 'second', 'third'])
91-
df
92-
df.round(2)
93-
df.round({'A': 0, 'C': 2})
89+
df['B'].dtype
90+
type(df['B'].dtype)
91+
92+
.. note::
93+
94+
There is a slightly different string repr for the underlying ``DatetimeIndex`` as a result of the dtype changes, but
95+
functionally these are the same.
96+
97+
Previous Behavior:
98+
99+
.. code-block:: python
100+
101+
In [1]: pd.date_range('20130101',periods=3,tz='US/Eastern')
102+
Out[1]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
103+
'2013-01-03 00:00:00-05:00'],
104+
dtype='datetime64[ns]', freq='D', tz='US/Eastern')
105+
106+
In [2]: pd.date_range('20130101',periods=3,tz='US/Eastern').dtype
107+
Out[2]: dtype('<M8[ns]')
108+
109+
New Behavior:
110+
111+
.. ipython:: python
112+
113+
pd.date_range('20130101',periods=3,tz='US/Eastern')
114+
pd.date_range('20130101',periods=3,tz='US/Eastern').dtype
94115

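The timezone-aware dtype described above can be checked directly. The following is a minimal sketch, assuming a recent pandas with the ``US/Eastern`` zone available; the column names ``naive`` and ``eastern`` are illustrative and not part of the release notes.

```python
import pandas as pd

# Two columns built the same way, one naive and one timezone-aware.
df = pd.DataFrame({
    "naive": pd.date_range("20130101", periods=3),
    "eastern": pd.date_range("20130101", periods=3, tz="US/Eastern"),
})

# The tz-aware column carries a dedicated dtype rather than ``object``.
eastern_dtype = df["eastern"].dtype

# Dropping the timezone keeps the local wall-clock times.
naive_wall_times = df["eastern"].dt.tz_localize(None)

print(df.dtypes)
print(type(eastern_dtype))
```

Because the timezone lives on the dtype, every row in the column shares it, which is what allows the vectorized operations mentioned above.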
95116
.. _whatsnew_0170.gil:
96117

@@ -286,6 +307,46 @@ has been changed to make this keyword unnecessary - the change is shown below.
286307
Other enhancements
287308
^^^^^^^^^^^^^^^^^^
288309

310+
311+
- ``merge`` now accepts the argument ``indicator`` which adds a Categorical-type column (by default called ``_merge``) to the output object that takes on the values (:issue:`8790`)
312+
313+
=================================== ================
314+
Observation Origin ``_merge`` value
315+
=================================== ================
316+
Merge key only in ``'left'`` frame ``left_only``
317+
Merge key only in ``'right'`` frame ``right_only``
318+
Merge key in both frames ``both``
319+
=================================== ================
320+
321+
.. ipython:: python
322+
323+
df1 = pd.DataFrame({'col1':[0,1], 'col_left':['a','b']})
324+
df2 = pd.DataFrame({'col1':[1,2,2],'col_right':[2,2,2]})
325+
pd.merge(df1, df2, on='col1', how='outer', indicator=True)
326+
327+
For more, see the :ref:`updated docs <merging.indicator>`
328+
329+
- ``DataFrame`` has gained the ``nlargest`` and ``nsmallest`` methods (:issue:`10393`)
330+
- SQL io functions now accept a SQLAlchemy connectable. (:issue:`7877`)
331+
- Enable writing complex values to HDF stores when using table format (:issue:`10447`)
332+
- Enable reading gzip compressed files via URL, either by explicitly setting the compression parameter or by inferring from the presence of the HTTP Content-Encoding header in the response (:issue:`8685`)
333+
- Add a ``limit_direction`` keyword argument that works with ``limit`` to enable ``interpolate`` to fill ``NaN`` values forward, backward, or both (:issue:`9218` and :issue:`10420`)
334+
335+
.. ipython:: python
336+
337+
ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13])
338+
ser.interpolate(limit=1, limit_direction='both')
339+
340+
- Round DataFrame to variable number of decimal places (:issue:`10568`).
341+
342+
.. ipython:: python
343+
344+
df = pd.DataFrame(np.random.random([3, 3]), columns=['A', 'B', 'C'],
345+
index=['first', 'second', 'third'])
346+
df
347+
df.round(2)
348+
df.round({'A': 0, 'C': 2})
349+
289350
- ``pd.read_sql`` and ``to_sql`` can accept database URI as ``con`` parameter (:issue:`10214`)
290351
- Enable ``pd.read_hdf`` to be used without specifying a key when the HDF file contains a single dataset (:issue:`10443`)
291352
- Enable writing Excel files in :ref:`memory <_io.excel_writing_buffer>` using StringIO/BytesIO (:issue:`7074`)
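The ``nlargest`` and ``nsmallest`` methods listed among the enhancements above have no accompanying example. A minimal sketch follows; the frame contents and the column names ``city`` and ``population`` are made up for illustration.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the two methods.
df = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "population": [30, 10, 40, 20],
})

# The n rows with the largest / smallest values in the given column.
top2 = df.nlargest(2, "population")
bottom2 = df.nsmallest(2, "population")

print(top2)
print(bottom2)
```

Both methods return the selected rows ordered by the ranking column, so they can stand in for a full sort followed by ``head(n)`` when only a few rows are needed.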
@@ -321,13 +382,15 @@ Other enhancements
321382
Timestamp('2014')
322383
DatetimeIndex(['2012Q2', '2014'])
323384

324-
.. note:: If you want to perform calculations based on today's date, use ``Timestamp.now()`` and ``pandas.tseries.offsets``.
385+
.. note::
325386

326-
.. ipython:: python
387+
If you want to perform calculations based on today's date, use ``Timestamp.now()`` and ``pandas.tseries.offsets``.
327388

328-
import pandas.tseries.offsets as offsets
329-
Timestamp.now()
330-
Timestamp.now() + offsets.DateOffset(years=1)
389+
.. ipython:: python
390+
391+
import pandas.tseries.offsets as offsets
392+
Timestamp.now()
393+
Timestamp.now() + offsets.DateOffset(years=1)
331394

332395
- ``to_datetime`` can now accept ``yearfirst`` keyword (:issue:`7599`)
333396

@@ -411,6 +474,9 @@ Other enhancements
411474

412475
pd.concat([foo, bar, baz], 1)
413476

477+
- Allow passing ``kwargs`` to the interpolation methods (:issue:`10378`).
478+
- Improved error message when concatenating an empty iterable of dataframes (:issue:`9157`)
479+
414480

415481
.. _whatsnew_0170.api:
416482

@@ -516,60 +582,6 @@ To keep the previous behaviour, you can use ``errors='ignore'``:
516582
Furthermore, ``pd.to_timedelta`` has gained a similar API, of ``errors='raise'|'ignore'|'coerce'``, and the ``coerce`` keyword
517583
has been deprecated in favor of ``errors='coerce'``.
518584

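As a rough illustration of the ``errors`` keyword on ``pd.to_timedelta`` described above, the sketch below contrasts ``errors='coerce'`` with ``errors='raise'``. The input strings are invented for the example, and ``errors='ignore'`` is left out because later pandas releases deprecate it.

```python
import pandas as pd

# Invented inputs: one valid duration and one unparseable string.
values = ["1 days", "not a duration"]

# errors='coerce' turns unparseable entries into NaT.
coerced = pd.to_timedelta(values, errors="coerce")

# errors='raise' reports the bad entry instead.
try:
    pd.to_timedelta(values, errors="raise")
    raised = False
except ValueError:
    raised = True

print(coerced)
print(raised)
```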
519-
.. _whatsnew_0170.tz:
520-
521-
Datetime with TZ
522-
~~~~~~~~~~~~~~~~
523-
524-
We are adding an implementation that natively supports datetime with timezones. A ``Series`` or a ``DataFrame`` column previously
525-
*could* be assigned a datetime with timezones, and would work as an ``object`` dtype. This had performance issues with a large
526-
number of rows. See the :ref:`docs <timeseries.timezone_series>` for more details. (:issue:`8260`, :issue:`10763`, :issue:`11034`).
527-
528-
The new implementation allows for having a single-timezone across all rows, with operations in a performant manner.
529-
530-
.. ipython:: python
531-
532-
df = DataFrame({'A' : date_range('20130101',periods=3),
533-
'B' : date_range('20130101',periods=3,tz='US/Eastern'),
534-
'C' : date_range('20130101',periods=3,tz='CET')})
535-
df
536-
df.dtypes
537-
538-
.. ipython:: python
539-
540-
df.B
541-
df.B.dt.tz_localize(None)
542-
543-
This uses a new-dtype representation as well, that is very similar in look-and-feel to its numpy cousin ``datetime64[ns]``
544-
545-
.. ipython:: python
546-
547-
df['B'].dtype
548-
type(df['B'].dtype)
549-
550-
.. note::
551-
552-
There is a slightly different string repr for the underlying ``DatetimeIndex`` as a result of the dtype changes, but
553-
functionally these are the same.
554-
555-
Previous Behavior:
556-
557-
.. code-block:: python
558-
559-
In [1]: pd.date_range('20130101',periods=3,tz='US/Eastern')
560-
Out[1]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
561-
'2013-01-03 00:00:00-05:00'],
562-
dtype='datetime64[ns]', freq='D', tz='US/Eastern')
563-
564-
In [2]: pd.date_range('20130101',periods=3,tz='US/Eastern').dtype
565-
Out[2]: dtype('<M8[ns]')
566-
567-
New Behavior:
568-
569-
.. ipython:: python
570-
571-
pd.date_range('20130101',periods=3,tz='US/Eastern')
572-
pd.date_range('20130101',periods=3,tz='US/Eastern').dtype
573585

574586
.. _whatsnew_0170.api_breaking.convert_objects:
575587

@@ -847,11 +859,10 @@ Other API Changes
847859

848860
- Line and kde plot with ``subplots=True`` now uses default colors, not all black. Specify ``color='k'`` to draw all lines in black (:issue:`9894`)
849861
- Calling the ``.value_counts()`` method on a Series with ``categorical`` dtype now returns a Series with a ``CategoricalIndex`` (:issue:`10704`)
850-
- Allow passing `kwargs` to the interpolation methods (:issue:`10378`).
851862
- The metadata properties of subclasses of pandas objects will now be serialized (:issue:`10553`).
852863
- ``groupby`` using ``Categorical`` follows the same rule as ``Categorical.unique`` described above (:issue:`10508`)
853-
- Improved error message when concatenating an empty iterable of dataframes (:issue:`9157`)
854-
- When constructing ``DataFrame`` with an array of ``complex64`` dtype that meant the corresponding column was automatically promoted to the ``complex128`` dtype. Pandas will now preserve the itemsize of the input for complex data (:issue:`10952`)
864+
- Constructing a ``DataFrame`` with an array of ``complex64`` dtype previously meant the corresponding column
865+
was automatically promoted to the ``complex128`` dtype. Pandas will now preserve the itemsize of the input for complex data (:issue:`10952`)
855866

856867
- ``NaT``'s methods now either raise ``ValueError``, or return ``np.nan`` or ``NaT`` (:issue:`9513`)
857868

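The ``complex64`` change noted above can be observed with a short check. This is a sketch assuming a pandas version that includes the change; the column name ``z`` and the sample values are arbitrary.

```python
import numpy as np
import pandas as pd

# Arbitrary single-precision complex input.
values = np.array([1 + 2j, 3 - 1j], dtype=np.complex64)

df = pd.DataFrame({"z": values})

# The column keeps the input itemsize instead of being promoted.
preserved_dtype = df["z"].dtype

print(preserved_dtype)
```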
@@ -869,8 +880,6 @@ Other API Changes
869880
Deprecations
870881
^^^^^^^^^^^^
871882

872-
.. note:: These indexing functions have been deprecated in the documentation since 0.11.0.
873-
874883
- For ``Series`` the following indexing functions are deprecated (:issue:`10177`).
875884

876885
===================== =================================
@@ -891,6 +900,8 @@ Deprecations
891900
``.icol(j)`` ``.iloc[:, j]``
892901
===================== =================================
893902

903+
.. note:: These indexing functions have been deprecated in the documentation since 0.11.0.
904+
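For the replacement shown in the table above, ``.icol(j)`` becoming ``.iloc[:, j]``, a minimal sketch of the positional form follows. The frame is invented for the example, and only the ``.iloc`` side is executed since the deprecated method is no longer available in current pandas.

```python
import pandas as pd

# Invented frame for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Positional column selection, the replacement for the deprecated df.icol(1).
second_column = df.iloc[:, 1]

# The same indexer accepts a slice for several columns at once.
first_two_columns = df.iloc[:, 0:2]

print(second_column)
```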
894905
- ``Categorical.name`` was deprecated to make ``Categorical`` more ``numpy.ndarray`` like. Use ``Series(cat, name="whatever")`` instead (:issue:`10482`).
895906
- Setting missing values (NaN) in a ``Categorical``'s ``categories`` will issue a warning (:issue:`10748`). You can still have missing values in the ``values``.
896907
- ``drop_duplicates`` and ``duplicated``'s ``take_last`` keyword was deprecated in favor of ``keep``. (:issue:`6511`, :issue:`8505`)
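The ``keep`` keyword that replaces ``take_last`` in ``drop_duplicates`` and ``duplicated`` can be sketched as below. The series values are made up, and the mapping of ``take_last=True`` to ``keep='last'`` is the reading suggested by the parameter names rather than something stated in this excerpt.

```python
import pandas as pd

# Made-up values with repeats of 1 and 2.
s = pd.Series([1, 2, 1, 3, 2])

# keep='first' retains the first occurrence of each value.
first_kept = s.drop_duplicates(keep="first")

# keep='last' retains the last occurrence (assumed equivalent of take_last=True).
last_kept = s.drop_duplicates(keep="last")

# duplicated() flags the rows that drop_duplicates would remove.
flagged = s.duplicated(keep="last")

print(first_kept)
print(last_kept)
```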
@@ -908,7 +919,6 @@ Deprecations
908919
Removal of prior version deprecations/changes
909920
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
910921

911-
- Remove use of some deprecated numpy comparison operations, mainly in tests. (:issue:`10569`)
912922
- Removal of ``na_last`` parameters from ``Series.order()`` and ``Series.sort()``, in favor of ``na_position``, xref (:issue:`5231`)
913923
- Removal of ``percentile_width`` from ``.describe()``, in favor of ``percentiles``. (:issue:`7088`)
914924
- Removal of ``colSpace`` parameter from ``DataFrame.to_string()``, in favor of ``col_space``, circa 0.8.0 version.
@@ -1089,3 +1099,4 @@ Bug Fixes
10891099
- Bug in ``Index`` arithmetic may result in incorrect class (:issue:`10638`)
10901100
- Bug in ``date_range`` results in empty if freq is negative annually, quarterly and monthly (:issue:`11018`)
10911101
- Bug in ``DatetimeIndex`` cannot infer negative freq (:issue:`11018`)
1102+
- Remove use of some deprecated numpy comparison operations, mainly in tests. (:issue:`10569`)
