doc/source/whatsnew/v0.19.0.txt
79 additions & 70 deletions
@@ -1,16 +1,12 @@
1
1
.. _whatsnew_0190:
2
2
3
-
v0.19.0 (August ??, 2016)
4
-
-------------------------
3
+
v0.19.0 (September ??, 2016)
4
+
----------------------------
5
5
6
-
This is a major release from 0.18.1 and includes a small number of API changes, several new features,
6
+
This is a major release from 0.18.1 and includes a number of API changes, several new features,
7
7
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
8
8
users upgrade to this version.
9
9
10
-
.. warning::
11
-
12
-
pandas >= 0.19.0 will no longer silence numpy ufunc warnings upon import, see :ref:`here <whatsnew_0190.errstate>`.
13
-
14
10
Highlights include:
15
11
16
12
- :func:`merge_asof` for asof-style time-series joining, see :ref:`here <whatsnew_0190.enhancements.asof_merge>`
@@ -21,6 +17,10 @@ Highlights include:
21
17
- ``PeriodIndex`` now has its own ``period`` dtype, and changed to be more consistent with other ``Index`` classes. See :ref:`here <whatsnew_0190.api.period>`
22
18
- Sparse data structures have gained enhanced support for ``int`` and ``bool`` dtypes, see :ref:`here <whatsnew_0190.sparse>`
23
19
20
+
.. warning::
21
+
22
+
pandas >= 0.19.0 will no longer silence numpy ufunc warnings upon import, see :ref:`here <whatsnew_0190.errstate>`.
23
+
24
24
.. contents:: What's new in v0.19.0
25
25
:local:
26
26
:backlinks: none
@@ -35,7 +35,7 @@ New features
35
35
pandas development API
36
36
^^^^^^^^^^^^^^^^^^^^^^
37
37
38
-
As part of making pandas APi more uniform and accessible in the future, we have created a standard
38
+
As part of making pandas API more uniform and accessible in the future, we have created a standard
39
39
sub-package of pandas, ``pandas.api``, to hold public APIs. We are starting by exposing type
40
40
introspection functions in ``pandas.api.types``. More sub-packages and officially sanctioned APIs
41
41
will be published in future versions of pandas (:issue:`13147`, :issue:`13634`)
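As a rough illustration of the type introspection functions mentioned above, the sketch below classifies columns by dtype using helpers from ``pandas.api.types``. The frame and its column names are made up for the example.

```python
import pandas as pd
from pandas.api.types import (
    is_datetime64_any_dtype,
    is_float_dtype,
    is_integer_dtype,
)

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({
    "count": [1, 2, 3],
    "price": [1.5, 2.5, 3.5],
    "when": pd.to_datetime(["2016-01-01", "2016-06-01", "2016-09-01"]),
})

# Group column names by the kind of dtype they hold.
int_cols = [c for c in df.columns if is_integer_dtype(df[c])]
float_cols = [c for c in df.columns if is_float_dtype(df[c])]
datetime_cols = [c for c in df.columns if is_datetime64_any_dtype(df[c])]

print(int_cols, float_cols, datetime_cols)
```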
@@ -215,7 +215,7 @@ default of the index) in a DataFrame.
215
215
:ref:`Duplicate column names <io.dupe_names>` are now supported in :func:`read_csv` whether
216
216
they are in the file or passed in as the ``names`` parameter (:issue:`7160`, :issue:`9424`)
217
217
218
-
.. ipython:: python
218
+
.. ipython:: python
219
219
220
220
data = '0,1,2\n3,4,5'
221
221
names = ['a', 'b', 'a']
@@ -230,25 +230,25 @@ Previous Behavior:
230
230
0 2 1 2
231
231
1 5 4 5
232
232
233
-
The first ``a`` column contains the same data as the second ``a`` column, when it should have
233
+
The first ``a`` column contained the same data as the second ``a`` column, when it should have
The :func:`read_csv` function now supports parsing a ``Categorical`` column when
249
249
specified as a dtype (:issue:`10153`). Depending on the structure of the data,
250
250
this can result in a faster parse time and lower memory usage compared to
251
-
converting to ``Categorical`` after parsing. See the io :ref:`docs here <io.categorical>`
251
+
converting to ``Categorical`` after parsing. See the io :ref:`docs here <io.categorical>`.
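A minimal sketch of parsing a column directly as ``Categorical``, assuming a small in-memory CSV whose contents and column names are invented for illustration:

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV data with a repetitive string column.
data = "col1,col2\na,x\nb,y\na,x\n"

# Request the categorical dtype for a single column while parsing.
df = pd.read_csv(StringIO(data), dtype={"col1": "category"})

print(df["col1"].dtype)
print(list(df["col1"].cat.categories))
```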
252
252
253
253
.. ipython:: python
254
254
@@ -407,8 +407,8 @@ After upgrading pandas, you may see *new* ``RuntimeWarnings`` being issued from
407
407
408
408
.. _whatsnew_0190.get_dummies_dtypes:
409
409
410
-
get_dummies dtypes
411
-
^^^^^^^^^^^^^^^^^^
410
+
``get_dummies`` now returns integer dtypes
411
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
412
412
413
413
The ``pd.get_dummies`` function now returns dummy-encoded columns as small integers, rather than floats (:issue:`8725`). This should provide an improved memory footprint.
- The ``.get_credentials()`` method of ``GbqConnector`` can now first try to fetch `the application default credentials <https://developers.google.com/identity/protocols/application-default-credentials>`__. See the :ref:`docs <io.bigquery_authentication>` for more details (:issue:`13577`).
436
+
Downcast values to smallest possible dtype in ``to_numeric``
- The ``.tz_localize()`` method of ``DatetimeIndex`` and ``Timestamp`` has gained the ``errors`` keyword, so you can potentially coerce nonexistent timestamps to ``NaT``. The default behavior remains to raising a ``NonExistentTimeError`` (:issue:`13057`)
442
-
- ``pd.to_numeric()`` now accepts a ``downcast`` parameter, which will downcast the data if possible to smallest specified numerical dtype (:issue:`13352`)
439
+
``pd.to_numeric()`` now accepts a ``downcast`` parameter, which will downcast the data, if possible, to the smallest specified numerical dtype (:issue:`13352`)
443
440
444
441
.. ipython:: python
445
442
446
443
s = ['1', 2, 3]
447
444
pd.to_numeric(s, downcast='unsigned')
448
445
pd.to_numeric(s, downcast='integer')
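The effect of ``downcast`` is easiest to see by inspecting the resulting dtypes. The following sketch reuses the list from the example above and adds one illustrative case (a negative value, which is an assumption of this example) where an unsigned downcast is not possible and the data is left as is:

```python
import pandas as pd

s = ['1', 2, 3]

# All values are non-negative and small, so both downcasts succeed.
unsigned = pd.to_numeric(s, downcast='unsigned')
signed = pd.to_numeric(s, downcast='integer')

# A negative value cannot be stored in an unsigned dtype,
# so no downcast is applied here.
not_downcast = pd.to_numeric([-1, 2, 3], downcast='unsigned')

print(unsigned.dtype, signed.dtype, not_downcast.dtype)
```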
449
446
447
+
448
+
.. _whatsnew_0190.enhancements.other:
449
+
450
+
Other enhancements
451
+
^^^^^^^^^^^^^^^^^^
452
+
453
+
- The ``.get_credentials()`` method of ``GbqConnector`` can now first try to fetch `the application default credentials <https://developers.google.com/identity/protocols/application-default-credentials>`__. See the :ref:`docs <io.bigquery_authentication>` for more details (:issue:`13577`).
454
+
455
+
- The ``.tz_localize()`` method of ``DatetimeIndex`` and ``Timestamp`` has gained the ``errors`` keyword, so you can potentially coerce nonexistent timestamps to ``NaT``. The default behavior is still to raise a ``NonExistentTimeError`` (:issue:`13057`)
456
+
450
457
- ``.to_hdf/read_hdf()`` now accept path objects (e.g. ``pathlib.Path``, ``py.path.local``) for the file path (:issue:`11773`)
451
458
452
459
- ``Timestamp`` can now accept positional and keyword parameters similar to :func:`datetime.datetime` (:issue:`10758`, :issue:`11630`)
@@ -471,13 +478,10 @@ Other enhancements
471
478
df.resample('M', on='date').sum()
472
479
df.resample('M', level='d').sum()
473
480
474
-
- The ``pd.read_csv()`` with ``engine='python'`` has gained support for the ``decimal`` option (:issue:`12933`)
475
-
- The ``pd.read_csv()`` with ``engine='python'`` has gained support for the ``na_filter`` option (:issue:`13321`)
476
-
- The ``pd.read_csv()`` with ``engine='python'`` has gained support for the ``memory_map`` option (:issue:`13381`)
481
+
- The ``pd.read_csv()`` with ``engine='python'`` has gained support for the
482
+
``decimal`` (:issue:`12933`), ``na_filter`` (:issue:`13321`) and ``memory_map`` (:issue:`13381`) options.
477
483
- Consistent with the Python API, ``pd.read_csv()`` will now interpret ``+inf`` as positive infinity (:issue:`13274`)
478
-
479
484
- The ``pd.read_html()`` has gained support for the ``na_values``, ``converters``, ``keep_default_na`` options (:issue:`13461`)
480
-
481
485
- ``Categorical.astype()`` now accepts an optional boolean argument ``copy``, effective when dtype is categorical (:issue:`13209`)
482
486
- ``DataFrame`` has gained the ``.asof()`` method to return the last non-NaN values according to the selected subset (:issue:`13358`)
483
487
- The ``DataFrame`` constructor will now respect key ordering if a list of ``OrderedDict`` objects are passed in (:issue:`13304`)
@@ -504,43 +508,14 @@ Other enhancements
504
508
- :meth:`~DataFrame.to_html` now has a ``border`` argument to control the value in the opening ``<table>`` tag. The default is the value of the ``html.border`` option, which defaults to 1. This also affects the notebook HTML repr, but since Jupyter's CSS includes a border-width attribute, the visual effect is the same. (:issue:`11563`).
505
509
- Raise ``ImportError`` in the sql functions when ``sqlalchemy`` is not installed and a connection string is used (:issue:`11920`).
506
510
- Compatibility with matplotlib 2.0. Older versions of pandas should also work with matplotlib 2.0 (:issue:`13333`)
507
-
508
-
.. _whatsnew_0190.api:
509
-
510
-
511
-
API changes
512
-
~~~~~~~~~~~
513
-
514
-
515
-
- ``Timestamp.to_pydatetime`` will issue a ``UserWarning`` when ``warn=True``, and the instance has a non-zero number of nanoseconds, previously this would print a message to stdout. (:issue:`14101`)
516
-
- Non-convertible dates in an excel date column will be returned without conversion and the column will be ``object`` dtype, rather than raising an exception (:issue:`10001`)
517
-
- ``Series.unique()`` with datetime and timezone now returns return array of ``Timestamp`` with timezone (:issue:`13565`)
518
511
- ``Timestamp``, ``Period``, ``DatetimeIndex``, ``PeriodIndex`` and ``.dt`` accessor have gained a ``.is_leap_year`` property to check whether the date belongs to a leap year. (:issue:`13727`)
519
-
- ``pd.Timedelta(None)`` is now accepted and will return ``NaT``, mirroring ``pd.Timestamp`` (:issue:`13687`)
520
-
- ``Panel.to_sparse()`` will raise a ``NotImplementedError`` exception when called (:issue:`13778`)
521
-
- ``Index.reshape()`` will raise a ``NotImplementedError`` exception when called (:issue:`12882`)
522
-
- ``.filter()`` enforces mutual exclusion of the keyword arguments. (:issue:`12399`)
523
-
- ``eval``'s upcasting rules for ``float32`` types have been updated to be more consistent with NumPy's rules. New behavior will not upcast to ``float64`` if you multiply a pandas ``float32`` object by a scalar float64. (:issue:`12388`)
524
-
- An ``UnsupportedFunctionCall`` error is now raised if NumPy ufuncs like ``np.mean`` are called on groupby or resample objects (:issue:`12811`)
525
-
- ``__setitem__`` will no longer apply a callable rhs as a function instead of storing it. Call ``where`` directly to get the previous behavior. (:issue:`13299`)
526
-
- Calls to ``.sample()`` will respect the random seed set via ``numpy.random.seed(n)`` (:issue:`13161`)
527
-
- ``Styler.apply`` is now more strict about the outputs your function must return. For ``axis=0`` or ``axis=1``, the output shape must be identical. For ``axis=None``, the output must be a DataFrame with identical columns and index labels. (:issue:`13222`)
528
-
- ``Float64Index.astype(int)`` will now raise ``ValueError`` if ``Float64Index`` contains ``NaN`` values (:issue:`13149`)
529
-
- ``TimedeltaIndex.astype(int)`` and ``DatetimeIndex.astype(int)`` will now return ``Int64Index`` instead of ``np.array`` (:issue:`13209`)
530
-
- Passing ``Period`` with multiple frequencies to normal ``Index`` now returns ``Index`` with ``object`` dtype (:issue:`13664`)
531
-
- ``PeridIndex`` can now accept ``list`` and ``array`` which contains ``pd.NaT`` (:issue:`13430`)
532
-
- ``PeriodIndex.fillna`` with ``Period`` has different freq now coerces to ``object`` dtype (:issue:`13664`)
533
-
- Faceted boxplots from ``DataFrame.boxplot(by=col)`` now return a ``Series`` when ``return_type`` is not None. Previously these returned an ``OrderedDict``. Note that when ``return_type=None``, the default, these still return a 2-D NumPy array. (:issue:`12216`, :issue:`7096`)
534
512
- ``astype()`` will now accept a dict of column name to data types mapping as the ``dtype`` argument. (:issue:`12086`)
535
513
- ``pd.read_json`` and ``DataFrame.to_json`` have gained support for reading and writing json lines with the ``lines`` option, see :ref:`Line delimited json <io.jsonl>` (:issue:`9180`)
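A few of the enhancements listed above can be sketched together. The frame and values below are invented for illustration; the example shows ``astype()`` with a dict of column names to dtypes, ``Timestamp`` built from keyword arguments, and the ``.is_leap_year`` property:

```python
import pandas as pd

# Hypothetical frame with one string column and one integer column.
df = pd.DataFrame({"a": ["1", "2"], "b": [1, 2]})

# Convert each column to its own target dtype in a single call.
converted = df.astype({"a": "int64", "b": "float64"})

# Timestamp accepts keyword parameters similar to datetime.datetime.
ts = pd.Timestamp(year=2016, month=9, day=1)
leap = ts.is_leap_year

# The same property is available element-wise on a DatetimeIndex.
idx_leap = pd.DatetimeIndex(["2015-01-01", "2016-01-01"]).is_leap_year

print(converted.dtypes)
print(leap, list(idx_leap))
```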
536
-
- ``pd.read_hdf`` will now raise a ``ValueError`` instead of ``KeyError``, if a mode other than ``r``, ``r+`` and ``a`` is supplied. (:issue:`13623`)
537
-
- ``pd.read_csv()``, ``pd.read_table()``, and ``pd.read_hdf()`` raise the builtin ``FileNotFoundError`` exception for Python 3.x when called on a nonexistent file; this is back-ported as ``IOError`` in Python 2.x (:issue:`14086`)
538
-
- More informative exceptions are passed through the csv parser. The exception type would now be the original exception type instead of ``CParserError``. (:issue:`13652`)
539
-
- ``pd.read_csv()`` in the C engine will now issue a ``ParserWarning`` or raise a ``ValueError`` when ``sep`` encoded is more than one character long (:issue:`14065`)
540
-
- ``DataFrame.values`` will now return ``float64`` with a ``DataFrame`` of mixed ``int64`` and ``uint64`` dtypes, conforming to ``np.find_common_type`` (:issue:`10364`, :issue:`13917`)
541
514
515
+
.. _whatsnew_0190.api:
542
516
543
-
.. _whatsnew_0190.api.tolist:
517
+
API changes
518
+
~~~~~~~~~~~
544
519
545
520
``Series.tolist()`` will now return Python types
546
521
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -551,7 +526,6 @@ API changes
551
526
.. ipython:: python
552
527
553
528
s = pd.Series([1,2,3])
554
-
type(s.tolist()[0])
555
529
556
530
Previous Behavior:
557
531
@@ -572,11 +546,11 @@ New Behavior:
572
546
``Series`` operators for different indexes
573
547
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
574
548
575
-
Following ``Series`` operators has been changed to make all operators consistent,
549
+
The following ``Series`` operators have been changed to make all operators consistent,
576
550
including ``DataFrame`` (:issue:`1134`, :issue:`4581`, :issue:`13538`)
577
551
578
552
- ``Series`` comparison operators now raise ``ValueError`` when ``index`` are different.
579
-
- ``Series`` logical operators align both ``index``.
553
+
- ``Series`` logical operators align the ``index`` of both the left and right hand sides.
580
554
581
555
.. warning::
582
556
Until 0.18.1, comparing ``Series`` with the same length, would succeed even if
@@ -607,7 +581,7 @@ Comparison operators raise ``ValueError`` when ``.index`` are different.
607
581
608
582
Previous Behavior (``Series``):
609
583
610
-
``Series`` compares values ignoring ``.index`` as long as both lengthes are the same.
584
+
``Series`` compared values ignoring the ``.index`` as long as both had the same length:
611
585
612
586
.. code-block:: ipython
613
587
@@ -627,13 +601,18 @@ New Behavior (``Series``):
627
601
ValueError: Can only compare identically-labeled Series objects
628
602
629
603
.. note::
604
+
630
605
To achieve the same result as previous versions (compare values based on locations ignoring ``.index``), compare both ``.values``.
631
606
632
607
.. ipython:: python
633
608
634
609
s1.values == s2.values
635
610
636
-
If you want to compare ``Series`` aligning its ``.index``, see flexible comparison methods section below.
611
+
If you want to compare ``Series`` while aligning their ``.index``, see the flexible comparison methods section below:
612
+
613
+
.. ipython:: python
614
+
615
+
s1.eq(s2)
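The three options discussed above can be compared side by side. This is a small self-contained sketch; the two ``Series`` are defined here for illustration and are not necessarily the ones used elsewhere in this document:

```python
import pandas as pd

# Same length, but the last index label differs.
s1 = pd.Series([1, 2, 3], index=list("ABC"))
s2 = pd.Series([2, 2, 2], index=list("ABD"))

# The comparison operator refuses to compare differently-labeled Series.
try:
    s1 == s2
    raised = False
except ValueError:
    raised = True

# Position-based comparison, ignoring the index entirely.
positional = s1.values == s2.values

# Label-aligned comparison; labels missing on either side compare as False.
aligned = s1.eq(s2)

print(raised)
print(positional)
print(aligned)
```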
637
616
638
617
Current Behavior (``DataFrame``, no change):
639
618
@@ -646,9 +625,9 @@ Current Behavior (``DataFrame``, no change):
646
625
Logical operators
647
626
"""""""""""""""""
648
627
649
-
Logical operators align both ``.index``.
628
+
Logical operators align the ``.index`` of both the left and right hand sides.
650
629
651
-
Previous behavior (``Series``), only left hand side ``index`` is kept:
630
+
Previous behavior (``Series``), only left hand side ``index`` was kept:
652
631
653
632
.. code-block:: ipython
654
633
@@ -673,11 +652,11 @@ New Behavior (``Series``):
673
652
``Series`` logical operators fill a ``NaN`` result with ``False``.
674
653
675
654
.. note::
676
-
To achieve the same result as previous versions (compare values based on locations ignoring ``.index``), compare both ``.values``.
655
+
To achieve the same result as previous versions (compare values based only on the left hand side index), you can use ``reindex_like``:
677
656
678
657
.. ipython:: python
679
658
680
-
s1.values & s2.values
659
+
s1 & s2.reindex_like(s1)
681
660
682
661
Current Behavior (``DataFrame``, no change):
683
662
@@ -1319,6 +1298,35 @@ New Behavior:
1319
1298
In [2]: i.get_indexer(['b', 'b', 'c']).dtype
1320
1299
Out[2]: dtype('int64')
1321
1300
1301
+
1302
+
.. _whatsnew_0190.api.other:
1303
+
1304
+
Other API Changes
1305
+
^^^^^^^^^^^^^^^^^
1306
+
1307
+
- ``Timestamp.to_pydatetime`` will issue a ``UserWarning`` when ``warn=True`` and the instance has a non-zero number of nanoseconds; previously this would print a message to stdout. (:issue:`14101`)
1308
+
- Non-convertible dates in an excel date column will be returned without conversion and the column will be ``object`` dtype, rather than raising an exception (:issue:`10001`)
1309
+
- ``Series.unique()`` with datetime and timezone now returns an array of ``Timestamp`` with timezone (:issue:`13565`)
1310
+
- ``pd.Timedelta(None)`` is now accepted and will return ``NaT``, mirroring ``pd.Timestamp`` (:issue:`13687`)
1311
+
- ``Panel.to_sparse()`` will raise a ``NotImplementedError`` exception when called (:issue:`13778`)
1312
+
- ``Index.reshape()`` will raise a ``NotImplementedError`` exception when called (:issue:`12882`)
1313
+
- ``.filter()`` enforces mutual exclusion of the keyword arguments. (:issue:`12399`)
1314
+
- ``eval``'s upcasting rules for ``float32`` types have been updated to be more consistent with NumPy's rules. New behavior will not upcast to ``float64`` if you multiply a pandas ``float32`` object by a scalar float64. (:issue:`12388`)
1315
+
- An ``UnsupportedFunctionCall`` error is now raised if NumPy ufuncs like ``np.mean`` are called on groupby or resample objects (:issue:`12811`)
1316
+
- ``__setitem__`` will no longer apply a callable rhs as a function instead of storing it. Call ``where`` directly to get the previous behavior. (:issue:`13299`)
1317
+
- Calls to ``.sample()`` will respect the random seed set via ``numpy.random.seed(n)`` (:issue:`13161`)
1318
+
- ``Styler.apply`` is now more strict about the outputs your function must return. For ``axis=0`` or ``axis=1``, the output shape must be identical. For ``axis=None``, the output must be a DataFrame with identical columns and index labels. (:issue:`13222`)
1319
+
- ``Float64Index.astype(int)`` will now raise ``ValueError`` if ``Float64Index`` contains ``NaN`` values (:issue:`13149`)
1320
+
- ``TimedeltaIndex.astype(int)`` and ``DatetimeIndex.astype(int)`` will now return ``Int64Index`` instead of ``np.array`` (:issue:`13209`)
1321
+
- Passing ``Period`` with multiple frequencies to normal ``Index`` now returns ``Index`` with ``object`` dtype (:issue:`13664`)
1322
+
- ``PeriodIndex.fillna`` with a ``Period`` that has a different freq now coerces to ``object`` dtype (:issue:`13664`)
1323
+
- Faceted boxplots from ``DataFrame.boxplot(by=col)`` now return a ``Series`` when ``return_type`` is not None. Previously these returned an ``OrderedDict``. Note that when ``return_type=None``, the default, these still return a 2-D NumPy array. (:issue:`12216`, :issue:`7096`)
1324
+
- ``pd.read_hdf`` will now raise a ``ValueError`` instead of ``KeyError``, if a mode other than ``r``, ``r+`` and ``a`` is supplied. (:issue:`13623`)
1325
+
- ``pd.read_csv()``, ``pd.read_table()``, and ``pd.read_hdf()`` raise the builtin ``FileNotFoundError`` exception for Python 3.x when called on a nonexistent file; this is back-ported as ``IOError`` in Python 2.x (:issue:`14086`)
1326
+
- More informative exceptions are passed through the csv parser. The exception type would now be the original exception type instead of ``CParserError``. (:issue:`13652`)
1327
+
- ``pd.read_csv()`` in the C engine will now issue a ``ParserWarning`` or raise a ``ValueError`` when ``sep`` encoded is more than one character long (:issue:`14065`)
1328
+
- ``DataFrame.values`` will now return ``float64`` with a ``DataFrame`` of mixed ``int64`` and ``uint64`` dtypes, conforming to ``np.find_common_type`` (:issue:`10364`, :issue:`13917`)
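Some of the changes listed above are easy to check interactively. The sketch below covers ``pd.Timedelta(None)``, the exception raised for a missing file, and ``.sample()`` honoring ``numpy.random.seed``; the file name and frame are placeholders chosen for this example:

```python
import numpy as np
import pandas as pd

# pd.Timedelta(None) mirrors pd.Timestamp(None) and yields NaT.
td = pd.Timedelta(None)

# Reading a nonexistent file raises the builtin FileNotFoundError on Python 3.
try:
    pd.read_csv("this_file_does_not_exist.csv")  # placeholder path
    missing_error = None
except FileNotFoundError as exc:
    missing_error = exc

# .sample() draws from NumPy's global random state when no
# random_state is given, so reseeding reproduces the same sample.
df = pd.DataFrame({"x": range(10)})
np.random.seed(42)
first = df.sample(3)
np.random.seed(42)
second = df.sample(3)

print(td)
print(type(missing_error).__name__)
print(first.equals(second))
```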
1329
+
1322
1330
.. _whatsnew_0190.deprecations:
1323
1331
1324
1332
Deprecations
@@ -1568,3 +1576,4 @@ Bug Fixes
1568
1576
1569
1577
- Bug in ``eval()`` where the ``resolvers`` argument would not accept a list (:issue:`14095`)
1570
1578
- Bugs in ``stack``, ``get_dummies``, ``make_axis_dummies`` which don't preserve categorical dtypes in (multi)indexes (:issue:`13854`)
1579
+
- ``PeriodIndex`` can now accept ``list`` and ``array`` which contain ``pd.NaT`` (:issue:`13430`)