Commit 8c2e931

synch whatsnew/v0.22.0.txt from upstream master
1 parent 25373d3 commit 8c2e931

1 file changed

+65
-99
lines changed

doc/source/whatsnew/v0.22.0.txt

Lines changed: 65 additions & 99 deletions
@@ -17,122 +17,76 @@ New features
1717
-
1818
-
1919

20-
.. _whatsnew_0220.enhancements.other:
21-
22-
Other Enhancements
23-
^^^^^^^^^^^^^^^^^^
24-
25-
correctly calculate ranks for infinit values
26-
""""""""""""""""""""""""""""""""""""""""""""
27-
28-
In previous versions, ``inf`` elements were assigned ``nan`` as their ranks. Now ranks are calculated properly.
29-
30-
Previous Behavior:
31-
32-
.. code-block:: ipython
33-
34-
In [17]: pd.Series([-np.inf, 0, 1, np.inf]).rank()
35-
Out[17]:
36-
0 1.0
37-
1 2.0
38-
2 3.0
39-
3 NaN
40-
41-
Current Behavior
4220

43-
.. code-block:: ipython
21+
.. _whatsnew_0210.enhancements.get_dummies_dtype:
4422

45-
In [5]: pd.Series([-np.inf, 0, 1, np.inf]).rank()
46-
Out[5]:
47-
0 1.0
48-
1 2.0
49-
2 3.0
50-
3 4.0
51-
dtype: float64
23+
``get_dummies`` now supports ``dtype`` argument
24+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5225

53-
Furthermore, previously if you rank ``inf`` or ``-inf`` values together with ``nan`` values, results were wrong when using 'top' or 'bottom' argument.
26+
:func:`get_dummies` now accepts a ``dtype`` argument, which specifies the dtype of the new columns. The default remains ``uint8``. (:issue:`18330`)
5427

55-
Previously Behavior:
28+
.. ipython:: python
5629

57-
.. code-block:: ipython
30+
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
31+
pd.get_dummies(df, columns=['c']).dtypes
32+
pd.get_dummies(df, columns=['c'], dtype=bool).dtypes
5833

59-
In [15]: pd.Series([np.nan, np.nan, -np.inf, -np.inf]).rank(na_option='top')
60-
Out[15]:
61-
0 2.5
62-
1 2.5
63-
2 2.5
64-
3 2.5
65-
dtype: float64
6634

67-
Current Behavior
68-
69-
.. code-block:: ipython
70-
71-
In [4]: pd.Series([np.nan, np.nan, -np.inf, -np.inf]).rank(na_option='top')
72-
Out[4]:
73-
0 1.5
74-
1 1.5
75-
2 3.5
76-
3 3.5
77-
dtype: float64
78-
79-
Moreover, previously if you rank an array of `object` dtype, ``None`` values will have different ranks.
35+
.. _whatsnew_0220.enhancements.other:
8036

81-
Previously Behavior:
37+
Other Enhancements
38+
^^^^^^^^^^^^^^^^^^
8239

83-
.. code-block:: ipython
40+
- Better support for :func:`DataFrame.style.to_excel` output with the ``xlsxwriter`` engine. (:issue:`16149`)
41+
- :func:`pandas.tseries.frequencies.to_offset` now accepts leading '+' signs e.g. '+1h'. (:issue:`18171`)
42+
- :func:`MultiIndex.unique` now supports the ``level=`` argument, to get unique values from a specific index level (:issue:`17896`)
43+
- :class:`pandas.io.formats.style.Styler` now has method ``hide_index()`` to determine whether the index will be rendered in output (:issue:`14194`)
44+
- :class:`pandas.io.formats.style.Styler` now has method ``hide_columns()`` to determine whether columns will be hidden in output (:issue:`14194`)
45+
- Improved wording of ``ValueError`` raised in :func:`to_datetime` when ``unit=`` is passed with a non-convertible value (:issue:`14350`)
46+
- :func:`Series.fillna` now accepts a Series or a dict as a ``value`` for a categorical dtype (:issue:`17033`)
47+
- :func:`pandas.read_clipboard` updated to use qtpy, falling back to PyQt5 and then PyQt4, adding compatibility with Python 3 and multiple python-qt bindings (:issue:`17722`)
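Two of the enhancements above can be sketched briefly. This is a minimal illustration, assuming a pandas version that includes these changes; the index contents and variable names are made up for the example.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# A two-level MultiIndex with a repeated value in the first level.
mi = pd.MultiIndex.from_arrays(
    [["x", "x", "y"], [1, 2, 1]], names=["outer", "inner"]
)

# level= restricts unique() to a single level of the index.
outer_values = mi.unique(level="outer")
print(list(outer_values))

# A leading '+' is accepted and parses to the same offset as the unsigned form.
plus_offset = to_offset("+1h")
print(plus_offset == to_offset("1h"))
```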
8448

85-
In [3]: pd.Series([None, None, None, 'A','B']).rank(na_option='top')
86-
Out[3]:
87-
0 3.0
88-
1 2.0
89-
2 1.0
90-
3 4.0
91-
4 5.0
92-
dtype: float64
49+
.. _whatsnew_0220.api_breaking:
9350

94-
Current Behavior
51+
Backwards incompatible API changes
52+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
9553

96-
.. code-block:: ipython
54+
- :func:`Series.fillna` now raises a ``TypeError`` instead of a ``ValueError`` when passed a list, tuple or DataFrame as a ``value`` (:issue:`18293`)
55+
- :func:`pandas.DataFrame.merge` no longer casts a ``float`` column to ``object`` when merging on ``int`` and ``float`` columns (:issue:`16572`)
56+
- The default NA value for :class:`UInt64Index` has changed from 0 to ``NaN``, which impacts methods that mask with NA, such as ``UInt64Index.where()`` (:issue:`18398`)
57+
-
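The ``fillna`` change listed above can be checked with a short sketch. It assumes a pandas version that includes the change; the sample values are illustrative, and the ``except`` clause catches both exception types so the snippet reports whichever one the installed version raises.

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])

# A scalar (or a dict / Series) is a valid fill value.
filled = s.fillna(0.0)
print(filled.tolist())

# A list is not a valid fill value. Record which exception type is raised.
try:
    s.fillna([0.0])
    error_name = None
except (TypeError, ValueError) as exc:
    error_name = type(exc).__name__

print(error_name)
```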
9758

98-
In [3]: pd.Series([None, None, None, 'A','B']).rank(na_option='top')
99-
Out[3]:
100-
0 2.0
101-
1 2.0
102-
2 2.0
103-
3 4.0
104-
4 5.0
10559

106-
- Better support for ``Dataframe.style.to_excel()`` output with the ``xlsxwriter`` engine. (:issue:`16149`)
107-
-
108-
-
10960

110-
.. _whatsnew_0220.api_breaking:
11161

112-
Backwards incompatible API changes
113-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
11462

115-
-
116-
-
117-
-
11863

11964
.. _whatsnew_0220.api:
12065

12166
Other API Changes
12267
^^^^^^^^^^^^^^^^^
12368

69+
- :func:`Series.astype` and :func:`Index.astype` with an incompatible dtype will now raise a ``TypeError`` rather than a ``ValueError`` (:issue:`18231`)
70+
- ``Series`` construction with an ``object`` dtyped tz-aware datetime and ``dtype=object`` specified will now return an ``object`` dtyped ``Series``; previously this would infer the datetime dtype (:issue:`18231`)
12471
- ``NaT`` division with :class:`datetime.timedelta` will now return ``NaN`` instead of raising (:issue:`17876`)
125-
- All-NaN levels in ``MultiIndex`` are now assigned float rather than object dtype, coherently with flat indexes (:issue:`17929`).
126-
- :class:`Timestamp` will no longer silently ignore unused or invalid `tz` or `tzinfo` keyword arguments (:issue:`17690`)
127-
- :class:`Timestamp` will no longer silently ignore invalid `freq` arguments (:issue:`5168`)
128-
- :class:`CacheableOffset` and :class:`WeekDay` are no longer available in the `tseries.offsets` module (:issue:`17830`)
72+
- All-NaN levels in a ``MultiIndex`` are now assigned ``float`` rather than ``object`` dtype, promoting consistency with ``Index`` (:issue:`17929`).
73+
- :class:`Timestamp` will no longer silently ignore unused or invalid ``tz`` or ``tzinfo`` keyword arguments (:issue:`17690`)
74+
- :class:`Timestamp` will no longer silently ignore invalid ``freq`` arguments (:issue:`5168`)
75+
- :class:`CacheableOffset` and :class:`WeekDay` are no longer available in the ``pandas.tseries.offsets`` module (:issue:`17830`)
76+
- ``tseries.frequencies.get_freq_group()`` and ``tseries.frequencies.DAYS`` are removed from the public API (:issue:`18034`)
77+
- :func:`Series.truncate` and :func:`DataFrame.truncate` will raise a ``ValueError`` if the index is not sorted instead of an unhelpful ``KeyError`` (:issue:`17935`)
78+
- :func:`Index.map` can now accept ``Series`` and dictionary input objects (:issue:`12756`).
79+
- :func:`DataFrame.unstack` will now default to filling with ``np.nan`` for ``object`` columns. (:issue:`12815`)
80+
- :class:`IntervalIndex` constructor will raise if the ``closed`` parameter conflicts with how the input data is inferred to be closed (:issue:`18421`)
81+
- Inserting missing values into indexes will work for all types of indexes and automatically insert the correct type of missing value (``NaN``, ``NaT``, etc.) regardless of the type passed in (:issue:`18295`)
82+
- Restricted ``DateOffset`` keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`, :issue:`18226`).
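A few of the API changes above are easy to observe directly. The following is a small sketch, assuming a pandas version that includes them; the data and variable names are illustrative only.

```python
import datetime

import pandas as pd

# truncate() on an unsorted index: record which exception type is raised.
unsorted = pd.Series([10, 20, 30], index=[3, 1, 2])
try:
    unsorted.truncate(before=2)
    truncate_error = None
except (ValueError, KeyError) as exc:
    truncate_error = type(exc).__name__

# Sorting the index first makes truncate() well defined.
sorted_result = unsorted.sort_index().truncate(before=2)

# Index.map accepts a dictionary (a Series works the same way).
idx = pd.Index(["a", "b", "c"])
mapped = idx.map({"a": 1, "b": 2, "c": 3})

# NaT divided by a datetime.timedelta yields NaN instead of raising.
nat_ratio = pd.NaT / datetime.timedelta(days=1)

print(truncate_error, sorted_result.tolist(), list(mapped), nat_ratio)
```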
12983

13084
.. _whatsnew_0220.deprecations:
13185

13286
Deprecations
13387
~~~~~~~~~~~~
13488

135-
-
89+
- ``Series.from_array`` and ``SparseSeries.from_array`` are deprecated. Use the normal constructor ``Series(..)`` and ``SparseSeries(..)`` instead (:issue:`18213`).
13690
-
13791
-
13892

@@ -141,17 +95,25 @@ Deprecations
14195
Removal of prior version deprecations/changes
14296
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
14397

144-
-
145-
-
146-
-
98+
- Warnings against the obsolete usage ``Categorical(codes, categories)``, which were emitted for instance when the first two arguments to ``Categorical()`` had different dtypes, and recommended the use of ``Categorical.from_codes``, have now been removed (:issue:`8074`)
99+
- The ``levels`` and ``labels`` attributes of a ``MultiIndex`` can no longer be set directly (:issue:`4039`).
100+
- ``pd.tseries.util.pivot_annual`` has been removed (deprecated since v0.19). Use ``pivot_table`` instead (:issue:`18370`)
101+
- ``pd.tseries.util.isleapyear`` has been removed (deprecated since v0.19). Use the ``.is_leap_year`` property of datetime-likes instead (:issue:`18370`)
102+
- ``pd.ordered_merge`` has been removed (deprecated since v0.19). Use ``pd.merge_ordered`` instead (:issue:`18459`)
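For code that still relies on the removed helpers, the replacements named above can be used roughly as follows. This is a sketch with made-up frames and dates, not a migration guide.

```python
import pandas as pd

# Replacement for pd.tseries.util.isleapyear: the is_leap_year property.
years = pd.DatetimeIndex(["2015-06-01", "2016-06-01"])
leap_flags = years.is_leap_year
print(list(leap_flags))

# Replacement for pd.ordered_merge: pd.merge_ordered (an ordered outer join by default).
left = pd.DataFrame({"key": ["a", "c", "e"], "lvalue": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rvalue": [10, 20, 30]})
merged = pd.merge_ordered(left, right, on="key")
print(merged["key"].tolist())
```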
147103

148104
.. _whatsnew_0220.performance:
149105

150106
Performance Improvements
151107
~~~~~~~~~~~~~~~~~~~~~~~~
152108

153-
- Indexers on Series or DataFrame no longer create a reference cycle (:issue:`17956`)
154-
-
109+
- Indexers on ``Series`` or ``DataFrame`` no longer create a reference cycle (:issue:`17956`)
110+
- Added a keyword argument, ``cache``, to :func:`to_datetime` that improves the performance of converting duplicate datetime arguments (:issue:`11665`)
111+
- :class:`DateOffset` arithmetic performance is improved (:issue:`18218`)
112+
- Converting a ``Series`` of ``Timedelta`` objects to days, seconds, etc. is sped up through vectorization of underlying methods (:issue:`18092`)
113+
- Improved performance of ``.map()`` with a ``Series/dict`` input (:issue:`15081`)
114+
- The overridden ``Timedelta`` properties of days, seconds and microseconds have been removed, leveraging their built-in Python versions instead (:issue:`18242`)
115+
- ``Series`` construction will reduce the number of copies made of the input data in certain cases (:issue:`17449`)
116+
- Improved performance of :func:`Series.dt.date` and :func:`DatetimeIndex.date` (:issue:`18058`)
155117
-
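The ``cache`` keyword of :func:`to_datetime` mentioned above matters most when the input contains many repeated strings. A minimal sketch, assuming a pandas version that has the keyword; the input list is illustrative and no timing figures are claimed here.

```python
import pandas as pd

# Many duplicates of only two distinct date strings.
raw = ["2017-11-01", "2017-11-02"] * 1000

# With cache=True, each distinct string only needs to be parsed once.
with_cache = pd.to_datetime(raw, cache=True)

# cache=False parses every element independently; the result is identical.
without_cache = pd.to_datetime(raw, cache=False)

print(with_cache.equals(without_cache))
```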
156118

157119
.. _whatsnew_0220.docs:
@@ -168,27 +130,31 @@ Documentation Changes
168130
Bug Fixes
169131
~~~~~~~~~
170132

171-
- Bug in ``pd.read_msgpack()`` with a non existent file is passed in Python 2 (:issue:`15296`)
172-
- Bug in ``DataFrame.groupby`` where key as tuple in a ``MultiIndex`` were interpreted as a list of keys (:issue:`17979`)
173133

174134
Conversion
175135
^^^^^^^^^^
176136

177-
-
137+
- Bug in :class:`Index` constructor with ``dtype='uint64'`` where int-like floats were not coerced to :class:`UInt64Index` (:issue:`18400`)
178138
-
179139
-
180140

181141
Indexing
182142
^^^^^^^^
183143

184-
- Bug in :func:`PeriodIndex.truncate` which raises ``TypeError`` when ``PeriodIndex`` is monotonic (:issue:`17717`)
185-
-
144+
- Bug in :func:`Series.truncate` which raises ``TypeError`` with a monotonic ``PeriodIndex`` (:issue:`17717`)
145+
- Bug in :func:`DataFrame.groupby` where tuples were interpreted as lists of keys rather than as keys (:issue:`17979`, :issue:`18249`)
146+
- Bug in :func:`MultiIndex.remove_unused_levels` which would fill nan values (:issue:`18417`)
147+
- Bug in :func:`MultiIndex.from_tuples` which would fail to take zipped tuples in Python 3 (:issue:`18434`)
148+
- Bug in :class:`IntervalIndex` where empty and purely NA data was constructed inconsistently depending on the construction method (:issue:`18421`)
186149
-
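The ``from_tuples`` fix above concerns lazy iterators such as the object returned by ``zip`` in Python 3. A short sketch of the now-supported call, with illustrative data and level names:

```python
import pandas as pd

letters = ["a", "b", "c"]
numbers = [1, 2, 3]

# zip() returns a lazy iterator in Python 3; from_tuples consumes it directly.
mi = pd.MultiIndex.from_tuples(zip(letters, numbers), names=["letter", "number"])

print(mi.tolist())
```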
187150

188151
I/O
189152
^^^
190153

191154
- :func:`read_html` now rewinds seekable IO objects after parse failure, before attempting to parse with a new parser. If a parser errors and the object is non-seekable, an informative error is raised suggesting the use of a different parser (:issue:`17975`)
155+
- Bug in :func:`read_msgpack` when a non-existent file is passed in Python 2 (:issue:`15296`)
156+
- Bug in :func:`read_csv` where a ``MultiIndex`` with duplicate columns was not being mangled appropriately (:issue:`18062`)
157+
- Bug in :func:`read_sas` where a file with 0 variables gave an ``AttributeError`` incorrectly. Now it gives an ``EmptyDataError`` (:issue:`18184`)
192158
-
193159
-
194160

@@ -223,7 +189,7 @@ Reshaping
223189
Numeric
224190
^^^^^^^
225191

226-
- Bug in ``pd.Series.rank()`` and ``pd.DataFrame.rank() could not properly rank infinit values. Infinit values were assigned NaNs as ranks. If NaNs were present together with infinit values, the ranks were calculated wrong (:issue:`6945`)
192+
-
227193
-
228194
-
229195

@@ -237,6 +203,6 @@ Categorical
237203
Other
238204
^^^^^
239205

240-
-
241-
-
206+
- Improved error message when attempting to use a Python keyword as an identifier in a numexpr query (:issue:`18221`)
207+
- Fixed a bug where creating a ``Series`` from an array that contains both tz-naive and tz-aware values resulted in a ``Series`` whose dtype is tz-aware instead of object (:issue:`16406`)
242208
-
