BUG: Series/DataFrame.rank returns empty object on failure 

This was noticed while working on #40288.

Example:

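```
from pandas import DataFrame

df = DataFrame({"A": 3 * [object]})  # a single object-dtype column
print(df.rank())    # reported output: an empty DataFrame
print(df.A.rank())  # reported output: an empty Series
```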

The last two lines print an empty DataFrame and an empty Series, respectively.

The docstring for `numeric_only` says:

> For DataFrame objects, rank only numeric columns if set to True.

The current behavior with `numeric_only=None` (the default value) is:

    Try with all columns. If a TypeError is raised, try with numeric_only=True.

When `numeric_only` is True and the Series/DataFrame contains no numeric columns, `rank` operates on an empty object and returns an empty result.
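To make the fallback concrete, here is a minimal sketch that emulates it outside of pandas internals. `rank_with_fallback` is a hypothetical helper written for illustration, not the actual pandas implementation, and it assumes that ranking a column of `object` values raises a `TypeError` when `numeric_only=False`, as described above.

```python
import pandas as pd


def rank_with_fallback(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mimicking the numeric_only=None behavior."""
    try:
        # First attempt: rank every column.
        return df.rank(numeric_only=False)
    except TypeError:
        # Fallback: keep only numeric columns, which may leave no columns at all.
        return df.select_dtypes(include="number").rank()


df = pd.DataFrame({"A": 3 * [object]})
result = rank_with_fallback(df)
print(result.shape)
```

Under that assumption, the fallback silently swallows the `TypeError` and hands back a frame with no columns, which is the behavior this issue objects to.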

This causes problems when `rank` is used in transform lists and dictionaries. There we would like partial failure on TypeErrors, but `rank` returns an empty result instead of raising a TypeError. This could be special-cased, but it seems to me that returning an empty object (at least when `numeric_only` is not True) is undesirable in itself.

Some options I see are:

 - Remove `None` as an option and change the default from `numeric_only=None` to `numeric_only=False`.
 - If `numeric_only` is None and the fallback of selecting only numeric columns results in an empty object, raise a TypeError.
 - If selecting only numeric columns results in an empty object (even when `numeric_only=True`), raise a TypeError.
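As a rough illustration of the second option, the sketch below re-raises the original `TypeError` when the numeric-only fallback would leave nothing to rank. `rank_or_raise` is a hypothetical name used only here; how this would actually be wired into `rank` is left open, and the sketch again assumes that ranking `object` values raises a `TypeError`.

```python
import pandas as pd


def rank_or_raise(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper sketching the 'raise on empty fallback' option."""
    try:
        # First attempt: rank every column.
        return df.rank(numeric_only=False)
    except TypeError:
        numeric = df.select_dtypes(include="number")
        if numeric.shape[1] == 0:
            # Nothing left to rank: surface the failure instead of hiding it.
            raise
        # Some numeric columns remain, so rank just those.
        return numeric.rank()


mixed = pd.DataFrame({"A": 3 * [object], "B": [3, 1, 2]})
ranked = rank_or_raise(mixed)
```

With this shape, a frame that mixes numeric and non-numeric columns still falls back to ranking the numeric ones, while a frame with no numeric columns propagates the error, so callers such as transform can treat it as a failure.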


