Dataframe namespaces

In #10 it was discussed that it would be convenient for the dataframe API to allow method chaining. For example:
```python
import pandas

(pandas.read_csv('countries.csv')
       .rename(columns={'name': 'country'})
       .assign(area_km2=lambda df: df['area_m2'].astype(float) / 1_000_000)
       .query('(continent.str.lower() != "antarctica") | (population < area_km2)'))
```
This implies that most functionality is implemented as methods of the dataframe class. Judging by pandas, that can mean 300 or more methods, so implementing everything in a single namespace may be problematic. pandas uses a mixed approach, combining several techniques to organize its API.

# Approaches

## Top-level methods
```python
df.sum()
df.astype()
```
Many of the methods are simply implemented directly as methods of the dataframe.

## Prefixed methods
```python
df.to_csv()
df.to_parquet()
```
Some of the methods are grouped with a common prefix.

## Accessors
```python
df.str.lower()
df.dt.day_name()
```
An accessor is a property of the dataframe (or of the series; a single dataframe class is assumed here for simplicity) that groups related methods under it.
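To make the mechanism concrete, here is a minimal sketch of how an accessor can be built from a property that returns a namespace object bound to the dataframe. `Frame` and `StringMethods` are hypothetical toy classes written for illustration, not how pandas implements its accessors.

```python
class StringMethods:
    """Hypothetical namespace object that holds the string-related methods."""

    def __init__(self, frame):
        self._frame = frame

    def lower(self):
        # Return a new frame with every string value lower-cased.
        return Frame({name: [value.lower() for value in values]
                      for name, values in self._frame.columns.items()})


class Frame:
    """Toy stand-in for a dataframe: a dict of column name -> list of values."""

    def __init__(self, columns):
        self.columns = columns

    @property
    def str(self):
        # The accessor is just a property returning the namespace object,
        # so the string methods do not live on Frame itself.
        return StringMethods(self)


df = Frame({"country": ["Spain", "JAPAN"]})
lowered = df.str.lower()
print(lowered.columns)  # {'country': ['spain', 'japan']}
```

With this layout, `Frame` only gains one attribute (`str`) no matter how many string methods are added to the namespace class.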

## Functions
```python
pandas.wide_to_long(df)
pandas.melt(df)
```
In some cases, functions are used instead of methods.

## Functional API
```python
df.apply(func)
df.applymap(func)
```
pandas also provides a more functional API, where functions can be passed as parameters to dataframe methods.

# Standard API

I guess we will agree that a uniform and consistent API would be better for the standard. It should make things easier to implement and give users a more intuitive experience.

Also, I think it would be good if the API could be extended easily. A couple of examples of how pandas can be extended with custom functions:
```python
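import pandas as pd

@pd.api.extensions.register_dataframe_accessor('my_accessor')
class MyAccessor:
    def __init__(self, pandas_obj):
        # pandas passes the dataframe in when the accessor is first accessed
        self._obj = pandas_obj

    def my_custom_method(self):
        return True

df.my_accessor.my_custom_method()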
```
```python
df.apply(my_custom_function)
df.apply(numpy.sum)
```

Conceptually, I think some methods should be grouped by the API they follow rather than by topic. The clearest example is reductions, which were discussed in https://github.com/pydata-apis/dataframe-api/issues/11#issuecomment-644115670.
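As a rough illustration of what "the API they follow" could mean for reductions: each one takes the values of a column and returns a single scalar, so one generic driver can apply any of them. The names below (`Reduction`, `reduce_columns`, `value_range`) are hypothetical, and the dict of lists is only a stand-in for a dataframe.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

# Assumed shared signature for reductions: a column's values in, one scalar out.
Reduction = Callable[[Sequence[float]], float]


def reduce_columns(columns: Dict[str, List[float]], func: Reduction) -> Dict[str, float]:
    """Apply the same reduction to every column."""
    return {name: func(values) for name, values in columns.items()}


def value_range(values: Sequence[float]) -> float:
    """A custom reduction; it fits because it follows the same signature."""
    return max(values) - min(values)


data = {"population": [10.0, 30.0], "area_km2": [2.0, 6.0]}

totals = reduce_columns(data, sum)
averages = reduce_columns(data, mean)
ranges = reduce_columns(data, value_range)
```

Because the grouping is by signature, adding `value_range` required no change to `reduce_columns`.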

I think no solution will be perfect, and the options that we have are (feel free to add to the list if I'm missing any option worth considering):

## Top-level methods
```python
df.sum()
```

## Prefixed methods
```python
df.reduce_sum()
```

## Accessors
```python
df.reduce.sum()
```

## Functions
```python
mod.reductions.sum(df)
```
_mod represents the implementation module (e.g. `pandas`)_

## Functional API
```python
df.reduce(mod.reductions.sum)
```

Personally, my preference is the functional API. I think it is the simplest option that keeps things organized, and the simplest to extend. The main drawback is readability: it may be too verbose. One option is to allow a string instead of the function for known functions (e.g. `df.reduce('sum')`).
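A minimal sketch of what such a `reduce` method might look like, accepting either a callable or the name of a known reduction. Everything here is an assumption for illustration: the toy `Frame` class, the `KNOWN_REDUCTIONS` registry, and the choice to return a plain dict are not part of any existing library or of the proposed standard.

```python
# Hypothetical registry mapping names to known reductions.
KNOWN_REDUCTIONS = {"sum": sum, "min": min, "max": max}


class Frame:
    """Toy stand-in for a dataframe: a dict of column name -> list of values."""

    def __init__(self, columns):
        self.columns = columns

    def reduce(self, func):
        # Accept a string shorthand for known reductions, else any callable.
        if isinstance(func, str):
            try:
                func = KNOWN_REDUCTIONS[func]
            except KeyError:
                raise ValueError(f"unknown reduction: {func!r}") from None
        return {name: func(values) for name, values in self.columns.items()}


df = Frame({"population": [10, 30], "area_km2": [2, 6]})

by_function = df.reduce(sum)    # explicit function
by_name = df.reduce("sum")      # string shorthand, same result
custom = df.reduce(lambda values: max(values) - min(values))  # user extension
```

In this sketch, extending the API means passing a new function (or adding an entry to the registry), while the dataframe class itself keeps a single `reduce` method.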

Thoughts? Other ideas?
