Description
Is your feature request related to a problem?
The natural sort order is a common use case when working with real-world data. For example, consider the following DataFrame of clinical data where the body temperature of patients was measured:
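import pandas as pd

data = {'Patient_ID': {0: 'ID-1', 1: 'ID-11', 2: 'ID-2'},
        'temperature': {0: 37.2, 1: 37.5, 2: 37.2}}
# The default sort compares strings character by character, so 'ID-11' lands before 'ID-2'.
df = pd.DataFrame(data).sort_values(by=['Patient_ID'])
df.head(5)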
will yield:
| | Patient_ID | temperature |
|---|---|---|
| 0 | ID-1 | 37.2 |
| 1 | ID-11 | 37.5 |
| 2 | ID-2 | 37.2 |
whereas we would want
| | Patient_ID | temperature |
|---|---|---|
| 0 | ID-1 | 37.2 |
| 2 | ID-2 | 37.2 |
| 1 | ID-11 | 37.5 |
Describe the solution you'd like
- sort_values could get a new parameter, sort_order, that defaults to alphabetical and can be switched to natural.
- The implementation could be similar to the natsort package, without any of the extra options natsort brings: transform all values into sortable keys, pass those to np.argsort(), and then use the resulting order to arrange the original values.
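A minimal sketch of that idea, assuming a key that splits each string into text and integer chunks. The helpers natural_key and natural_argsort are hypothetical names used only for illustration; they are not part of pandas or natsort, and the proposed sort_order parameter does not exist.

```python
import re

import numpy as np
import pandas as pd


def natural_key(value):
    """Hypothetical helper: split a value into tagged text and integer chunks.

    'ID-11' becomes ((1, 'ID-'), (0, 11)). The leading tag keeps integers and
    strings from ever being compared with each other directly.
    """
    parts = re.split(r"(\d+)", str(value))
    return tuple((0, int(p)) if p.isdecimal() else (1, p) for p in parts if p)


def natural_argsort(values):
    """Hypothetical helper: return the positions that sort `values` naturally."""
    keys = [natural_key(v) for v in values]
    # Replace each key by its rank so that np.argsort only sees plain integers.
    rank_of = {key: rank for rank, key in enumerate(sorted(set(keys)))}
    ranks = np.array([rank_of[key] for key in keys])
    # A stable sort keeps the original order of rows with equal keys.
    return np.argsort(ranks, kind="stable")


df = pd.DataFrame({"Patient_ID": ["ID-1", "ID-11", "ID-2"],
                   "temperature": [37.2, 37.5, 37.2]})
order = natural_argsort(df["Patient_ID"])
result = df.iloc[order]  # rows in natural order, original index labels kept
```

Ranking the keys first is one possible design choice: it avoids handing np.argsort an object array of variable-length tuples, at the cost of one extra Python-level sort over the distinct keys.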
API breaking implications
Since we are only adding a parameter whose default keeps the current alphabetical behavior, this would not break any existing API.
Describe alternatives you've considered
Currently, one could use the natsort package. However, this seems cumbersome for such a common operation and makes it necessary to reindex the DataFrame. Stack Overflow example.