nth() mixes column order #20760

Closed
@sursu

Consider the following dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[179293473, '2016-06-01 00:00:03.549745', 'http://www.dr.dk/nyheder/', 39169523],
     [179293473, '2016-06-01 00:04:22.346018', 'https://www.information.dk/indland/2016/05/hvert-tredje-offer-naar-anmelde-voldtaegt-tide', 39125224],
     [179773461, '2016-06-01 22:13:16.588146', 'https://www.google.dk', 31658124],
     [179773461, '2016-06-01 22:14:04.059781', 'https://www.google.dk', 31658124],
     [179773461, '2016-06-01 22:16:37.230587', np.nan, 31658124],
     [179773461, '2016-06-01 22:23:09.847149', 'https://www.google.dk', 32718401],
     [179773461, '2016-06-01 22:23:55.158929', np.nan, 32718401],
     [179773461, '2016-06-01 22:27:00.857224', np.nan, 32718401]],
    columns=['SessionID', 'PageTime', 'ReferrerURL', 'PageID'])

which looks like this:

SessionID  PageTime                    ReferrerURL                  PageID
179293473  2016-06-01 00:00:03.549745  http://www.dr.dk/nyheder/    39169523
179293473  2016-06-01 00:04:22.346018  https://www.information.dk/  39125224
179773461  2016-06-01 22:13:16.588146  https://www.google.dk        31658124
179773461  2016-06-01 22:14:04.059781  https://www.google.dk        31658124
179773461  2016-06-01 22:16:37.230587  NaN                          31658124
179773461  2016-06-01 22:23:09.847149  https://www.google.dk        32718401
179773461  2016-06-01 22:23:55.158929  NaN                          32718401
179773461  2016-06-01 22:27:00.857224  NaN                          32718401

Run:
df.groupby('SessionID').nth(-1)

Out:

SessionID  PageID    PageTime                    ReferrerURL
179293473  39125224  2016-06-01 00:04:22.346018  https://www.information.dk/
179773461  32718401  2016-06-01 22:27:00.857224  NaN

Question: Why has nth() mixed the order of my columns?
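
Whatever the cause turns out to be, one possible workaround is to reselect the result's columns in the order they have in the original frame. The snippet below is only a sketch under a few assumptions: it uses a trimmed four-row version of the data above (with the long URL shortened as in the printed table), the names `last`, `ordered` and `restored` are made up for illustration, and the exact shape of what `nth()` returns (grouping key in the index or kept as a column) may differ between pandas versions, which is why the grouping key is only kept if it is still a column.

```python
import numpy as np
import pandas as pd

# Trimmed sample of the frame from the report (assumption: 4 rows are enough
# to illustrate the reordering workaround).
df = pd.DataFrame(
    [
        [179293473, "2016-06-01 00:00:03.549745", "http://www.dr.dk/nyheder/", 39169523],
        [179293473, "2016-06-01 00:04:22.346018", "https://www.information.dk/", 39125224],
        [179773461, "2016-06-01 22:23:55.158929", np.nan, 32718401],
        [179773461, "2016-06-01 22:27:00.857224", np.nan, 32718401],
    ],
    columns=["SessionID", "PageTime", "ReferrerURL", "PageID"],
)

# Last row of each session, as in the report.
last = df.groupby("SessionID").nth(-1)

# Rebuild the column list in the original order, keeping only the names that
# are still columns in the result (the grouping key may have become the index).
ordered = [c for c in df.columns if c in last.columns]
restored = last[ordered]

print(restored)
```

After this, `PageTime`, `ReferrerURL` and `PageID` appear in the same relative order as in `df`, regardless of how `nth()` arranged them.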


Labels: Bug, Groupby, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
