Skip to content

BUG: DataFrame.shift with axis=1 shifts object columns to the next column with object dtype #26929

Closed
@nchrisr

Description

@nchrisr

Code Sample

import pandas as pd
import StringIO

# Store the csv string in a variable 
gps_string = """
"2010-01-12 18:00:00","$GPGGA","180439","7249.2150","N","11754.4238","W","2","10","0.9","-8.1","M","-12.4","M","","*57","","",""
"2010-01-12 17:30:00","$GPGGA","173439","7249.2160","N","11754.4233","W","2","11","0.8","-4.5","M","-12.4","M","","*5B","","",""
"2010-01-12 17:00:00","$GPGGA","170439","7249.2152","N","11754.4235","W","2","11","0.8","-3.1","M","-12.4","M","","*5C","","",""
"2010-01-12 16:30:00","N","11754.4210","W","2","09","1.1","-13.1","M","-12.4","M","","*6C","","","","","",""
"2010-01-12 16:00:00","N","11754.4229","W","2","10","0.9","-2.9","M","-12.4","M","","*53","","","","","",""
"2010-01-12 15:30:00","N","11754.4269","W","2","09","0.8","-4.3","M","-12.4","M","","*54","","","","","",""
"2010-01-12 15:00:00","N","11754.4267","W","2","10","0.8","-1.6","M","-12.4","M","","*56","","","","","",""
"2010-01-12 14:30:00","$GPGGA","143439","7249.2152","N","11754.4253","W","2","11","0.7","-4.3","M","-12.4","M","","*56","","",""
"2010-01-12 14:00:00","N","11754.4245","W","2","10","0.9","-7.0","M","-12.4","M","","*50","","","","","",""
"2010-01-12 13:30:00","$GPGGA","133439","7249.2134","N","11754.4243","W","2","11","0.7","-10.7","M","-12.4","M","","*61","","",""
"2010-01-12 13:00:00","N","11754.4245","W","2","10","0.8","-5.5","M","-12.4","M","","*56","","","","","",""
"2010-01-12 12:30:00","N","11754.4226","W","2","10","0.9","-7.1","M","-12.4","M","","*59","","","","","",""
"2010-01-12 12:00:00","N","11754.4238","W","2","10","0.8","-6.5","M","-12.4","M","","*51","","","","","",""
"2010-01-12 11:30:00","N","11754.4227","W","2","10","0.8","0.1","M","-12.4","M","","*73","","","","","",""
"2010-01-12 11:00:00","-7.4","M","-12.4","M","","*5F","","","","","","","","","","","",""
"2010-01-12 10:30:00","N","11754.4271","W","2","08","1.1","-8.4","M","-12.4","M","","*5A","","","","","","" """

# Read the csv string into a dataframe.
gps_df = pd.read_csv(StringIO.StringIO(gps_string), header=None, index_col=0)
rows_to_shift = gps_df[gps_df[15].isnull()].index
gps_df.loc[rows_to_shift] = gps_df.loc[rows_to_shift].shift(periods=1, axis=1)
# create this file to see the outcome
gps_df.to_csv("f.csv") 

Problem description

I am trying to use the Dataframe. shift() function to move certain rows of data into their correct columns, and the Dataframe.shift() function is doing some weird things, it is creating an empty column of null(s) and moves one of the columns to the end of the dataframe.

Screenshots

Original data:

Screen Shot 2019-06-18 at 4 09 22 PM

Output after execution of the code

Screen Shot 2019-06-18 at 4 31 15 PM

As seen above, the data that was originally in column 10 has been move to column 15 for some reason. Also a column with null values has been created in column 6.

I expect that the data should be moved to the right by one step, that is the data in each column should move to the column to the left of it, and the current behaviour is confusing, based on the documentation of what this function should do.

My expected output:

Screen Shot 2019-06-18 at 4 16 40 PM

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.14.final.0
python-bits: 32
OS: Windows

Metadata

Metadata

Assignees

No one assigned

    Labels

    AlgosNon-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diffBugMulti-BlockIssues caused by the presence of multiple Blocks

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions