!head 001EC00CC49D_processed
timestamp,frequency,voltage,active_power,energy,cost,current,reactive_power,apparent_power,power_factor,phase_angle
1369210417,49.99,234.187,1.138,0.131,0.000,0.014,3.036,3.242,0.351,69.458
1369210418,49.98,234.276,1.043,0.131,0.000,0.014,3.183,3.350,0.311,71.855
1369210419,49.97,234.306,1.043,0.132,0.000,0.014,3.100,3.271,0.319,71.411
1369210420,49.97,234.288,1.045,0.132,0.000,0.014,3.155,3.324,0.315,71.668
1369210421,49.98,234.330,1.047,0.133,0.000,0.014,3.099,3.271,0.320,71.332
1369210422,49.97,234.228,1.004,0.134,0.000,0.014,3.140,3.296,0.304,72.275
1369210423,49.99,234.260,1.036,0.134,0.000,0.013,2.986,3.161,0.328,70.861
1369210424,49.97,234.292,1.073,0.135,0.000,0.014,3.089,3.270,0.328,70.842
1369210425,49.98,234.259,1.073,0.135,0.000,0.014,3.089,3.270,0.328,70.839
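import datetime
import pandas as pd

def epoch_to_date(timestamp):
    # Parse one epoch-seconds string into a local-time datetime.
    return datetime.datetime.fromtimestamp(int(timestamp))

def epoch_to_date_2(timestamp):
    # Scale epoch seconds to nanoseconds and hand the number to pandas.
    return pd.to_datetime(float(timestamp)*int(1e9))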
df=pd.read_csv('001EC00CC49D_processed',parse_dates=[0],date_parser=epoch_to_date,index_col=0,error_bad_lines=False);
df2=pd.read_csv('001EC00CC49D_processed',parse_dates=[0],date_parser=epoch_to_date_2,index_col=0,error_bad_lines=False);
df
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 327818 entries, 2013-05-22 13:43:37 to 2013-06-05 12:22:16
Data columns (total 10 columns):
frequency 327810 non-null values
...
dtypes: float64(9), object(1)
df2
<class 'pandas.core.frame.DataFrame'>
Index: 327818 entries, 1.369210417e+18 to 1.370415136e+18
Data columns (total 10 columns):
frequency 327810 non-null values
....
dtypes: float64(9), object(1)
Using pd.to_datetime() as the date parser took more time, and the resulting index is a plain Index of floats rather than a DatetimeIndex.
Moreover, since many people record Unix timestamps to avoid dealing with different time formats, it would be handy to have this conversion built in.
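For illustration, here is a minimal sketch of the kind of conversion being asked for, assuming a pandas version in which `pd.to_datetime` accepts a `unit` argument. The helper name `read_epoch_csv` and the trimmed three-column sample are made up for this example; only the two data rows come from the file above. It reads the timestamp column as plain integers and converts the whole index in one vectorised call rather than row by row through `date_parser`.

```python
import io

import pandas as pd

# Two rows copied from the sample above, trimmed to three columns for brevity.
csv_text = """timestamp,frequency,voltage
1369210417,49.99,234.187
1369210418,49.98,234.276
"""


def read_epoch_csv(source):
    """Hypothetical helper: read a CSV whose first column is epoch seconds."""
    # Read the timestamps as ordinary integers first (no date parsing here).
    df = pd.read_csv(source, index_col=0)
    # Convert the whole index at once; unit='s' marks the values as
    # seconds since the Unix epoch.
    df.index = pd.to_datetime(df.index, unit='s')
    return df


df = read_epoch_csv(io.StringIO(csv_text))
print(type(df.index).__name__)
print(df.index[0])
```

Note that this interprets the values as UTC, whereas `datetime.datetime.fromtimestamp` in `epoch_to_date` converts to the machine's local time, so the two indexes can differ by the local UTC offset.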