Open
Description
Research
-
I have searched the [pandas] tag on StackOverflow for similar questions.
-
I have asked my usage related question on StackOverflow.
Link to question on StackOverflow
Question about pandas
I have this code intended to count occurrences in 30-minute intervals; the requirement is to have these intervals at fixed starting points, minute 00 and minute 30 of each hour. Regretfully, despite every attempt of mine, the second group is aligned to minute 03 and minute 33.
I suspect that both the groups are aligned with the first time row and that the first one is correct just by chance. How can I tell the grouper to force the alignment to minutes 00 and 30?
# load and prepare data
df_long_forecast = pd.read_csv('df_long_forecast - reduced.csv')
df_long_forecast['after_12_max_datetime'] = pd.to_datetime(df_long_forecast['after_12_max_datetime'])
df_long_forecast['after_12_max_time'] = (df_long_forecast['after_12_max_datetime'] - df_long_forecast['after_12_max_datetime'].dt.normalize()) # timedelta64[ns]
# count the number of maxes happening after 12am (absolute and percentage)
hist_max = df_long_forecast.groupby(pd.Grouper(key='after_12_max_time', freq='30T', offset='0T', origin=origin))['Date'].count()
percent_max = round(hist_max / hist_max.sum() * 100, 2)
display(hist_max)
# count the number of maxes after 12 that result in a profit trade (cannot be MORE than the previous ones)
df_long_forecast_profit = df_long_forecast[ df_long_forecast['after_12_max_>_9_to_12_high'] > 0 ]
profit_long = df_long_forecast_profit.groupby(pd.Grouper(key='after_12_max_time', freq='30T', offset='0T', origin=origin))['Date'].count()
percent_profit_long = round(profit_long / hist_max.sum() * 100, 2)
display(profit_long)
Here is the print of the hist_max df
after_12_max_time
0 days 12:00:00 24
0 days 12:30:00 5
0 days 13:00:00 7
0 days 13:30:00 5
0 days 14:00:00 5
0 days 14:30:00 4
0 days 15:00:00 4
0 days 15:30:00 1
0 days 16:00:00 5
0 days 16:30:00 7
0 days 17:00:00 1
0 days 17:30:00 6
0 days 18:00:00 1
0 days 18:30:00 1
0 days 19:00:00 1
0 days 19:30:00 6
0 days 20:00:00 3
0 days 20:30:00 0
0 days 21:00:00 6
0 days 21:30:00 19
0 days 22:00:00 8
Freq: 30T, Name: Date, dtype: int64
and this is profit_long
after_12_max_time
0 days 12:03:00 8
0 days 12:33:00 4
0 days 13:03:00 5
0 days 13:33:00 4
0 days 14:03:00 5
0 days 14:33:00 4
0 days 15:03:00 3
0 days 15:33:00 2
0 days 16:03:00 5
0 days 16:33:00 6
0 days 17:03:00 2
0 days 17:33:00 5
0 days 18:03:00 1
0 days 18:33:00 2
0 days 19:03:00 0
0 days 19:33:00 5
0 days 20:03:00 3
0 days 20:33:00 0
0 days 21:03:00 6
0 days 21:33:00 21
0 days 22:03:00 3
Freq: 30T, Name: Date, dtype: int64
The CSV file with just the pertinent columns can be downloaded from this link.