Closed
Description
The following code is very slow:
import pandas as pd
import numpy as np
dt = 4.8000000418824129e-08
data = np.random.rand(1000000, 4)
df = pd.DataFrame(data, columns=list("ABCD"))
df.index *= dt
print(df.loc[0:0.001].shape)
After debugging it, I found that Float64Engine.get_loc()
is slow. Here is a demo:
import pandas as pd
import numpy as np
a = np.arange(1000000)
ind1 = pd.Float64Index(a * 4.8e-08)
ind2 = pd.Float64Index(a * 4.8000000418824129e-08)
%time ind1._engine.get_loc(0)
%time ind2._engine.get_loc(0)
Output (ind1 first, then ind2):
Wall time: 295 ms
Wall time: 9.9 s
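
For anyone trying to reproduce this outside IPython, here is a rough sketch of the same comparison using the public `Index.get_loc` and `time.perf_counter` instead of `%time` and the private `_engine`. The helper name `time_get_loc` is made up for this sketch, and building the index with `pd.Index(..., dtype="float64")` rather than `pd.Float64Index` is an assumption made so that it runs on more pandas versions. Going through the public method may add a little overhead compared to calling the engine directly.

```python
import time

import numpy as np
import pandas as pd


def time_get_loc(step, n=1_000_000):
    """Build a float64 index of n multiples of `step` and time one get_loc(0.0).

    Hypothetical helper for this sketch; not part of pandas.
    """
    # pd.Index with dtype="float64" stands in for pd.Float64Index here
    # (assumption: this keeps the sketch usable across pandas versions).
    index = pd.Index(np.arange(n) * step, dtype="float64")
    start = time.perf_counter()
    # The first lookup on a fresh index also pays for any lazy one-time
    # setup the index performs, which is what the demo above measures too.
    loc = index.get_loc(0.0)
    elapsed = time.perf_counter() - start
    return loc, elapsed


# Same two step sizes as in the demo above.
for step in (4.8e-08, 4.8000000418824129e-08):
    loc, elapsed = time_get_loc(step)
    print(f"step={step!r}: get_loc(0.0) -> {loc} in {elapsed:.3f} s")
```

The absolute timings will depend on the machine and the pandas version, so the numbers above should not be expected to match exactly; what matters is whether the second step size is still much slower than the first.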