Here are all the vbenches that differ by more than 15 percent,
best of 5.
I hope to have this automated and running in real time by the time the
next release comes around, along with bisection.
1st run
λ cat r1/report.txt
Worse
getattr_dataframe_index 2.000000
frame_multi_and_no_ne 1.319728
series_constructor_ndarray 1.329545
ctor_index_array_string 1.651376
frame_wide_repr 48.154345
groupby_sum_booleans 1.152285
indexing_dataframe_boolean_rows 1.168337
series_getitem_scalar 1.862069
dataframe_getitem_scalar 1.214286
datamatrix_getitem_scalar 1.190476
concat_small_frames 1.159975
series_align_left_monotonic 1.289482
reindex_daterange_backfill 1.190402
reindex_daterange_pad 1.185000
timeseries_large_lookup_value 235.037234
dtype: float64
Better
frame_multi_and_st 0.580615
frame_multi_and 0.597748
frame_fancy_lookup 0.792336
frame_get_dtype_counts 0.000404
frame_fancy_lookup_all 0.797565
series_string_vector_slice 0.801726
frame_reindex_upcast 0.523595
frame_reindex_axis0 0.509034
groupby_first_float32 0.043276
groupby_last_float32 0.044075
groupby_transform 0.413400
indexing_dataframe_boolean_st 0.094886
indexing_dataframe_boolean 0.095269
frame_to_csv 0.737868
frame_to_csv2 0.121081
frame_to_csv_mixed 0.381196
write_csv_standard 0.198206
append_frame_single_mixed 0.805745
reindex_frame_level_align 0.790248
reindex_frame_level_reindex 0.787495
2nd run
Worse
frame_multi_and_no_ne 1.322615
series_constructor_ndarray 1.284091
ctor_index_array_string 1.557522
frame_wide_repr 47.085119
indexing_dataframe_boolean_rows 1.158785
series_getitem_scalar 1.896552
dataframe_getitem_scalar 1.214286
datamatrix_getitem_scalar 1.190476
series_align_left_monotonic 1.293073
reindex_daterange_backfill 1.192585
reindex_daterange_pad 1.176850
frame_reindex_columns 1.231402
timeseries_large_lookup_value 370.207254
dtype: float64
Better
frame_multi_and_st 0.584512
frame_multi_and 0.592406
frame_fancy_lookup 0.794964
frame_get_dtype_counts 0.000393
series_string_vector_slice 0.793987
frame_reindex_upcast 0.562601
frame_reindex_axis0 0.512411
groupby_first_float32 0.042893
groupby_last_float32 0.043322
groupby_transform 0.412138
indexing_dataframe_boolean_st 0.093816
indexing_dataframe_boolean 0.094429
frame_to_csv 0.721259
frame_to_csv2 0.120910
frame_to_csv_mixed 0.399276
write_csv_standard 0.197096
append_frame_single_mixed 0.795502
reindex_frame_level_align 0.787276
reindex_frame_level_reindex 0.786307
dtype: float64
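Since the point of two runs is to judge stability, here is a minimal sketch that checks which "Worse" entries show up in both reports. The names are copied from the two reports above; the variable names (`run1_worse`, `run2_worse`, `stable`, `unstable`) are made up for illustration and are not part of test_perf.

```python
# Benchmarks flagged "Worse" in each run, copied from the reports above.
run1_worse = {
    "getattr_dataframe_index", "frame_multi_and_no_ne",
    "series_constructor_ndarray", "ctor_index_array_string",
    "frame_wide_repr", "groupby_sum_booleans",
    "indexing_dataframe_boolean_rows", "series_getitem_scalar",
    "dataframe_getitem_scalar", "datamatrix_getitem_scalar",
    "concat_small_frames", "series_align_left_monotonic",
    "reindex_daterange_backfill", "reindex_daterange_pad",
    "timeseries_large_lookup_value",
}
run2_worse = {
    "frame_multi_and_no_ne", "series_constructor_ndarray",
    "ctor_index_array_string", "frame_wide_repr",
    "indexing_dataframe_boolean_rows", "series_getitem_scalar",
    "dataframe_getitem_scalar", "datamatrix_getitem_scalar",
    "series_align_left_monotonic", "reindex_daterange_backfill",
    "reindex_daterange_pad", "frame_reindex_columns",
    "timeseries_large_lookup_value",
}

# Flagged in both runs: more likely to be real regressions.
stable = run1_worse & run2_worse
# Flagged in only one run: close to the threshold or noisy.
unstable = run1_worse ^ run2_worse

print("flagged in both runs:", len(stable))
print("flagged in one run only:", sorted(unstable))
```

The entries that appear in only one run all sit close to the 15 percent threshold in the run that flagged them, so they are the ones to treat with the most suspicion.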
Until test_perf is validated in its compare mode (re: instability), this is the script I used:
#!/bin/bash
# profile current HEAD, against the commit
# specified on the command line
# assume you're running in a venv, and
# that upstream pandas is a git remote named
# "upstream"
PREV_VER=$1
THRESH=0.15
NITER=5
UPSTREAM=upstream/master
git reset --hard $UPSTREAM
H1=$(git log --format="%h" -1)
python setup.py develop
./test_perf.sh -H -N $NITER -c 1 -d $PWD/HEAD-$H1.pickle "$2"
git reset --hard $PREV_VER
H2=$(git log --format="%h" -1)
git checkout upstream/master vb_suite setup.py # bring back the updated suite and test_perf, and build cache
python setup.py develop
./test_perf.sh -H -N $NITER -c 1 -d $PWD/PREV-$H2.pickle "$2"
# back to master
git reset --hard $UPSTREAM
python setup.py develop
SCR=$(tee <<EOF
import pandas as pd
H=(pd.load("HEAD-$H1.pickle").min(1)/pd.load("PREV-$H2.pickle").min(1))
print "Worse"
print H[(H-1)>$THRESH]
print "\nBetter"
print H[(1-H)>$THRESH]
EOF
)
python -c "$SCR"
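For anyone on a current pandas, the comparison step at the end of the script could be sketched roughly as below. This assumes each pickle holds a DataFrame with one row per benchmark and one column per iteration, which is what `.min(1)` in the script implies. `compare_timings` is a hypothetical helper name and the timings are invented purely for illustration; with real data you would load the two frames with `pd.read_pickle` instead of the older `pd.load`.

```python
import pandas as pd


def compare_timings(head, prev, thresh=0.15):
    """Return (worse, better) ratios of best-of-N HEAD time to PREV time.

    A ratio above 1 means HEAD is slower than PREV for that benchmark.
    """
    # Best (minimum) timing per benchmark across all iterations.
    ratio = head.min(axis=1) / prev.min(axis=1)
    worse = ratio[(ratio - 1) > thresh]
    better = ratio[(1 - ratio) > thresh]
    return worse, better


# Invented timings: rows are benchmarks, columns are iterations.
names = ["bench_a", "bench_b", "bench_c"]
head = pd.DataFrame({"iter1": [2.0, 1.0, 0.5], "iter2": [2.2, 1.1, 0.6]}, index=names)
prev = pd.DataFrame({"iter1": [1.0, 1.0, 1.0], "iter2": [1.3, 0.95, 1.2]}, index=names)

worse, better = compare_timings(head, prev)
print("Worse")
print(worse)
print("\nBetter")
print(better)
```

Taking the minimum over iterations is the "best of 5" from the description: it discards slow outliers caused by system noise before the ratio is computed.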