Description
Something like this is a pretty good example. I know something about animals, cats, dogs, and hair, so I can sort of keep the structure of the data in my head and follow along with the transformations without having to scroll up and down to check against the original dataframe.
But a lot of examples just use meaningless column names like A, B, C, D...
or foo, bar, baz...
, which makes it a lot harder to gain an intuition about what's going on. For example, if you don't know what groupby
does, this example:
In [13]: df2 = pd.DataFrame({'X' : ['B', 'B', 'A', 'A'], 'Y' : [1, 2, 3, 4]})
In [14]: df2.groupby(['X']).sum()
Out[14]:
Y
X
A 7
B 3
might be less useful than this version:
In [13]: pets = pd.DataFrame({'animal' : ['dog', 'dog', 'cat', 'cat'], 'weight' : [10, 20, 8, 9]})
In [14]: pets.groupby(['weight']).mean()
Out[14]:
weight
animal
dog 15
cat 8.5
I realize re-doing all the examples like this would be a significant amount of work, but if there's agreement that this is a desirable thing, I'd be happy to kick things off with a small P.R.
Also, I think it would be good to add this as a guideline to the documentation section of the contributing doc. (Again, if people agree this is worthwhile and not misguided.)