Description
Add the following function as a method on pd.DataFrame and pd.Series:
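import pandas as pd

def ends(df, x=5):
    """Returns both head and tail of the DataFrame or Series.
    Args:
        x (int): Optional number of rows to return for each of head and tail.
    """
    if df.ndim == 2:
        print('{} rows x {} columns'.format(*df.shape))
    else:  # a Series has no column dimension
        print('{} rows'.format(len(df)))
    # pd.concat handles both types (DataFrame.append was removed in pandas 2.0)
    return pd.concat([df.head(x), df.tail(x)])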
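To try the method form (`df.ends()`) before anything changes in pandas, the function can be attached to the classes at runtime. This is a minimal sketch for local experimentation only: assigning attributes to `pd.DataFrame` and `pd.Series` is an assumption about how one might prototype the idea, not how pandas itself would add the method, and the parameter name `obj` is illustrative.

```python
import numpy as np
import pandas as pd


def ends(obj, x=5):
    """Return the first and last x rows of a DataFrame or Series."""
    # Print the shape, e.g. "1500 rows x 6 columns" or "1500 rows".
    if obj.ndim == 2:
        print('{} rows x {} columns'.format(*obj.shape))
    else:
        print('{} rows'.format(len(obj)))
    return pd.concat([obj.head(x), obj.tail(x)])


# Assumption: patching the classes is only a local prototype of the
# proposed method; it is not part of the pandas API.
pd.DataFrame.ends = ends
pd.Series.ends = ends

df = pd.DataFrame(np.random.rand(1500, 6))
result_df = df.ends(2)     # rows 0, 1, 1498, 1499
result_s = df[0].ends(3)   # first and last three values of column 0
```

Because the object is passed as the first argument, the same function serves both classes without any changes.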
Problem description
Often both the beginning and the end of a df are of interest, for example in a time series.
This leads to calling df.head() and df.tail() in two separate notebook cells, which is tedious and clutters the notebook. A function that returns both, plus a printout of the number of rows and columns, avoids this and allows a quick check that the index matches the number of rows.
Example
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(1500,6))
print(ends(df,2))
1500 rows x 6 columns
| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | 0.949695 | 0.160928 | 0.434134 | 0.943103 | 0.477830 | 0.903479 |
| 1 | 0.736711 | 0.103746 | 0.028694 | 0.205910 | 0.226061 | 0.458452 |
| 1498 | 0.362950 | 0.586887 | 0.399681 | 0.115366 | 0.239049 | 0.386281 |
| 1499 | 0.018102 | 0.852198 | 0.880993 | 0.671604 | 0.705586 | 0.802237 |
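One edge case worth noting: when the object has fewer than 2*x rows, head(x) and tail(x) overlap, so concatenating them repeats rows. The sketch below shows one possible way to handle that by selecting rows by position. The name `ends_unique` is hypothetical and not part of the proposal.

```python
import pandas as pd


def ends_unique(obj, x=5):
    """Return the first and last x rows without repeating any row.

    Hypothetical variant of the proposed function, selecting by position.
    """
    n = len(obj)
    if n <= 2 * x:
        # Head and tail would overlap or touch, so return every row once.
        return obj.copy()
    return pd.concat([obj.iloc[:x], obj.iloc[n - x:]])


short = pd.Series([10, 20, 30])
print(ends_unique(short, 2).tolist())   # [10, 20, 30]

long_s = pd.Series(range(100))
print(ends_unique(long_s, 2).tolist())  # [0, 1, 98, 99]
```

Selecting by position with `iloc` keeps the behaviour independent of the index labels, which matters when the index contains duplicates.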
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 8.1
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.21.0
pytest: 3.3.1
pip: 9.0.1
setuptools: 38.2.4
Cython: 0.26.1
numpy: 1.12.1
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 0.9.8
lxml: 3.8.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None