ENH: Display DataFrame column dtypes (data types) for each variable name when printing #47604

Open
@coatless

Description

Is your feature request related to a problem?

I would like to include each column's data type in the output when a pandas DataFrame is printed or rendered, with a preference for .to_html(). The feature could be exposed as a new parameter:

  • show_dtypes bool, default False
    • Display DataFrame column dtypes (data types) for each variable name.
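To make the intended behavior concrete, here is a minimal sketch of a wrapper that mimics the proposed flag. The function name to_html_with_dtypes is hypothetical and is not part of pandas; only the show_dtypes name and its default come from the proposal above. It assumes the HTML is returned as a string (no buf argument) and that the output contains a <thead> element.

```python
import pandas as pd


def to_html_with_dtypes(df: pd.DataFrame, show_dtypes: bool = False, **kwargs) -> str:
    """Hypothetical helper (not a pandas API) that mimics the proposed flag."""
    # Default to hiding the index, as in the example later in this issue.
    kwargs.setdefault("index", False)
    html = df.to_html(**kwargs)
    if not show_dtypes:
        return html

    # One header cell per column, holding that column's dtype.
    cells = "".join(f"<th>({dtype})</th>" for dtype in df.dtypes)
    if kwargs["index"]:
        # Pad with empty cells so the dtypes line up under the column names.
        cells = "<th></th>" * df.index.nlevels + cells

    # Insert the new row just before the header section closes.
    return html.replace("</thead>", f"<tr>{cells}</tr></thead>", 1)


df = pd.DataFrame({"a": [1, 2], "b": [1.5, 2.5]})
html_with_dtypes = to_html_with_dtypes(df, show_dtypes=True)
```

With show_dtypes left at its default, the helper returns exactly what df.to_html(index=False) returns, which is the backward-compatible behavior the proposal asks for.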

Describe the solution you'd like

The current way of displaying Pandas data with HTML gives:

[Image: original pandas to_html output]

The goal here would be to add a subheader row that incorporates the data type of each column variable from df.dtypes output, e.g.

[Image: additional subheader containing variable dtype information]

API breaking implications

If the feature were turned on by default, any unit tests that assert on the exact rendered output would break, because the output would contain an additional header row.

Describe alternatives you've considered

After saving the HTML output as a string, I used an ad hoc method to add a subheader row with Beautiful Soup. This is cumbersome compared to making a small modification to pandas/io/formats/html.py.
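Another workaround that needs no post-processing is to put the dtypes into a second level of the column index, so that to_html() emits them as a second header row. This is only a sketch: it assumes a two-level header is acceptable, it does not give the grey, smaller styling from the screenshot, and the small DataFrame below is made up for illustration.

```python
import pandas as pd

# Illustrative data only; the column names match the example in this issue.
df = pd.DataFrame({
    "Subject": [1, 2],
    "Type": ["good", "bad"],
    "Volume (oz)": [19.8, 20.4],
})

# Work on a copy so the original column labels stay usable.
labeled = df.copy()

# Two-level header: column name on top, dtype string underneath.
labeled.columns = pd.MultiIndex.from_arrays(
    [df.columns, [f"({dtype})" for dtype in df.dtypes]]
)

html = labeled.to_html(index=False)
```

The drawback is that the copy has a MultiIndex for its columns, so it is only suitable for display and not for further analysis.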

Additional context

To re-create the screenshot above, please run:

import random
import pandas as pd
import numpy as np
from IPython.display import HTML
from bs4 import BeautifulSoup

# Simulate data
mu_o = random.choice([19.7,19.8,19.9,20,20.1,20.2])
std_gen = random.choice([1.3, 1.4, 1.5])
sample_size = random.randint(8, 11)
sample_data =  np.round(np.random.normal(mu_o, std_gen, sample_size), 1)

# Convert data set to pandas
df = pd.DataFrame(data = {
    "id": list(range(1, sample_size + 1)),
    "type": random.choices(["good","okay","bad", "horrible"], k = sample_size),
    "vol": sample_data})

# Name columns
df.columns = ['Subject', 'Type', 'Volume (oz)']

# Ad hoc way of creating the desired subheader cells, one <th> per column dtype
html_subheader_dtype = []
for value in df.dtypes:
    html_subheader_dtype.append(f'<th> <span style="color:grey;text-align:left;font-size:.75em"> ({value}) </span> </th>')

# Convert to HTML
html_df = df.to_html(index = False, justify = "left")

# Use BeautifulSoup 4 to navigate dom
soup = BeautifulSoup(html_df, "html.parser")

# Create new tag for a table row
subheader_dtype_row = soup.new_tag("tr")

# Format with beautiful soup
row_dtype_data = BeautifulSoup("".join(str(x) for x in html_subheader_dtype), "html.parser")

# Add the parsed <th> cells to the new row (Tag.append keeps the tree links consistent)
subheader_dtype_row.append(row_dtype_data)

# Adding the subheader into the table object
soup.table.thead.append(subheader_dtype_row)

# Convert from bs4.BeautifulSoup to str
modified_df_html = str(soup)

# Display Table
HTML(modified_df_html)
