ENH: pd.DataFrame.from_dict() should support loading columns of varying lengths #61282

Open
@nikhilweee

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Creating a dataframe from a dictionary with columns of varying lengths is not supported.

As of pandas 2.2.3, the following snippet raises ValueError: All arrays must be of the same length.

df = pd.DataFrame.from_dict({"col1": [1, 2, 3], "col2": [4, 5]})
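A minimal reproduction of the failure, assuming a recent pandas 2.x install. The variable names `data` and `error_message` are illustrative only, and the exact message text may differ between pandas versions.

```python
import pandas as pd

# Two "columns" of different lengths, as in the report.
data = {"col1": [1, 2, 3], "col2": [4, 5]}

error_message = None
try:
    # The default orient is "columns", so each key is treated as a column.
    pd.DataFrame.from_dict(data)
except ValueError as exc:
    # Capture the message instead of letting the script crash.
    error_message = str(exc)

print(error_message)
```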

Feature Description

Pandas should automatically pad shorter columns with missing values so that all columns end up the same length, especially since it already pads rows of varying lengths when the orient argument is set to index. The following works without error.

df = pd.DataFrame.from_dict({"col1": [1, 2, 3], "col2": [4, 5]}, orient="index")
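To make the contrast concrete, here is a small sketch of what the orient="index" call produces, assuming a recent pandas 2.x install. The name `by_row` is illustrative. Each key becomes a row label, and the shorter row is padded with a missing value.

```python
import pandas as pd

data = {"col1": [1, 2, 3], "col2": [4, 5]}

# With orient="index", each dictionary key becomes a row label.
by_row = pd.DataFrame.from_dict(data, orient="index")

# Two rows (one per key) and three positional columns.
print(by_row.shape)

# Exactly one cell is missing: the padded end of the shorter row.
print(int(by_row.isna().sum().sum()))
```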

Alternative Solutions

Pandas already supports rows of varying lengths when the orient argument is set to index. To load a dictionary whose columns are not all the same length, one alternative is to set orient to index and transpose the resulting dataframe.

df = pd.DataFrame.from_dict({"col1": [1, 2, 3], "col2": [4, 5]}, orient='index').T
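Another possible workaround, not mentioned in the original report, is to wrap each list in a pd.Series before building the frame. The sketch below assumes a recent pandas 2.x install, and the name `padded` is illustrative. Series are aligned on their index when combined into a DataFrame, so the shorter column is filled with NaN.

```python
import pandas as pd

data = {"col1": [1, 2, 3], "col2": [4, 5]}

# Each list becomes a Series with a default 0..n-1 index. The DataFrame
# constructor aligns the Series on the union of those indexes.
padded = pd.DataFrame({name: pd.Series(values) for name, values in data.items()})

print(padded)
print(padded["col2"].isna().tolist())
```

One difference worth checking in practice: the transpose route can upcast every column to a common dtype, whereas building from Series only changes the dtype of columns that actually receive missing values.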

Additional Context

Pandas handles dictionaries whose values have varying lengths differently depending on the orient argument, so it would be great to have parity between the two.

Labels

Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)