DOC: Comparing .loc/.iloc to tuples and chained indexing #60632

Open
@joansigh

Description

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-view-versus-copy

Documentation problem

import pandas as pd

# Creating a DataFrame with some sample data
data = {
    'Name': ['Jason', 'Emma', 'Alex', 'Sarah'],
    'Age': [28, 24, 32, 27],
    'City': ['New York', 'London', 'Paris', 'Tokyo'],
    'Salary': [75000, 65000, 85000, 70000]
}

df = pd.DataFrame(data)

# Display the DataFrame
print(df)

I want to update Jason’s age, and I do so with:

df['Age'][df['Name'] == 'Jason'] = 29


With code like the line above, Jason's age in df may or may not be updated to 29, because the assignment uses chained indexing.

The documentation recommends .loc/.iloc as the better option, for example:

df.loc[df['Name'] == 'Jason', 'Age'] = 29
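
To make the comparison concrete, here is a minimal runnable sketch of the .loc form on a trimmed copy of the sample data. It assumes a recent pandas release, and the variable name `mask` is my own choice, not something taken from the docs.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jason', 'Emma', 'Alex', 'Sarah'],
    'Age': [28, 24, 32, 27],
})

# Build the boolean row mask once, then select the rows and the column
# in a single .loc call, so the assignment targets df itself rather
# than an intermediate object produced by a first indexing step.
mask = df['Name'] == 'Jason'
df.loc[mask, 'Age'] = 29

print(df)
```

Because the row filter and the column label are passed together, there is only one indexing operation and no intermediate result for the assignment to land on.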

However, the documentation is not clear about best practices for tuple keys, such as the following:

df[('Age', df['Name'] == 'Jason')] = 29
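
For context, the one place I know a tuple key is meaningful inside df[...] is a DataFrame with MultiIndex columns, where the tuple names a single column. The sketch below uses a hypothetical two-level layout (the `person` level and the name `wide` are made up for illustration). It is not a claim about what the flat-column line above does, which I have not verified across pandas versions.

```python
import pandas as pd

# Hypothetical two-level column layout, used only to show what a
# tuple key means to df[...] when the columns are a MultiIndex.
cols = pd.MultiIndex.from_tuples([('person', 'Name'), ('person', 'Age')])
wide = pd.DataFrame([['Jason', 28], ['Emma', 24]], columns=cols)

# Here the tuple is a column label: it selects one column, not rows.
mask = wide[('person', 'Name')] == 'Jason'

# Row-and-column assignment still goes through .loc; the tuple only
# stands in for the column label.
wide.loc[mask, ('person', 'Age')] = 29

print(wide)
```

Even in this case the row filter is passed to .loc, so the tuple does not replace .loc as a way of combining a column with a boolean mask.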

Suggested fix for documentation

Explain how tuple keys compare with .loc/.iloc and with chained indexing in the context of pandas best practices. Considerations could include time complexity, space complexity, and code readability.

Labels

  • Docs
  • Indexing: Related to indexing on series/frames, not to indexes themselves
  • Needs Discussion: Requires discussion from core team before further action