PERF: Optimize read_excel nrows #46894
Merged
15 commits
6fd0422 Check for nrows in read_excel (ahawryluk)
e4d52d8 Add nrows tests for multiindex and skiprows (ahawryluk)
eb83a74 code linting and fix a bug in a test (ahawryluk)
2c643c7 What's new entry (ahawryluk)
dc14ac2 Add asv test with nrows=10 (ahawryluk)
2e79141 Parametrize new tests (ahawryluk)
943f866 Docstrings and type hints (ahawryluk)
10a8a3e Add PR # (ahawryluk)
30a280c Use is_integer and validate_integer (ahawryluk)
bbc1f1d Consolidate header arg validation (ahawryluk)
2018d26 Attempting to placate mypy with assert statements (ahawryluk)
c108a97 make type checks pass with typing.cast (ahawryluk)
3aad835 Do not allow sets for header arg (ahawryluk)
7c913d6 Merge branch 'main' into opt_excel_nrows (ahawryluk)
d6e1df3 Two more allow_set=False (ahawryluk)
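The reader-side change is not part of the diff excerpt shown on this page, so the following is only a rough sketch of the general idea the PR title points at: when `nrows` is given, stop pulling rows from the sheet once enough have been read. The helper names `rows_needed` and `read_rows` are hypothetical, and the sketch assumes an integer (or `None`) `header` and an integer `skiprows`.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional


def rows_needed(
    header: Optional[int], skiprows: int, nrows: Optional[int]
) -> Optional[int]:
    """Hypothetical helper: number of raw sheet rows that must be read.

    Returns None when nrows is None, meaning "read the whole sheet".
    """
    if nrows is None:
        return None
    # header=k means rows 0..k (after skiprows) are consumed before the data.
    header_rows = 0 if header is None else header + 1
    return skiprows + header_rows + nrows


def read_rows(
    sheet_rows: Iterable[List[object]], limit: Optional[int]
) -> Iterator[List[object]]:
    """Yield at most `limit` rows without consuming the rest of the iterable."""
    return islice(sheet_rows, limit)
```

With a lazily evaluated row source, `read_rows(rows, rows_needed(0, 2, 10))` would touch only the first 13 rows, however long the sheet is.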
@@ -120,13 +120,7 @@ def __init__(self, kwds) -> None:

         # validate header options for mi
         self.header = kwds.get("header")
-        if isinstance(self.header, (list, tuple, np.ndarray)):
-            if not all(map(is_integer, self.header)):
-                raise ValueError("header must be integer or list of integers")
-            if any(i < 0 for i in self.header):
-                raise ValueError(
-                    "cannot specify multi-index header with negative integers"
-                )

Comment on lines -123 to -129

Review comment: I might be missing it, is validate_header_arg called somewhere instead?

Reply: Yes, but it took me a while to find it. It's called earlier in TextFileReader:

+        if is_list_like(self.header, allow_sets=False):
             if kwds.get("usecols"):
                 raise ValueError(
                     "cannot specify usecols when specifying a multi-index header"
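The checks removed in this hunk, together with the reply in the review thread, suggest that header validation is meant to happen once, in a shared place, before the parser is constructed. As a minimal standard-library sketch of what such a consolidated check could look like (the names `check_header` and `_is_int` are hypothetical, and this is not pandas' actual `validate_header_arg`), reusing the error messages that appear in the diff:

```python
from numbers import Integral


def _is_int(value: object) -> bool:
    # bool is a subclass of int in Python; do not count it as an integer here.
    return isinstance(value, Integral) and not isinstance(value, bool)


def check_header(header: object) -> None:
    """Hypothetical single entry point for validating a `header` argument."""
    if header is None:
        return
    # Only ordered containers count as list-like; a set has no row order.
    if isinstance(header, (list, tuple)):
        if not all(_is_int(i) for i in header):
            raise ValueError("header must be integer or list of integers")
        if any(i < 0 for i in header):
            raise ValueError(
                "cannot specify multi-index header with negative integers"
            )
        return
    if not _is_int(header):
        raise ValueError("header must be integer or list of integers")
    if header < 0:
        raise ValueError(
            "Passing negative integer to header is invalid. "
            "For no header, use header=None instead"
        )
```

In this sketch a set such as `{0, 1}` falls through to the scalar branch and is rejected, which mirrors the intent of the "Do not allow sets for header arg" commit.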
@@ -138,31 +132,20 @@ def __init__(self, kwds) -> None:

             # validate index_col that only contains integers
             if self.index_col is not None:
-                is_sequence = isinstance(self.index_col, (list, tuple, np.ndarray))
                 if not (
-                    is_sequence
+                    is_list_like(self.index_col, allow_sets=False)

jreback marked this conversation as resolved.

                     and all(map(is_integer, self.index_col))
                     or is_integer(self.index_col)
                 ):
                     raise ValueError(
                         "index_col must only contain row numbers "
                         "when specifying a multi-index header"
                     )
-        elif self.header is not None:
+        elif self.header is not None and self.prefix is not None:
             # GH 27394
-            if self.prefix is not None:
-                raise ValueError(
-                    "Argument prefix must be None if argument header is not None"
-                )
-            # GH 16338
-            elif not is_integer(self.header):
-                raise ValueError("header must be integer or list of integers")
-            # GH 27779
-            elif self.header < 0:
-                raise ValueError(
-                    "Passing negative integer to header is invalid. "
-                    "For no header, use header=None instead"
-                )
+            raise ValueError(
+                "Argument prefix must be None if argument header is not None"
+            )

         self._name_processed = False

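The `index_col` condition in this hunk relies on Python's operator precedence: `and` binds tighter than `or`, so `A and B or C` is evaluated as `(A and B) or C`. A small standard-library sketch of the same shape may make that easier to see. The names `index_col_ok` and `_is_int` are hypothetical stand-ins, and `isinstance(..., (list, tuple))` is used here in place of the `is_list_like(..., allow_sets=False)` call from the diff.

```python
from numbers import Integral


def _is_int(value: object) -> bool:
    # bool is a subclass of int in Python; do not count it as an integer here.
    return isinstance(value, Integral) and not isinstance(value, bool)


def index_col_ok(index_col: object) -> bool:
    """Hypothetical mirror of the condition inside `if not (...)`."""
    return (
        # (A and B): an ordered container whose items are all integers ...
        isinstance(index_col, (list, tuple))
        and all(_is_int(i) for i in index_col)
        # ... or C: a single integer.
        or _is_int(index_col)
    )
```

Because `and` short-circuits, `all(...)` is never evaluated for a scalar such as `0`, so the non-iterable case cannot raise a `TypeError` before the `or` branch is reached.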