DOC: update the pd.DataFrame.memory_usage/empty docstring(Seoul) #20102
Changes from 5 commits
3dff081
fc5b498
b033dc6
bb7f341
1585a0e
d4cc71d
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1943,32 +1943,86 @@ def _sizeof_fmt(num, size_qualifier): | |
_put_lines(buf, lines) | ||
|
||
def memory_usage(self, index=True, deep=False): | ||
"""Memory usage of DataFrame columns. | ||
""" | ||
Return the memory usage of each column in bytes. | ||
|
||
The memory usage can optionally include the contribution of | ||
the index and elements of `object` dtype. | ||
|
||
A configuration option, `display.memory_usage` (see Parameters) | ||
|
||
Parameters | ||
---------- | ||
index : bool | ||
Specifies whether to include memory usage of DataFrame's | ||
index in returned Series. If `index=True` (default is False) | ||
the first index of the Series is `Index`. | ||
deep : bool | ||
Introspect the data deeply, interrogate | ||
`object` dtypes for system-level memory consumption | ||
index : bool, default True | ||
Specifies whether to include the memory usage of the DataFrame's | ||
index in returned Series. If ``index=True`` the memory usage of the | ||
index is the first item in the output. | |
deep : bool, default False | ||
If True, introspect the data deeply by interrogating | ||
`object` dtypes for system-level memory consumption, and include | ||
it in the returned values. | ||
|
||
Returns | ||
------- | ||
sizes : Series | ||
A series with column names as index and memory usage of | ||
columns with units of bytes. | ||
|
||
Notes | ||
----- | ||
Memory usage does not include memory consumed by elements that | ||
are not components of the array if deep=False | ||
A Series whose index is the original column names and whose values | |
are the memory usage of each column in bytes. | |
|
||
See Also | ||
-------- | ||
numpy.ndarray.nbytes | ||
numpy.ndarray.nbytes : Total bytes consumed by the elements of an | ||
ndarray. | ||
Series.memory_usage : Bytes consumed by a Series. | ||
pandas.Categorical : Memory-efficient array for string values with | ||
many repeated values. | ||
|
||
Examples | ||
-------- | ||
>>> dtypes = ['int64', 'float64', 'complex128', 'object', 'bool'] | ||
add a categorical type here as well |
||
>>> data = dict([(t, np.ones(shape=5000).astype(t)) | ||
... for t in dtypes]) | ||
>>> df = pd.DataFrame(data) | ||
>>> df.head() | ||
int64 float64 complex128 object bool | ||
0 1 1.0 (1+0j) 1 True | ||
1 1 1.0 (1+0j) 1 True | ||
2 1 1.0 (1+0j) 1 True | ||
3 1 1.0 (1+0j) 1 True | ||
4 1 1.0 (1+0j) 1 True | ||
|
||
>>> df.memory_usage() | ||
Index 80 | ||
int64 40000 | ||
float64 40000 | ||
complex128 80000 | ||
object 40000 | ||
bool 5000 | ||
dtype: int64 | ||
|
||
>>> df.memory_usage(index=False) | ||
Contributor
I'm not certain the two latter examples (with |
||
int64 40000 | ||
float64 40000 | ||
complex128 80000 | ||
object 40000 | ||
bool 5000 | ||
dtype: int64 | ||
|
||
The memory footprint of `object` dtype columns is ignored by default: | ||
|
||
>>> df.memory_usage(deep=True) | ||
Index 80 | ||
int64 40000 | ||
float64 40000 | ||
complex128 80000 | ||
object 160000 | ||
bool 5000 | ||
dtype: int64 | ||
|
||
Use a Categorical for efficient storage of an object-dtype column with | ||
many repeated values. | ||
|
||
>>> df['object'].astype('category').memory_usage(deep=True) | ||
5168 | ||
""" | ||
result = Series([c.memory_usage(index=False, deep=deep) | ||
for col, c in self.iteritems()], index=self.columns) | ||
|
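As a rough illustration of the behaviour the docstring above describes, the sketch below rebuilds the example frame and compares the three call forms. It assumes a reasonably recent pandas and NumPy. The variable names `shallow`, `no_index` and `deep` are made up for this note, and exact byte counts for the index and for `object` columns can differ by version and platform, so none are asserted here.

```python
import numpy as np
import pandas as pd

# Same construction as the docstring example: one column per dtype.
dtypes = ["int64", "float64", "complex128", "object", "bool"]
data = {t: np.ones(shape=5000).astype(t) for t in dtypes}
df = pd.DataFrame(data)

# Default call: index=True, deep=False. The index appears as the first item.
shallow = df.memory_usage()

# Drop the index contribution from the result.
no_index = df.memory_usage(index=False)

# deep=True also counts the Python objects referenced by object columns.
deep = df.memory_usage(deep=True)

print(shallow)
print(no_index)
print(deep)
```

Fixed-width columns report the same size in every call; only the `object` column should grow when `deep=True` is passed.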
There seems to be missing something in this sentence.
Fixed.
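One way the earlier suggestion to include a categorical type in the example could be tried is sketched below. This is only an assumption about how it might look, not what the PR ended up doing: the column name `category` and the use of `astype('category')` on the existing `object` column are chosen here for illustration, since `np.ones(...).astype('category')` is not something NumPy supports.

```python
import numpy as np
import pandas as pd

# Build the frame as in the docstring example.
dtypes = ["int64", "float64", "complex128", "object", "bool"]
data = {t: np.ones(shape=5000).astype(t) for t in dtypes}
df = pd.DataFrame(data)

# Hypothetical extra column: convert the object column to a categorical.
# NumPy has no 'category' dtype, so the conversion goes through pandas.
df["category"] = df["object"].astype("category")

# Compare per-column footprints, including referenced Python objects.
usage = df.memory_usage(index=False, deep=True)
print(usage)
```

With a single repeated value, the categorical column stores small integer codes plus one category, so it should come out far smaller than the `object` column it was derived from.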