-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Clarified error in read_sas method when buffer object provided withou… #14947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -31,6 +31,9 @@ def read_sas(filepath_or_buffer, format=None, index=None, encoding=None, | |||
""" | |||
|
|||
if format is None: | |||
bufferr = "Format unrecognized. If buffer object, specify format") | |||
if type(filesize_or_buffer) != str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use not isinstance(filesize_or_buffer, compat.string_types)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bufferr -> buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bufferr -- buffer error. Didn't want to conflict with anything downstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compat.string_types doesn't appear to be a standard library item? Instead using:
if type(filepath_or_buffer) not in [str, unicode]:
raise TypeError(BuffErr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from pandas import compat
pls do it this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, thanks for the pointer.
@@ -271,6 +271,7 @@ Performance Improvements | |||
- Improved performance of ``pd.wide_to_long()`` (:issue:`14779`) | |||
- Increased performance of ``pd.factorize()`` by releasing the GIL with ``object`` dtype when inferred as strings (:issue:`14859`) | |||
|
|||
- When reading buffer object in ``read_sas()`` method without specified format, filepath string is inferred rather than buffer object. Error is now thrown if buffer object is provided without format {sas7bdat|xport} specification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just the first sentence.
put in bug fix section; use this PR number as the issue number.
please add a test (you can use |
This is my first PR and I don't do much dev--can you expand a bit on the
test you have in mind?
…On Dec 21, 2016 17:33, "Jeff Reback" ***@***.***> wrote:
please add a test (you can use StringIO() as a test buffer)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#14947 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABo9lMcvwwYeuxTKQnAGkCU1srG1qdToks5rKakcgaJpZM4LTa7o>
.
|
The test should basically be a simplified version of your code that discovered the problem. The test should fail without your fix, and pass with it. Maybe something like def test_sas_buffer_format(self):
b = StringIO("")
with self.assertRaises(TypeError):
result = pd.read_sas(b) Haven't tested that, so double check. That can go with the other tests in |
Got it. Much obliged.
…On Dec 21, 2016 17:38, "Tom Augspurger" ***@***.***> wrote:
The test should basically be a simplified version of your code that
discovered the problem. The test should fail without your fix, and pass
with it.
Maybe something like
def test_sas_buffer_format(self):
b = StringIO("")
with self.assertRaises(TypeError):
result = pd.read_sas(b)
Haven't tested that, so double check. That can go with the other tests in
pandas/io/tests/sas/test_sas.py
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#14947 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABo9lNMXg-se9IxqfkPH07Hk7oCN8DjAks5rKapNgaJpZM4LTa7o>
.
|
can you update |
Updated |
@@ -271,6 +271,7 @@ Performance Improvements | |||
- Improved performance of ``pd.wide_to_long()`` (:issue:`14779`) | |||
- Increased performance of ``pd.factorize()`` by releasing the GIL with ``object`` dtype when inferred as strings (:issue:`14859`) | |||
|
|||
- When reading buffer object in ``read_sas()`` method without specified format, filepath string is inferred rather than buffer object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug when reading a buffer object in pd.read_sas()
, without a specified format, a filepath string was inferred rather than buffer object.
move to Bug Fix section, add the issue number (this PR) at the end
if format is None: | ||
buffErr = "Format unrecognized. If buffer object, specify format" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use buffer_error_msg.
If this is a buffer object, rather than a string name, you must specify a format string.
@@ -29,8 +29,11 @@ def read_sas(filepath_or_buffer, format=None, index=None, encoding=None, | |||
DataFrame if iterator=False and chunksize=None, else SAS7BDATReader | |||
or XportReader | |||
""" | |||
|
|||
from pandas import compat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import should be at the top of the file (if might be there already)
|
||
class TestSasBuff(tm.TestCase): | ||
def test_sas_buffer_format(self): | ||
import StringIO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imports go at the top
def test_sas_buffer_format(self): | ||
import StringIO | ||
from pandas.io.sas.sasreader import read_sas | ||
b = StringIO.StringIO("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from pandas import read_sas
def test_sas_buffer_format(self): | ||
import StringIO | ||
from pandas.io.sas.sasreader import read_sas | ||
b = StringIO.StringIO("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from pandas.compat import StringIO
Current coverage is 84.75% (diff: 25.00%)@@ master #14947 diff @@
==========================================
Files 144 145 +1
Lines 51021 51150 +129
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43188 43351 +163
+ Misses 7833 7799 -34
Partials 0 0
|
if format is None: | ||
buffer_error_msg = "If this is a buffer object rather\ | ||
than a string name, you must specify a format string" | ||
if not isinstance(filepath_or_buffer,compat.string_types): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have some PEP8 errors (we check for them in the travis build, that is the reason travis failed). For example, here there should be a space after the comma.
Here is the full list (you can find it at the bottom of the failing test (third travis build), but I also recommend setting up your IDE to check for this):
pandas/io/sas/sasreader.py:6:1: E302 expected 2 blank lines, found 1
pandas/io/sas/sasreader.py:35:45: E231 missing whitespace after ','
pandas/io/tests/sas/test_sas.py:5:1: E302 expected 2 blank lines, found 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I got this covered. I'll reread the Pandas submission info for how to ensure PEP8 compliance tonight.
if format is None: | ||
buffer_error_msg = "If this is a buffer object rather\ | ||
than a string name, you must specify a format string" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use
("..."
"...")
instead of \
for line continuation?
from pandas.compat import StringIO | ||
from pandas import read_sas | ||
|
||
class TestSasBuff(tm.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can call this just TestSas
(and there needs to be a blank line after this one)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@tomrod travis still reports some PEP8 errors:
(I would recommend setting your IDE to check for this as well) |
@jorisvandenbossche now passes $ git diff origin | flake8 --diff. Please let me know if you see any more issues. |
thanks! |
…ithout a format Author: tomrod <[email protected]> Closes pandas-dev#14947 from tomrod/sas_read_format_bugfix and squashes the following commits: 1285dbb [tomrod] flake8 testing 4cf9231 [tomrod] PEP8 compliance ab76d80 [tomrod] Updating to match pep8 whitespace requirements in sasreader.py ffdce1d [tomrod] Updating to match pep8 as per @jorisvandenbossche aa1ada3 [tomrod] More specific error message, moved imports to top of files bf60d23 [tomrod] Adding tests, creating updated information 5efdb85 [tomrod] Adding tests f8166fc [tomrod] Updated based on feedback from jreback b52f204 [tomrod] Clarified error in read_sas method when buffer object provided without format
…t format
git diff upstream/master | flake8 --diff
Added three lines to sasreader.py immediately following line 33 (if format==None:) to handle the case when a buffer object is provided without a format='sas7bdat' or format='xport' situation. Method otherwise works splendidly when a filepath is provided, but a buffer object fails. This is an issue when using sasreader directly on SFTP file objects. I am unaware of any bug request (and am happy to open one), but I came across this issue when using the library.