Clarified error in read_sas method when buffer object provided withou… #14947

tomrod · 2016-12-21T22:01:26Z

…t format

closes Clarified error in read_sas method when buffer object provided withou… #14947
tests added / passed
passes git diff upstream/master | flake8 --diff
whatsnew entry

Added three lines to sasreader.py immediately following line 33 (if format is None:) to handle the case where a buffer object is passed without format='sas7bdat' or format='xport'. read_sas otherwise works well when given a filepath, but fails when given a buffer object and no format. This is a problem when using sasreader directly on SFTP file objects. I am not aware of an existing bug report (and am happy to open one); I came across this while using the library.

jreback · 2016-12-21T22:31:52Z

pandas/io/sas/sasreader.py

@@ -31,6 +31,9 @@ def read_sas(filepath_or_buffer, format=None, index=None, encoding=None,
    """

    if format is None:
+        bufferr = "Format unrecognized. If buffer object, specify format")
+        if type(filesize_or_buffer) != str:


use not isinstance(filesize_or_buffer, compat.string_types)

bufferr -> buffer

bufferr -- buffer error. Didn't want to conflict with anything downstream.

compat.string_types doesn't appear to be a standard library item? Instead using:

if type(filepath_or_buffer) not in [str, unicode]: raise TypeError(BuffErr)

from pandas import compat
pls do it this way

Will do, thanks for the pointer.
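
For context, the check the reviewers are converging on can be sketched as below. This is a minimal stand-alone illustration, not the pandas source: guess_format is a hypothetical stand-in for the format-inference branch of read_sas, STRING_TYPES is assumed to play the role of compat.string_types on Python 3, the extension-based inference is an assumption about the surrounding code, and TypeError simply follows the snippet quoted in this thread.

```python
import io

# Assumption: plays the role of compat.string_types on Python 3.
STRING_TYPES = (str,)


class PathLike(str):
    """A str subclass, used to show why isinstance() beats type() checks."""


def guess_format(filepath_or_buffer, format=None):
    """Hypothetical stand-in for the format-inference branch of read_sas."""
    if format is None:
        buffer_error_msg = ("If this is a buffer object rather "
                            "than a string name, you must specify "
                            "a format string")
        # A buffer has no file name to inspect, so fail early and clearly.
        if not isinstance(filepath_or_buffer, STRING_TYPES):
            raise TypeError(buffer_error_msg)
        name = filepath_or_buffer.lower()
        if name.endswith(".xpt"):
            format = "xport"
        elif name.endswith(".sas7bdat"):
            format = "sas7bdat"
        else:
            raise ValueError("unable to infer format of SAS file")
    return format
```

With this guard, guess_format(io.BytesIO(b"")) raises TypeError, while guess_format(io.BytesIO(b""), format="xport") passes through. A check written as type(x) != str would wrongly reject PathLike("DATA.XPT"), which is one reason isinstance is the usual choice.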

jreback · 2016-12-21T22:32:31Z

doc/source/whatsnew/v0.20.0.txt

@@ -271,6 +271,7 @@ Performance Improvements
 - Improved performance of ``pd.wide_to_long()`` (:issue:`14779`)
 - Increased performance of ``pd.factorize()`` by releasing the GIL with ``object`` dtype when inferred as strings (:issue:`14859`)

+- When reading buffer object in ``read_sas()`` method without specified format, filepath string is inferred rather than buffer object. Error is now thrown if buffer object is provided without format {sas7bdat|xport} specification.


just the first sentence.

put in bug fix section; use this PR number as the issue number.

jreback · 2016-12-21T22:32:51Z

please add a test (you can use StringIO() as a test buffer)

tomrod · 2016-12-21T22:33:57Z

This is my first PR and I don't do much dev--can you expand a bit on the test you have in mind?

TomAugspurger · 2016-12-21T22:37:55Z

The test should basically be a simplified version of your code that discovered the problem. The test should fail without your fix, and pass with it.

Maybe something like

def test_sas_buffer_format(self):
    b = StringIO("")
    with self.assertRaises(TypeError):
        result = pd.read_sas(b)

Haven't tested that, so double check. That can go with the other tests in pandas/io/tests/sas/test_sas.py
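
A self-contained variant of that test, runnable without pandas or a SAS file, might look like the following. fake_read_sas is a hypothetical stand-in that only reproduces the guard under discussion; in the real test the call would be pd.read_sas(b), and the expected exception type should match whatever the fix actually raises.

```python
import io
import unittest


def fake_read_sas(filepath_or_buffer, format=None):
    """Hypothetical stand-in: mimics only the guard, reads nothing."""
    if format is None and not isinstance(filepath_or_buffer, str):
        raise TypeError("If this is a buffer object rather than a string "
                        "name, you must specify a format string")
    return format


class TestSasBufferFormat(unittest.TestCase):
    def test_buffer_without_format_raises(self):
        # Fails without the guard, passes with it.
        b = io.StringIO("")
        with self.assertRaises(TypeError):
            fake_read_sas(b)

    def test_buffer_with_format_is_accepted(self):
        b = io.StringIO("")
        self.assertEqual(fake_read_sas(b, format="xport"), "xport")


def run_tests():
    """Run the test case programmatically and return the result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestSasBufferFormat)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The second test is optional, but it guards against a fix that rejects buffers even when a format is supplied.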

tomrod · 2016-12-21T22:38:48Z

Got it. Much obliged.

jreback · 2016-12-30T21:44:35Z

can you update

tomrod · 2016-12-31T04:08:27Z

Updated

jreback · 2016-12-31T15:58:17Z

doc/source/whatsnew/v0.20.0.txt

@@ -271,6 +271,7 @@ Performance Improvements
 - Improved performance of ``pd.wide_to_long()`` (:issue:`14779`)
 - Increased performance of ``pd.factorize()`` by releasing the GIL with ``object`` dtype when inferred as strings (:issue:`14859`)

+- When reading buffer object in ``read_sas()`` method without specified format, filepath string is inferred rather than buffer object.


Bug in pd.read_sas() where, when reading a buffer object without a specified format, a filepath string was inferred rather than a buffer object.

move to Bug Fix section, add the issue number (this PR) at the end

jreback · 2016-12-31T15:59:35Z

pandas/io/sas/sasreader.py

    if format is None:
+        buffErr = "Format unrecognized. If buffer object, specify format"


use buffer_error_msg.

If this is a buffer object, rather than a string name, you must specify a format string.

jreback · 2016-12-31T15:59:54Z

pandas/io/sas/sasreader.py

@@ -29,8 +29,11 @@ def read_sas(filepath_or_buffer, format=None, index=None, encoding=None,
    DataFrame if iterator=False and chunksize=None, else SAS7BDATReader
    or XportReader
    """
-
+    from pandas import compat


import should be at the top of the file (it might be there already)

jreback · 2016-12-31T16:01:21Z

pandas/io/tests/sas/test_sas.py

+
+class TestSasBuff(tm.TestCase):
+    def test_sas_buffer_format(self):
+        import StringIO


imports go at the top

jreback · 2016-12-31T16:01:44Z

pandas/io/tests/sas/test_sas.py

+    def test_sas_buffer_format(self):
+        import StringIO
+        from pandas.io.sas.sasreader import read_sas
+        b = StringIO.StringIO("")


from pandas import read_sas

jreback · 2016-12-31T16:02:00Z

pandas/io/tests/sas/test_sas.py

+    def test_sas_buffer_format(self):
+        import StringIO
+        from pandas.io.sas.sasreader import read_sas
+        b = StringIO.StringIO("")


from pandas.compat import StringIO

codecov-io · 2016-12-31T21:33:52Z

Current coverage is 84.75% (diff: 25.00%)

Merging #14947 into master will increase coverage by 0.10%

@@             master     #14947   diff @@
==========================================
  Files           144        145     +1   
  Lines         51021      51150   +129   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43188      43351   +163   
+ Misses         7833       7799    -34   
  Partials          0          0

Powered by Codecov. Last update f79bc7a...1285dbb

jorisvandenbossche · 2017-01-02T09:20:54Z

pandas/io/sas/sasreader.py

    if format is None:
+        buffer_error_msg = "If this is a buffer object rather\
+                than a string name, you must specify a format string"
+        if not isinstance(filepath_or_buffer,compat.string_types):


You have some PEP8 errors (we check for them in the travis build, that is the reason travis failed). For example, here there should be a space after the comma.

Here is the full list (you can find it at the bottom of the failing test (third travis build), but I also recommend setting up your IDE to check for this):

pandas/io/sas/sasreader.py:6:1: E302 expected 2 blank lines, found 1
pandas/io/sas/sasreader.py:35:45: E231 missing whitespace after ','
pandas/io/tests/sas/test_sas.py:5:1: E302 expected 2 blank lines, found 1

I think I got this covered. I'll reread the Pandas submission info for how to ensure PEP8 compliance tonight.

jorisvandenbossche · 2017-01-02T09:22:32Z

pandas/io/sas/sasreader.py

    if format is None:
+        buffer_error_msg = "If this is a buffer object rather\
+                than a string name, you must specify a format string"


Can you use

("..." "...")

instead of \ for line continuation?
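
To illustrate why the parenthesized form is preferable here: a backslash at the end of a line inside a string literal drops the newline but keeps the next line's indentation in the message. The sketch below uses the message text from this diff; the variable names are only for illustration.

```python
# Backslash continuation inside a string literal: the newline is dropped,
# but the next line's leading indentation becomes part of the message.
with_backslash = "If this is a buffer object rather\
        than a string name, you must specify a format string"

# Adjacent literals inside parentheses are concatenated at compile time,
# so the source indentation stays out of the string.
with_parens = ("If this is a buffer object rather "
               "than a string name, you must specify a format string")
```

The first message ends up with a run of spaces between "rather" and "than"; the second reads as a single clean sentence.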

jorisvandenbossche · 2017-01-02T09:23:29Z

pandas/io/tests/sas/test_sas.py

+from pandas.compat import StringIO
+from pandas import read_sas
+
+class TestSasBuff(tm.TestCase):


You can call this just TestSas (and there needs to be a blank line after this one)

jorisvandenbossche · 2017-01-04T13:44:29Z

@tomrod travis still reports some PEP8 errors:

pandas/io/sas/sasreader.py:6:1: E302 expected 2 blank lines, found 1
pandas/io/tests/sas/test_sas.py:5:1: E302 expected 2 blank lines, found 1

(I would recommend setting your IDE to check for this as well)

tomrod · 2017-01-04T14:18:17Z

@jorisvandenbossche now passes $ git diff origin | flake8 --diff. Please let me know if you see any more issues.

jreback · 2017-01-09T18:33:10Z

thanks!

@jorisvandenbossche

…ithout a format

Author: tomrod <[email protected]>

Closes pandas-dev#14947 from tomrod/sas_read_format_bugfix and squashes the following commits:

1285dbb [tomrod] flake8 testing
4cf9231 [tomrod] PEP8 compliance
ab76d80 [tomrod] Updating to match pep8 whitespace requirements in sasreader.py
ffdce1d [tomrod] Updating to match pep8 as per @jorisvandenbossche
aa1ada3 [tomrod] More specific error message, moved imports to top of files
bf60d23 [tomrod] Adding tests, creating updated information
5efdb85 [tomrod] Adding tests
f8166fc [tomrod] Updated based on feedback from jreback
b52f204 [tomrod] Clarified error in read_sas method when buffer object provided without format

Clarified error in read_sas method when buffer object provided withou…

b52f204

…t format

jreback reviewed Dec 21, 2016

jreback added Bug IO SAS SAS: read_sas labels Dec 21, 2016

tomrod added 3 commits December 30, 2016 19:55

Updated based on feedback from jreback

f8166fc

Adding tests

5efdb85

Adding tests, creating updated information

bf60d23

jreback reviewed Dec 31, 2016

jreback requested changes Dec 31, 2016

More specific error message, moved imports to top of files

aa1ada3

jorisvandenbossche reviewed Jan 2, 2017

tomrod added 3 commits January 3, 2017 17:54

Updating to match pep8 as per @jorisvandenbossche

ffdce1d

Updating to match pep8 whitespace requirements in sasreader.py

ab76d80

PEP8 compliance

4cf9231

flake8 testing

1285dbb

jreback added this to the 0.20.0 milestone Jan 9, 2017

jreback closed this in e7eefc4 Jan 9, 2017
