@chris-b1 or anyone else, help a brother out? Can you tell me what this test does? Is it just expecting the parser to throw an error? The output from the failing test is at the bottom. It's a pretty weird HTML file.
Now, if I call it with my current in-progress code as dfs = pd.read_html('computer_sales_page.html', header=[0, 1]), I see:
Index([ (u'Unnamed: 0_level_0', u'Unnamed: 0_level_1'),
(u'Unnamed: 1_level_0', u'Unnamed: 1_level_1'),
(u'Three months ended April?30', u'2013'),
u'(u'Three months ended April\xa030', '2013').1',
(u'Three months ended April?30', u'Unnamed: 4_level_1'),
(u'Three months ended April?30', u'2012'),
u'(u'Three months ended April\xa030', '2012').1',
(u'Unnamed: 7_level_0', u'Unnamed: 7_level_1'),
(u'Six months ended April?30', u'2013'),
u'(u'Six months ended April\xa030', '2013').1',
(u'Six months ended April?30', u'Unnamed: 10_level_1'),
(u'Six months ended April?30', u'2012'),
u'(u'Six months ended April\xa030', '2012').1',
(u'Unnamed: 13_level_0', u'Unnamed: 13_level_1')],
dtype='object')
and if I call it without a header argument (dfs = pd.read_html('computer_sales_page.html')), I see:
Index([ (u'Unnamed: 0_level_0', u'Unnamed: 0_level_1', u'Unnamed: 0_level_2'),
(u'Unnamed: 1_level_0', u'Unnamed: 1_level_1', u'Unnamed: 1_level_2'),
(u'Three months ended April?30', u'2013', u'In millions'),
u'(u'Three months ended April\xa030', '2013', 'In millions').1',
(u'Three months ended April?30', u'Unnamed: 4_level_1', u'In millions'),
(u'Three months ended April?30', u'2012', u'In millions'),
u'(u'Three months ended April\xa030', '2012', 'In millions').1',
(u'Unnamed: 7_level_0', u'Unnamed: 7_level_1', u'In millions'),
(u'Six months ended April?30', u'2013', u'In millions'),
u'(u'Six months ended April\xa030', '2013', 'In millions').1',
(u'Six months ended April?30', u'Unnamed: 10_level_1', u'In millions'),
(u'Six months ended April?30', u'2012', u'In millions'),
u'(u'Six months ended April\xa030', '2012', 'In millions').1',
(u'Unnamed: 13_level_0', u'Unnamed: 13_level_1', u'Unnamed: 13_level_2')],
dtype='object')
These seem like OK outputs to me. I'm not sure what the original test is supposed to show. If all it checks is that read_html raises a ParserError for header=[0, 1] (and it no longer raises), I think I'd like to just delete the test.
____________________ TestReadHtml.test_computer_sales_page _____________________

self = <pandas.tests.io.test_html.TestReadHtml object at 0x1120aa390>

    def test_computer_sales_page(self):
        data = os.path.join(DATA_PATH, 'computer_sales_page.html')
        with tm.assert_raises_regex(ParserError,
                                    r"Passed header=\[0,1\] are "
                                    r"too many rows for this "
                                    r"multi_index of columns"):
>           self.read_html(data, header=[0, 1])

pandas/tests/io/test_html.py:778:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pandas.util.testing._AssertRaisesContextmanager object at 0x1120aab50>
exc_type = None, exc_value = None, trace_back = None

    def __exit__(self, exc_type, exc_value, trace_back):
        expected = self.exception

        if not exc_type:
            exp_name = getattr(expected, "__name__", str(expected))
>           raise AssertionError("{0} not raised.".format(exp_name))
E           AssertionError: ParserError not raised.

pandas/util/testing.py:2491: AssertionError
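
For anyone reading the traceback: as far as I can tell, the AssertionError comes from the assert-raises context manager, not from read_html itself. The with-block finished without raising, so `__exit__` was called with `exc_type = None` and complained. Below is a minimal sketch of that behaviour. It is not the pandas implementation: `ParserError` here is a stand-in class, and `AssertRaisesRegex`, `old_parser`, `new_parser`, and `check` are hypothetical names made up for illustration.

```python
import re


class ParserError(ValueError):
    """Stand-in for pandas' ParserError (illustration only)."""


class AssertRaisesRegex:
    """Simplified sketch of an assert-raises context manager (hypothetical)."""

    def __init__(self, exception, pattern):
        self.exception = exception
        self.pattern = re.compile(pattern)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, trace_back):
        if exc_type is None:
            # The body ran to completion: this is the failure seen in the traceback.
            name = getattr(self.exception, "__name__", str(self.exception))
            raise AssertionError("{0} not raised.".format(name))
        if not issubclass(exc_type, self.exception):
            return False  # let unrelated exceptions propagate unchanged
        if not self.pattern.search(str(exc_value)):
            raise AssertionError("message did not match: {0}".format(exc_value))
        return True  # expected exception with a matching message: swallow it


def old_parser(header):
    # Hypothetical old behaviour: reject header=[0, 1] for this file.
    raise ParserError(
        "Passed header=[0,1] are too many rows for this multi_index of columns"
    )


def new_parser(header):
    # Hypothetical new behaviour: parse successfully and return a result.
    return ["parsed table"]


def check(parser):
    """Return None if the expectation holds, else the assertion message."""
    try:
        with AssertRaisesRegex(ParserError, r"Passed header=\[0,1\] are too many rows"):
            parser(header=[0, 1])
    except AssertionError as err:
        return str(err)
    return None
```

With the stand-ins above, `check(old_parser)` passes silently, while `check(new_parser)` reports "ParserError not raised.", which matches the message in the traceback. So the test only asserts that parsing fails. It says nothing about what a successful parse should look like.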