Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
read_xml() fails with the error message XMLSyntaxError: xmlSAX2Characters: huge text node.
A similar problem can be overcome when parsing the tree manually with lxml, by passing huge_tree=True to the parser:
from lxml import etree

with open(filename) as f:
    # huge_tree=True lifts the parser's default size limits on text nodes
    tree = etree.parse(f, etree.XMLParser(huge_tree=True))
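For illustration, here is a minimal sketch of how a tree parsed this way could be flattened into a DataFrame without going through read_xml(). The helper name read_huge_xml and the row/column layout (each row element's direct children become columns) are assumptions made for the example, not part of pandas, and the small in-memory document stands in for the real file.

```python
import io

import pandas as pd
from lxml import etree


def read_huge_xml(source, row_tag):
    """Hypothetical helper: parse XML with huge_tree=True and flatten rows.

    Each <row_tag> element becomes one row; its direct children become columns.
    """
    # huge_tree=True lifts the parser's default size limits (e.g. on text nodes).
    parser = etree.XMLParser(huge_tree=True)
    tree = etree.parse(source, parser)
    rows = [
        {child.tag: child.text for child in row}
        for row in tree.getroot().iter(row_tag)
    ]
    return pd.DataFrame(rows)


# Small in-memory document standing in for the real (huge) file.
xml = b"<data><row><a>1</a><b>x</b></row><row><a>2</a><b>y</b></row></data>"
df = read_huge_xml(io.BytesIO(xml), "row")
print(df)
```

All values come back as strings here; read_xml() would normally infer dtypes, so a real workaround would likely need an explicit conversion step.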
Feature Description
I am not sure what the best way to supply options to the parser would be.
Alternative Solutions
Right now, I have to read the file using the 'etree' parser instead, like so:
df = pd.read_xml(
    filename,
    parser='etree',
)
Additional Context
Similarly, the option recover=True could be passed to the parser.
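To show what recover=True does at the lxml level, here is a small sketch using a deliberately truncated document. The sample data is made up for the example, and how much of a damaged file can be recovered depends on the input.

```python
from lxml import etree

# Truncated document: the closing </row> and </data> tags are missing.
broken = b"<data><row><a>1</a></row><row><a>2</a>"

# The default (strict) parser rejects the input.
try:
    etree.fromstring(broken)
    strict_failed = False
except etree.XMLSyntaxError:
    strict_failed = True

# recover=True asks the parser to keep going and return the tree it could build.
root = etree.fromstring(broken, etree.XMLParser(recover=True))
print(strict_failed, root.tag)
```

Because recovery can silently drop or restructure content, it would probably make sense for such an option to stay opt-in if read_xml() ever exposes it.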