Sequence ']]>' is not escaped in PCDATA #130

Open
@FraGag

The sequence ]]> must not occur in PCDATA, but xml-rs fails to escape the > in this sequence when the text is emitted through EventWriter as an XmlEvent::Characters event. The xml-rs reader correctly reports an error when it parses the resulting document.

The fault probably lies in xml::escape::escape_str_pcdata, which handles escaping special characters for a PCDATA context. There are two options for resolving this: either systematically escape > as &gt;, or check for the sequence ]]> specifically and escape > only when it is part of that sequence.
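The second option could look roughly like the sketch below. It is not the actual xml-rs code: escape_pcdata_sketch is a hypothetical standalone helper, and it assumes that only <, & and the > of a ]]> sequence need escaping in PCDATA.

```rust
use std::borrow::Cow;

/// Hypothetical helper, not part of xml-rs: escapes text for a PCDATA context.
/// `<` and `&` are always escaped; `>` is escaped only when it closes `]]>`.
fn escape_pcdata_sketch(input: &str) -> Cow<'_, str> {
    // Fast path: borrow the input when nothing needs escaping.
    let needs_escape =
        input.contains(|c: char| c == '<' || c == '&') || input.contains("]]>");
    if !needs_escape {
        return Cow::Borrowed(input);
    }

    let mut out = String::with_capacity(input.len() + 8);
    for c in input.chars() {
        match c {
            '<' => out.push_str("&lt;"),
            '&' => out.push_str("&amp;"),
            // `]` is never rewritten, so if the output so far ends with "]]",
            // the two preceding input characters were "]]" as well.
            '>' if out.ends_with("]]") => out.push_str("&gt;"),
            _ => out.push(c),
        }
    }
    Cow::Owned(out)
}
```

With this sketch, the text from the sample program, "invalid ]]> invalid", becomes "invalid ]]&gt; invalid", while a lone > such as in "a > b" is left untouched. The first option (always escaping >) would be simpler, at the cost of slightly more verbose output.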

Sample program to reproduce the problem:

extern crate xml;

use xml::reader::EventReader;
use xml::writer::EventWriter;
use xml::writer::events::XmlEvent;

fn main() {
    let mut v = Vec::new();
    {
        // Write <root>invalid ]]> invalid</root> into the in-memory buffer.
        let mut ew = EventWriter::new(&mut v);
        ew.write(XmlEvent::start_element("root")).unwrap();
        ew.write(XmlEvent::characters("invalid ]]> invalid")).unwrap();
        ew.write(XmlEvent::end_element()).unwrap();
    }

    // Parse the document that was just written and print every event.
    let er = EventReader::new(&v[..]);
    for ev in er {
        println!("{:?}", ev);
    }
}

Output:

Ok(StartDocument(1.0, utf-8, None))
Ok(StartElement(root, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"}))
Err(Error { pos: 1:53, kind: Syntax("Unexpected token: ]]>") })
