Skip to content

Fix edge-case in DOM parsing decoding #16226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

nielsdos
Copy link
Member

@nielsdos nielsdos commented Oct 4, 2024

There are three connected subtle issues:

  1. The fast path didn't correctly handle the case where the decoder
    requests more data. This caused a bogus additional replacement
    sequence to be outputted when encountering an incomplete sequence at
    the edges of a buffer.
  2. The finishing of decoding incorrectly assumed that the fast path
    cannot be in a state where the last few bytes were an incomplete
    sequence, but this is not true as shown by test 08.
  3. The finishing of decoding could output bytes twice because it called
    into dom_process_parse_chunk() twice without clearing the decoded
    data. However, calling twice is not even necessary as the entire
    buffer cannot be filled up entirely.

@nielsdos nielsdos force-pushed the dom-encoding-edge-case branch 2 times, most recently from b9c4814 to e9fea87 Compare October 4, 2024 22:00
There are three connected subtle issues:
1) The fast path didn't correctly handle the case where the decoder
   requests more data. This caused a bogus additional replacement
   sequence to be outputted when encountering an incomplete sequence at
   the edges of a buffer.
2) The finishing of decoding incorrectly assumed that the fast path
   cannot be in a state where the last few bytes were an incomplete
   sequence, but this is not true as shown by test 08.
3) The finishing of decoding could output bytes twice because it called
   into dom_process_parse_chunk() twice without clearing the decoded
   data. However, calling twice is not even necessary as the entire
   buffer cannot be filled up entirely.
Copy link
Member

@Girgias Girgias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the test output this seems correct

@nielsdos nielsdos closed this in 1e949d1 Oct 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants