Skip to content

Add fix for overflow error in MXParser buffer sizing #72

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

mealingr
Copy link
Contributor

@mealingr mealingr commented Nov 25, 2019

This is to include this fix (original URL is down so referencing using a wayback machine link, also copied below) which helps prevent bufSoftLimit being updated to an integer overflowed value.

https://web.archive.org/web/20070831191548/http://www.extreme.indiana.edu/bugzilla/show_bug.cgi?id=228

When reading a very large text element (~34MB), MXParser gets itself into a bad
state and throws an IOException.  Here is the stack trace:

java.io.IOException: error reading input, returned 0
        at org.xmlpull.mxp1.MXParser.fillBuf(MXParser.java:3019)
        at org.xmlpull.mxp1.MXParser.more(MXParser.java:3025)
        at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1384)
        at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)

I traced the calls to Reader.read(char[], int, int), and here's what I saw:

...
read(char[33554432], 33529338, 8192): 8192
read(char[33554432], 33537530, 8192): 8192
read(char[33554432], 33545722, 8192): 8192
read(char[33554432], 33553914, 518): 518
read(char[33554432], 33554432, 0): 0
java.io.IOException: error reading input, returned 0

I think the problem is that the calculation of bufSoftLimit on line 2952 is
overflowing and becoming negative:

bufSoftLimit = ( bufLoadFactor * buf.length ) /100;

A solution would be to do the multiplication using 64-bit longs:

bufSoftLimit = (int) (( ((long) bufLoadFactor) * buf.length ) /100);

Or test for large values and change the order of operations to avoid overflow:

if (buf.length > 10000000) {
  bufSoftLimit = (buf.length / 100) * bufLoadFactor;
} else {
  bufSoftLimit = (bufLoadFactor * buf.length) / 100;
}

I am using version 1.1.3.4.M.
------- Comment #1 From Aleksander Slominski 2005-08-11 17:48:46 -------
interesting - i have nver seens this before! you musthave very large text
section in XML.

fixed in CVS and just cut a new release wit this fix (1.1.3.4.O) available in 
http://www.extreme.indiana.edu/dist/java-repository/xpp3/ 

thanks!

@@ -3656,7 +3656,8 @@ else if ( expand )
buf = newBuf;
if ( bufLoadFactor > 0 )
{
bufSoftLimit = ( bufLoadFactor * buf.length ) / 100;
// Include fix for https://web.archive.org/web/20070831191548/http://www.extreme.indiana.edu/bugzilla/show_bug.cgi?id=228
bufSoftLimit = (int) ((((long) bufLoadFactor) * buf.length) / 100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if bufLoadFactor remains constant, then you can simplify: bufSoftLimit = 2*bufSoftLimit;

Copy link
Contributor Author

@mealingr mealingr Nov 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if bufLoadFactor remains constant, then you can simplify: bufSoftLimit = 2*bufSoftLimit;

I see what you mean because the only thing changing at the moment is the buf.len which is being doubled, so this calculation should essentially result in 2 times the current bufSoftLimit, but as we're not guaranteed that bufSoftLimit will remain constant (it's not declared as such) then erring on the side of accuracy over simplification seems to make sense here.

Copy link
Contributor

@belingueres belingueres Nov 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be modified but it is very unlikely: the field is protected, with no setter and no constructor setting the value, which means only is modifiable by subclassing MXParser. This design is very leaky. Even it represents a percentage of the total storage so (bufLoadFactor/100) could be pre-computed. We could define a new field protected float bufferLoadFactor = 0.95f; and deprecate the 'bufLoadFactor' field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be modified but it is very unlikely: the field is protected, with no setter and no constructor setting the value, which means only is modifiable by subclassing MXParser. This design is very leaky. Even it represents a percentage of the total storage so (bufLoadFactor/100) could be pre-computed. We could define a new field protected float bufferLoadFactor = 0.95f; and deprecate the 'bufLoadFactor' field.

Agree with your suggestion it helps simplify it 6c55f73

@@ -391,6 +391,23 @@ public void testSubsequentProcessingInstructionMoreThan8k()
assertEquals( XmlPullParser.END_TAG, parser.nextToken() );
}

@Test
public void testFillBuf_NoOverflow()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you write the test in terms of the public API? Accessing directly the fields and calling directy fillBuf() is not the normal use the parser. I'm thinking generating a big enough XML file, parsing it, showing that fails without the patch and pass the test with the patch in place.

Copy link
Contributor Author

@mealingr mealingr Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you write the test in terms of the public API? Accessing directly the fields and calling directy fillBuf() is not the normal use the parser. I'm thinking generating a big enough XML file, parsing it, showing that fails without the patch and pass the test with the patch in place.

Sure a3f9276 (without the patch that test fails with java.io.IOException: error reading input, returned 0)

@hboutemy hboutemy closed this in ecdbeeb Dec 3, 2019
@hboutemy
Copy link
Member

hboutemy commented Dec 3, 2019

merged, thank you

@hboutemy hboutemy added this to the 3.3.1 milestone Dec 3, 2019
@hboutemy hboutemy added the bug label Dec 3, 2019
@hboutemy hboutemy self-assigned this Dec 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants