Safari: TextDecoder refuses to process data after 2GB causing RangeError (Bad value) and Out of bounds memory access #4471

Open
@bes

Describe the Bug

After processing 2GB of data, Safari's TextDecoder refuses to produce any more strings. This appears to be a security mitigation that works as intended but has unintended consequences.

We are seeing thousands of crashes per week in Sentry from Safari versions 16 up to the latest 18.4, reported as Out of bounds memory access and RangeError: Bad value.

There is a bug report on the WebKit bugzilla, but it is unlikely to get fixed for older Safari versions, and there seems to be no hurry to fix it (it was reported in 2024-09).

I initially started another bug report on WebKit bugzilla, but am now reasonably convinced that the TextDecoder is the culprit for our issues too.

Steps to Reproduce

  1. Open any Safari version 16 - 18.4
  2. Open the Safari console on any site
  3. Run the provided script from the bug report
const text = new TextDecoder()
const buff = new ArrayBuffer(100)
for (let i = 0; i < 21474837; i++) {
    text.decode(buff)
}
  4. See that RangeError: Bad value is produced
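
For reference, the loop above decodes 21474837 × 100 = 2,147,483,700 bytes, which is 52 bytes past 2^31 (2,147,483,648). A minimal sketch of the same reproduction, wrapped so that it reports how many bytes were decoded before any failure, might look like the following. The helper name probeDecoderLimit and its parameters are hypothetical, and the 2^31 default is only an assumption based on the "2GB" figure in the report.

```javascript
// Hypothetical helper (not part of wasm-bindgen or WebKit): decodes fixed-size
// chunks with a single TextDecoder and records how far it got before an error.
function probeDecoderLimit(chunkSize = 100, totalBytes = 2 ** 31 + chunkSize) {
    const decoder = new TextDecoder();
    const chunk = new ArrayBuffer(chunkSize);
    let decodedBytes = 0;
    try {
        while (decodedBytes < totalBytes) {
            decoder.decode(chunk);
            decodedBytes += chunkSize;
        }
        // No error was thrown for the requested amount of data.
        return { decodedBytes, error: null };
    } catch (err) {
        // decodedBytes counts only the calls that completed successfully.
        return { decodedBytes, error: err };
    }
}

// Example call for the console (takes a while, since it decodes about 2GB):
// console.log(probeDecoderLimit());
```

If the behavior matches the report, an affected Safari version should return a non-null error, while other browsers should return error: null.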

When this happens in a wasm-bindgen generated program (and it does happen for us, all the time), it's game over for that browser/tab instance.

Expected Behavior

I expect Safari to fix this at the source, but I think wasm-bindgen could provide a workaround for Safari specifically, creating a new TextDecoder when decoding is about to pass 2GB of data.

Actual Behavior

After processing 2GB of text, Safari's TextDecoder can no longer decode anything.

Additional Context

This isn't an optimal situation for anyone, but I hope that we can reach consensus on what needs to be done here.

wasm-bindgen generates the following code for me:

const lTextDecoder = typeof TextDecoder === 'undefined' ? (0, module.require)('util').TextDecoder : TextDecoder;

let cachedTextDecoder = new lTextDecoder('utf-8', { ignoreBOM: true, fatal: true });

function getStringFromWasm0(ptr, len) {
    ptr = ptr >>> 0;
    return cachedTextDecoder.decode(getUint8ArrayMemory0().subarray(ptr, ptr + len));
}

Some possible workarounds:

  • For every xx_wasm_bg.js, generate a function that returns a TextDecoder and keeps track of the number of decoded bytes, creating a new TextDecoder as needed: a fresh decoder on Safari, the same one on other browsers
  • Same as above but no browser checking
  • Make any of the above optional (new command line argument)
  • Instead of providing this for everyone, make it easier to patch this code by some mechanism in wasm-bindgen for robustness (instead of monkey patching)
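
As an illustration of the second workaround (byte counting with no browser check), here is a rough sketch. It is not what wasm-bindgen generates today; createRotatingDecoder, MAX_DECODED_BYTES, and the makeDecoder parameter are hypothetical names, and the threshold assumes the limit sits at 2^31 bytes, which the report only describes as "2GB".

```javascript
// Assumed threshold: stay below 2^31 bytes per TextDecoder instance.
const MAX_DECODED_BYTES = 2 ** 31 - 1;

// Hypothetical wrapper: counts decoded bytes and swaps in a fresh TextDecoder
// before the running total would reach the limit.
function createRotatingDecoder(
    limit = MAX_DECODED_BYTES,
    makeDecoder = () => new TextDecoder('utf-8', { ignoreBOM: true, fatal: true })
) {
    let decoder = makeDecoder();
    let decodedBytes = 0;
    let rotations = 0;
    return {
        decode(bytes) {
            // Rotate first, so the old decoder never sees the call that would
            // push its total to the limit.
            if (decodedBytes + bytes.byteLength >= limit) {
                decoder = makeDecoder();
                decodedBytes = 0;
                rotations++;
            }
            decodedBytes += bytes.byteLength;
            return decoder.decode(bytes);
        },
        // Exposed only so the behavior can be observed in tests.
        get rotations() {
            return rotations;
        },
    };
}
```

In the generated code above, getStringFromWasm0 would then call decode on such a wrapper instead of on cachedTextDecoder directly. A single decode call larger than the limit is not handled by this sketch.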
