Buffer performance improvements #791

Closed
@Tyriar

Problem

Memory

Right now our buffer takes up too much memory, particularly for an application that launches multiple terminals with large scrollbacks. For example, the demo using a 160x24 terminal with 5000 lines of scrollback filled takes around 34 MB of memory (see microsoft/vscode#29840 (comment)). That is just a single terminal, and 1080p monitors would likely use wider terminals. Also, in order to support truecolor (#484), each character will need to store 2 additional numbers, which will almost double the current memory consumption of the buffer.
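To put the figure above in per-cell terms, here is a rough back-of-the-envelope calculation. It assumes the whole measurement is attributable to the buffer, that the buffer holds the 24 visible rows plus 5000 scrollback rows, and that the figure means decimal megabytes. None of that is confirmed by the measurement, so treat the result as an upper bound.

```typescript
// Rough per-cell cost implied by the demo measurement (all inputs are assumptions).
const cols = 160;
const visibleRows = 24;
const scrollbackRows = 5000;
const measuredBytes = 34 * 1000 * 1000; // "34 MB" read as decimal megabytes

const cellCount = cols * (visibleRows + scrollbackRows);
const bytesPerCell = measuredBytes / cellCount;

console.log(cellCount);               // 803840
console.log(bytesPerCell.toFixed(1)); // "42.3"
```

Roughly 40 bytes per cell is a lot for a character plus a width and an attribute number, which is why the layouts below try to move data out of the per-cell arrays.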

Slow fetching of a row's text

The other problem is needing to fetch the actual text of a line quickly. This is slow because of the way the data is laid out: a line contains an array of characters, each holding a single-character string. So we construct the string, and it is up for garbage collection immediately afterwards. Previously we didn't need to do this at all because the text was pulled from the line buffer (in order) and rendered to the DOM. However, this is becoming increasingly useful as we improve xterm.js further; features like selection and links both pull this data. Again using the 160x24/5000 scrollback example, it takes 30-60ms to copy the entire buffer on a Mid-2014 MacBook Pro.
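A minimal sketch of the work described above, to make the cost concrete. The `CharData` layout and the `lineToString` helper are assumptions for illustration (the layout follows solution 1 below without the truecolor fields) and are not the actual xterm.js code.

```typescript
// Hypothetical per-cell layout: [character, width, attributes].
type CharData = [string, number, number];
type LineData = CharData[];

// Builds a throwaway string by visiting every cell in the line.
function lineToString(line: LineData): string {
  let text = '';
  for (let i = 0; i < line.length; i++) {
    text += line[i][0];
  }
  return text;
}

const exampleLine: LineData = [['l', 1, 0], ['s', 1, 0], [' ', 1, 0]];
console.log(JSON.stringify(lineToString(exampleLine))); // "ls "
```

Every caller that wants the text (selection, link detection) repeats this walk and leaves the resulting string for the garbage collector.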

Supporting the future

Another potential problem is that we may introduce a view model which needs to duplicate some or all of the data in the buffer. This sort of thing will be needed to implement reflow (#622) properly (#644 (comment)) and may also be needed to properly support screen readers (#731). It would certainly be good to have some wiggle room when it comes to memory.

This discussion started in #484; this issue goes into more detail and proposes some additional solutions.

I'm leaning towards solution 3 and moving towards solution 5 if there is time and it shows a marked improvement. Would love any feedback! /cc @jerch, @mofux, @rauchg, @parisk

1. Simple solution

This is basically what we're doing now, just with truecolor fg and bg added.

// [0]: character
// [1]: width
// [2]: attributes
// [3]: truecolor bg
// [4]: truecolor fg
type CharData = [string, number, number, number, number];

type LineData = CharData[];

Pros

  • Very simple

Cons

  • Too much memory consumed; this would nearly double our current memory usage, which is already too high.

2. Pull text out of CharData

This would store the string against the line rather than against each character. This would probably see very large gains in selection and linkifying, and quick access to a line's entire string would become more useful as time goes on.

interface ILineData {
  // This would provide fast access to the entire line which is becoming more
  // and more important as time goes on (selection and links need to construct
  // this currently). This would need to reconstruct text whenever charData
  // changes though. We cannot lazily evaluate text due to the chars not being
  // stored in CharData
  text: string;
  charData: CharData[];
}

// [0]: charIndex
// [1]: attributes
// [2]: truecolor bg
// [3]: truecolor fg
type CharData = Int32Array;

Pros

  • No need to reconstruct the line whenever we need it.
  • Lower memory than today due to the use of an Int32Array

Cons

  • Slow to update individual characters: the entire string would need to be regenerated for single character changes.
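The update cost named in the con above can be sketched as follows. `replaceCharAt` is a hypothetical helper, and it assumes each cell maps to exactly one UTF-16 code unit, which would not hold for wide or combining characters.

```typescript
// Strings are immutable, so changing one cell means building a new string
// by copying everything before and after the changed position.
function replaceCharAt(text: string, index: number, char: string): string {
  return text.slice(0, index) + char + text.slice(index + 1);
}

console.log(replaceCharAt('hello', 1, 'a')); // "hallo"
```

The work is proportional to the line length for every single-cell write, which matters because writes are far more frequent than text reads in a terminal.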

3. Store attributes in ranges

Pulling the attributes out and associating them with a range. Since there can never be overlapping attributes, this can be laid out sequentially.

type LineData = CharData[];

// [0]: The character
// [1]: The width
type CharData = [string, number];

class CharAttributes {
  public readonly _start: [number, number];
  public readonly _end: [number, number];
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

class Buffer extends CircularList<LineData> {
  // Sorted list since items are almost always pushed to end
  private _attributes: CharAttributes[];

  public getAttributesForRows(start: number, end: number): CharAttributes[] {
    // Binary search _attributes and return all visible CharAttributes to be
    // applied by the renderer
  }
}

Pros

  • Lower memory than today even though we're also storing truecolor data
  • Can optimize application of attributes, rather than checking every single character's attribute and diffing it to the one before
  • Encapsulates the complexity of storing the data inside an array (.flags instead of [0])

Cons

  • Changing attributes of a range of characters inside another range is more complex
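The lookup that `getAttributesForRows` hints at could look roughly like this. It is a sketch under assumptions: `AttributeRange` is a simplified stand-in for `CharAttributes`, positions are `[row, column]` pairs, both ends are inclusive, and the list is sorted and non-overlapping as described above.

```typescript
type BufferPosition = [number, number]; // [row, column] (assumed layout)

interface AttributeRange {
  start: BufferPosition; // inclusive
  end: BufferPosition;   // inclusive
  flags: number;
}

// `ranges` must be sorted by start and non-overlapping, so end rows are
// non-decreasing as well and can be binary searched.
function getAttributesForRows(ranges: AttributeRange[], startRow: number, endRow: number): AttributeRange[] {
  // Find the first range that ends on or after startRow.
  let lo = 0;
  let hi = ranges.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (ranges[mid].end[0] < startRow) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  // Collect ranges until one starts after endRow.
  const result: AttributeRange[] = [];
  for (let i = lo; i < ranges.length && ranges[i].start[0] <= endRow; i++) {
    result.push(ranges[i]);
  }
  return result;
}

const sampleRanges: AttributeRange[] = [
  { start: [0, 0], end: [0, 5], flags: 1 },
  { start: [2, 0], end: [4, 3], flags: 2 },
  { start: [10, 0], end: [12, 0], flags: 3 }
];
console.log(getAttributesForRows(sampleRanges, 3, 9).map(r => r.flags)); // [ 2 ]
```

The search costs O(log n) plus the number of ranges returned, so the renderer only touches ranges that intersect the viewport.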

4. Put attributes in a cache

The idea here is to leverage the fact that there generally aren't that many styles in any one terminal session, so we should create as few as necessary and reuse them.

// [0]: character
// [1]: width
// [2]: attributes (shared instance from the cache)
type CharData = [string, number, CharAttributes];

type LineData = CharData[];

class CharAttributes {
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

interface ICharAttributeCache {
  // Never construct duplicate CharAttributes. Figuring out the best way to
  // access them in both the best and worst case is the tricky part here
  getAttributes(flags: number, fg: number, bg: number): CharAttributes;
}

Pros

  • Similar memory usage to today even though we're also storing truecolor data
  • Encapsulates the complexity of storing the data inside an array (.flags instead of [0])

Cons

  • Less memory savings than the ranges approach
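One possible shape for the cache is sketched below. The string-keyed `Map` is an assumption chosen for simplicity, not a recommendation for the access strategy, and the `CharAttributes` constructor and `size` getter are hypothetical additions to the interface above.

```typescript
class CharAttributes {
  private _data: Int32Array;

  constructor(flags: number, fg: number, bg: number) {
    this._data = new Int32Array([flags, fg, bg]);
  }

  public get flags(): number { return this._data[0]; }
  public get truecolorFg(): number { return this._data[1]; }
  public get truecolorBg(): number { return this._data[2]; }
}

class CharAttributeCache {
  // Keyed by "flags,fg,bg"; entries are never evicted in this sketch.
  private _entries = new Map<string, CharAttributes>();

  public getAttributes(flags: number, fg: number, bg: number): CharAttributes {
    const key = `${flags},${fg},${bg}`;
    let attributes = this._entries.get(key);
    if (attributes === undefined) {
      attributes = new CharAttributes(flags, fg, bg);
      this._entries.set(key, attributes);
    }
    return attributes;
  }

  public get size(): number { return this._entries.size; }
}

const cache = new CharAttributeCache();
const plain = cache.getAttributes(0, 0xffffff, 0x000000);
const plainAgain = cache.getAttributes(0, 0xffffff, 0x000000);
console.log(plain === plainAgain); // true
```

Building a key string on every lookup has its own allocation cost, so a real implementation would likely want a cheaper key, and some way to drop styles that are no longer referenced.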

5. Hybrid of 3 & 4

type LineData = CharData[];

// [0]: The character
// [1]: The width
type CharData = [string, number];

class CharAttributes {
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

interface CharAttributeEntry {
  attributes: CharAttributes,
  start: [number, number],
  end: [number, number]
}

class Buffer extends CircularList<LineData> {
  // Sorted list since items are almost always pushed to end
  private _attributes: CharAttributeEntry[];
  private _attributeCache: ICharAttributeCache;

  public getAttributesForRows(start: number, end: number): CharAttributeEntry[] {
    // Binary search _attributes and return all visible CharAttributeEntry
    // objects to be applied by the renderer
  }
}

interface ICharAttributeCache {
  // Never construct duplicate CharAttributes. Figuring out the best way to
  // access them in both the best and worst case is the tricky part here
  getAttributes(flags: number, fg: number, bg: number): CharAttributes;
}

Pros

  • Potentially the fastest and most memory efficient
  • Very memory efficient when the buffer contains many styled blocks that use only a few distinct styles (the common case)
  • Encapsulates the complexity of storing the data inside an array (.flags instead of [0])

Cons

  • More complex than the other solutions; it may not be worth including the cache if we already keep a single CharAttributes per block.
  • Extra overhead in CharAttributeEntry object
  • Changing attributes of a range of characters inside another range is more complex

6. Hybrid of 2 & 3

This takes solution 3 but also adds a lazily evaluated text string for fast access to the line text. Since we're also storing the characters in CharData, we can lazily evaluate it.

type LineData = {
  text: string,
  charData: CharData[]
}

// [0]: The character
// [1]: The width
type CharData = [string, number];

class CharAttributes {
  public readonly _start: [number, number];
  public readonly _end: [number, number];
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

class Buffer extends CircularList<LineData> {
  // Sorted list since items are almost always pushed to end
  private _attributes: CharAttributes[];

  public getAttributesForRows(start: number, end: number): CharAttributes[] {
    // Binary search _attributes and return all visible CharAttributes to be
    // applied by the renderer
  }

  // If we construct the line, hang onto it
  public getLineText(line: number): string;
}

Pros

  • Lower memory than today even though we're also storing truecolor data
  • Can optimize application of attributes, rather than checking every single character's attribute and diffing it to the one before
  • Encapsulates the complexity of storing the data inside an array (.flags instead of [0])
  • Faster access to the actual line string

Cons

  • Extra memory due to hanging onto line strings
  • Changing attributes of a range of characters inside another range is more complex
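The lazy evaluation described for this solution could be sketched like this. `LazyLine`, `setChar`, and `getText` are hypothetical names, and the per-line cache here stands in for whatever `getLineText` would do on the buffer.

```typescript
type CharData = [string, number]; // [character, width]

class LazyLine {
  // undefined means "not built yet" or "invalidated by a write".
  private _text: string | undefined;

  constructor(private _chars: CharData[]) {}

  public setChar(index: number, char: string, width: number): void {
    this._chars[index] = [char, width];
    this._text = undefined; // writes stay cheap: just drop the cached string
  }

  public getText(): string {
    if (this._text === undefined) {
      let text = '';
      for (let i = 0; i < this._chars.length; i++) {
        text += this._chars[i][0];
      }
      this._text = text; // if we construct the line, hang onto it
    }
    return this._text;
  }
}

const lazyLine = new LazyLine([['a', 1], ['b', 1]]);
console.log(lazyLine.getText()); // "ab"
lazyLine.setChar(1, 'c', 1);
console.log(lazyLine.getText()); // "ac"
```

Unlike solution 2, a write never rebuilds the string; the cost is paid only on the next read, and only lines that are actually read keep a string alive.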

Solutions that won't work

  • Storing the string as an int inside an Int32Array will not work as it takes far too long to convert the int back to a character.

Labels

area/performance, type/plan (a meta issue that consists of several sub-issues), type/proposal (a proposal that needs some discussion before proceeding)
