Piece Table with JS

Iskenderov Viacheslav - Sep 9 - - Dev Community

Hello there!

I believe many of you know how it feels when you want to write a pet-project of some kind, but don’t know what exactly is it you want to do. Once I got this feeling and started surfing through the internet to find some interesting ideas. Eventually, I got to this article and realised that the text editor idea sounds amazing.

After some time thinking about how I can implement it, it came to me that using a string array for storing text seemed a little boring, so I decided to use Piece Table. However, it wasn’t hard to notice that there are too few information about this data structure, even less in JS. So here I am to share my implementation of Piece Table that I used for my text editor.

Text Editor

For better understanding of the responsibilities my Piece Table was supposed to have, let’s talk a little about the structure of the text editor in general.

I am a web developer with JS/TS as a main language, so I decided not to experiment with other language and focus specifically on the text editor implementation. The idea was to take HTML canvas element and create a simplified version of the textarea element.

Entities my editor has
Entities my editor has

Following the single responsibility principle I decided to separate the logic by different classes (it’s not all of them, but main ones). Thanks to this structure I managed to separate the text storage logic as much as possible. (Piece Table here is Storage)

So I needed a pretty simple interface from the Storage:

  • Listing all the text by lines
  • Text insertion
  • Text deletion (one character as well as multiple characters)
  • Getting amount of lines the text has and each line length

Piece Table

Piece Table is basically a Linked List with a few nuances.

Unlike a string array, the text here is stored as 2 strings variables called buffers. The first buffer is called original and stands for the initial content of a document the editor is supposed to show. The second is called an additional buffer and it keeps track of all characters added during the document editing.

Another entity of the Piece Table is a Piece. Basically, it’s just a node that contains information about a chunk of edited text. To better understand it, let’s look at the example below.

Let’s say we have a file that contains a text “Hello World”. At the moment of reading the file content, we have one Piece containing the following information: buffer source, text length, offset (the letter we are starting reading from in a specific buffer).

Hello World

Then we decide to an exclamation point so the text would look like “Hello World!”.

Hello World!

We created a new Piece and since the content was added during the text editing, its buffer is not original now. Every time we add a new text, we create a new Piece and link it to the previous ones.

When we need to editing the existing text. Let’s say we want add a coma after the “Hello”, we split the existing Piece and insert a new Piece between. That’s where we need the offset property.

Hello, World!

We split the original Piece into 2 pieces and inserted a new piece between. At this point, the original buffer is the same as it was in the beginning, but the additional one is equal to ***“!,”. Since the coma was added later in time, it is in the end of the additional buffer, therefore the offset is 1 (index of the coma in the string is 1). Same thing with the “ **World*” offset, it is still in the original buffer, but this chunk of text starts from the 5th element in the original string.

Here is the code in TypeScript:

type Position = {
  line: number;
  character: number;
}

enum Source {
  ORIGINAL = "original",
  ADD = "add",
}

class Piece {
  offset: number;
  length: number;
  source: Source;
  next: Piece | null;

  constructor(
    offset: number,
    length: number,
    source: Source,
    next: Piece | null
  ) {
    this.length = length;
    this.offset = offset;
    this.source = source;
    this.next = next;
  }
} 

export class Storage {
  original: string = "";
  add: string = "";

  pieceHead: Piece | null = new Piece(0, 0, Source.ORIGINAL, null);

  constructor(content: string = "") {
    this.original = content;
    this.pieceHead = new Piece(0, content.length, Source.ORIGINAL, null);
  }

  read() {}
  insert(content: string, at: Position) {}
  delete(at: Position) {}
  deleteRange(start: Position, end: Position) {}
}
Enter fullscreen mode Exit fullscreen mode

That’s the initialisation of the Piece Table structure. So what we need to do is to implement the methods: read, insert, delete and deleteRange.

Implementation

Let’s start with implementing the read method.

read() {
  let head: Piece | null = this.pieceHead;
  let line = "";
  const lines = [];

  while (head !== null) {
      const source = head.source === Source.ORIGINAL ? this.original : this.add;

    const content = source.slice(head.offset, head.offset + head.length);

    for (const letter of content) {
        if (letter === "\n") {
          lines.push(line);

          line = "";
        continue;
      }

      line += letter;
    }

    head = head.next;
  }

  if (line !== "") {
      lines.push(line);
  }

  return lines;
}
Enter fullscreen mode Exit fullscreen mode

What happens here is basically concatenation of the text and splitting it by lines. It iterates through the whole Piece chain extracting the chunks of text from the corresponding buffers.

insert(content: string, at: Position) {
    const { piece, offset, previous } = this.findPieceByLine(at);

    if (!piece) {
        if (this.pieceHead === null) {
            this.add += content;
            this.pieceHead = new Piece(0, content.length, Source.ADD, null);
            return;
        }

        return;
    }

    const nextPiece = new Piece(
        piece.offset + offset,
        piece.length - offset,
        piece.source,
        piece.next
    );
    const currentPiece = new Piece(
        this.add.length,
        content.length,
        Source.ADD,
        nextPiece
    );
    this.add += content;
    piece.length = offset;

    if (previous && piece.length === 0) {
        previous.next = currentPiece;
    } else if (piece.length === 0) {
        this.pieceHead = currentPiece;
    } else {
        piece.next = currentPiece;
    }
}

private findPieceByLine(position: Position): {
    offset: number;
    piece: Piece | null;
    previous: Piece | null;
} {
    const { line, character } = position;
    let head: Piece | null = this.pieceHead;
    let previous: Piece | null = null;
    let currentLine = 0;
    let currentCharacter = 0;

    // searching for the piece that starts the line
    while (head !== null) {
        const source = head.source === Source.ORIGINAL ? this.original : this.add;
      const content = source.slice(head.offset, head.offset + head.length);

      for (let i = 0; i < content.length; i++) {
        const letter = content[i];

        if (currentLine === line && currentCharacter === character) {
          return { piece: head, offset: i, previous };
        }

        if (letter === "\n") {
          currentLine++;
          currentCharacter = 0;
        } else {
          currentCharacter++;
        }
      }

      // if nothing found, let the user type in the end
      if (head.next === null) {
        return { piece: head, offset: head.length, previous };
      }

      previous = head;
      head = head.next;
    }

    return { piece: null, offset: 0, previous: null };
}
Enter fullscreen mode Exit fullscreen mode

Method findPieceByLine is searching for the piece by its line and character. Basically, it’s the same logic as in the read method, but targeted on searching the piece instead of concatenating the text.

Once the piece is found, it’s pretty easy. It’s either create the first Piece or split already existing ones.

delete(at: Position) {
  const { piece, offset, previous } = this.findPieceByLine(at);

  if (!piece) {
    return;
  }

  if (offset === 0 && previous) {
    this.removeLastCharacterOfPiece(previous);
    return;
  } else if (offset === 0) {
    // cursor is behind the first character in the text
    return;
  }

    // if the offset and length are equal to 1, the piece needs to be deleted
  if (piece.length === 1 && offset === 1) {
    this.removePiece(piece);
    return;
  }

  if (offset === 1 && piece.length > 0) {
    piece.length--;
    piece.offset++;
    return;
  }

  if (offset === piece.length - 1) {
    piece.length--;
    return;
  }

    // if the removed character is in the middle of a piece,
    // we split it in 2 (reducing the length of the first one and creating a new one)
  const newPiece = new Piece(
    piece.offset + offset,
    piece.length - offset,
    piece.source,
    piece.next
  );

  piece.next = newPiece;
  piece.length = offset - 1;
}

private removeLastCharacterOfPiece(piece: Piece) {
  piece.length--;

  if (piece.length === 0) {
    this.removePiece(piece);
  }
}

private removePiece(piece: Piece) {
  let head = this.pieceHead;
  let previous = null;

  while (head !== null) {
    if (head === piece) {
      if (previous) {
        previous.next = head.next;
      } else {
        this.pieceHead = head.next;
      }
      return;
    }

    previous = head;
    head = head.next;
  }
}
Enter fullscreen mode Exit fullscreen mode

And the final method is deleteRange.

deleteRange(start: Position, end: Position) {
  // should be moved to a single function if performance is needed
  const { piece: startPiece, offset: startOffset } =
    this.findPieceByLine(start);
  const { piece: endPiece, offset: endOffset } = this.findPieceByLine(end);

  if (!startPiece || !endPiece) {
    return;
  }

    // usually happens to the original piece or after copy/paste
  if (startPiece === endPiece) {
      // if the range starts at the piece beginning, we move offset
      // to the right and reduce the length
    if (startOffset === 0) {
      endPiece.offset += endOffset;
      endPiece.length -= endOffset;
      return;
    }

        // and if the range is meeting the end of the piece, it we can
        // safely reduce length only, because the offset stays the same
    if (endOffset === startPiece.length) {
      startPiece.length = startOffset;
      return;
    }

        // otherwise, we split the piece in 2 (similary to the **delete** method)
    startPiece.next = new Piece(
      startPiece.offset + endOffset,
      startPiece.length - endOffset,
      startPiece.source,
      startPiece.next
    );
    startPiece.length = startOffset;

    return;
  }

    // if that's 2 different pieces, we update the sizes of the pieces
  startPiece.length = startOffset;

  endPiece.offset += endOffset;
  endPiece.length -= endOffset;

  let piece = startPiece.next;
  // and remove all the pieces betweens the start one and the end one
  while (piece !== endPiece && piece !== null) {
    startPiece.next = piece.next;
    piece = piece.next;
  }

    // if any of the targeted pieces have length 0, we have to remove them
  if (startPiece.length === 0) {
    this.removePiece(startPiece);
  }

  if (endPiece.length === 0) {
    this.removePiece(endPiece);
  }
}
Enter fullscreen mode Exit fullscreen mode

Conclusion

That’s pretty much it. With these operations you can easily write an editor like Vim for instance. Obviously, there are many optimisations as well as refactoring that can be made here, but this article was written merely to give an idea on how to implement the Piece Table.

If you have any comments, feel free to leave them here. I’ll be happy to read it!

Github: https://github.com/werfit/text-editor-piece-table

Editor: https://werfit-editor.vercel.app/

.
Terabox Video Player