Tracking issue: refactor string packages handling grapheme clusters in terms of "base" packages #1062

This issue tracks the effort to refactor string packages which handle grapheme clusters so that they rely on "base" packages handling more specialized use cases.

## Overview

String packages, such as `@stdlib/string/first`, have several possible "modes" of operation. When getting the first character, a straightforward approach would use indexing. E.g.,

```javascript
var str = 'Hello, World!';

var ch = str[ 0 ];
// returns 'H'
```

This matches user expectations so long as the first character is stored in a single UTF-16 code unit, as is the case for most commonly used characters. However, indexing does not match user intuition when the first **visual** character is composed of multiple code units, in which case indexing returns only a fragment of that character.
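
As an illustrative sketch (the emoji is an arbitrary choice and is not taken from any stdlib package), indexing a string which begins with a character outside the Basic Multilingual Plane returns only one half of a surrogate pair:

```javascript
// Grinning face (U+1F600, 😀) is stored as two UTF-16 code units:
var str = '\u{1F600} Hello';

var ch = str[ 0 ];
// returns '\ud83d' (a lone high surrogate, not the emoji)

var len = '\u{1F600}'.length;
// returns 2
```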

As such, one has three options for resolving the first character:

- code units
- code points (one or two UTF-16 code units)
- grapheme clusters (one or more code points)
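
To make the distinction concrete, the following sketch counts the same string at each of the three levels. It uses only built-in JavaScript features rather than stdlib packages, and it assumes a runtime which provides `Intl.Segmenter`; the choice of emoji is arbitrary.

```javascript
// Waving hand (U+1F44B) followed by a medium skin tone modifier (U+1F3FD):
var str = '\u{1F44B}\u{1F3FD}';

// Code units: `length` counts UTF-16 code units...
var nCodeUnits = str.length;
// returns 4

// Code points: the built-in string iterator yields one code point at a time...
var nCodePoints = Array.from( str ).length;
// returns 2

// Grapheme clusters (assumption: `Intl.Segmenter` is available)...
var segmenter = new Intl.Segmenter( 'en', {
	'granularity': 'grapheme'
});
var nClusters = Array.from( segmenter.segment( str ) ).length;
// returns 1
```

A user sees a single character, yet the string holds two code points and four code units, which is why the choice of processing level matters.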

The most robust approach for matching user intuition is to resolve grapheme clusters (i.e., user-perceived visual characters), especially for text which may include emojis with skin tones and modified characteristics. However, resolving grapheme clusters is comparatively slow and may lead to unacceptable performance issues, especially when working with simple text.

## Solution

Rather than provide a single API which only processes text as a sequence of grapheme clusters, the proposed solution is to refactor top-level `@stdlib/string/*` packages which handle grapheme clusters to support different "modes" of operation, whereby a user can choose which type of processing is most appropriate for given input strings.

Internally, packages supporting different modes should rely on separate, specialized "base" packages (`@stdlib/string/base/*`) which implement appropriate algorithms for resolving code units, code points, and grapheme clusters, respectively.

## Prior Art

For examples of completed refactorings, see:

- `@stdlib/string/first`
    - `@stdlib/string/base/first`
    - `@stdlib/string/base/first-code-point`
    - `@stdlib/string/base/first-grapheme-cluster`
- `@stdlib/string/for-each`
    - `@stdlib/string/base/for-each`
    - `@stdlib/string/base/for-each-code-point`
    - `@stdlib/string/base/for-each-grapheme-cluster`

## Tasks

The following packages should be refactored to use the proposed solution:

- [x] `@stdlib/string/first`
- [x] `@stdlib/string/for-each`
- [ ] `@stdlib/string/left-trim-n`
- [x] `@stdlib/string/remove-first` - https://github.com/stdlib-js/stdlib/pull/1073
- [x] `@stdlib/string/remove-last` - https://github.com/stdlib-js/stdlib/pull/1079
- [x] `@stdlib/string/reverse` - https://github.com/stdlib-js/stdlib/pull/1082
- [ ] `@stdlib/string/right-trim-n`
- [ ] `@stdlib/string/truncate` - https://github.com/stdlib-js/stdlib/pull/1097
- [ ] `@stdlib/string/truncate-middle`
- [ ] `@stdlib/string/base/distances/levenshtein`

The following package implementation needs to be rewritten:

- [ ] `@stdlib/string/base/prev-grapheme-cluster`

## Notes

In general, refactoring should happen in the following order:

1. Create the base package processing grapheme clusters (package name should have a `-grapheme-cluster` or `-grapheme-clusters` suffix). This is often similar to the top-level package, but stripped of input argument validation and optional arguments.
2. Create the base package for processing Unicode code points (package name should have a `-code-point` or `-code-points` suffix).
3. Create the base package for processing UTF-16 code units (if necessary, package name should have a `-code-unit` or `-code-units` suffix).
4. Refactor the top-level package to depend on the base packages and add support for specifying a `mode` option.
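
The end result of step 4 might resemble the following minimal sketch. Everything here is hypothetical and for illustration only: the three local functions stand in for the corresponding base packages, the grapheme cluster stand-in assumes that `Intl.Segmenter` is available, and the mode names (`'grapheme'`, `'code_point'`, `'code_unit'`) and the default mode are assumptions rather than a description of any existing implementation.

```javascript
// Hypothetical stand-ins for base packages (no input argument validation):
function firstCodeUnit( str ) {
	return str.substring( 0, 1 );
}

function firstCodePoint( str ) {
	var cp = str.codePointAt( 0 );
	return ( cp === void 0 ) ? '' : String.fromCodePoint( cp );
}

function firstGraphemeCluster( str ) {
	// Assumption: `Intl.Segmenter` is available in the runtime...
	var seg = new Intl.Segmenter( 'en', {
		'granularity': 'grapheme'
	});
	var v = seg.segment( str )[ Symbol.iterator ]().next();
	return ( v.done ) ? '' : v.value.segment;
}

// Map each (assumed) mode name to its base implementation:
var MODES = {
	'grapheme': firstGraphemeCluster,
	'code_point': firstCodePoint,
	'code_unit': firstCodeUnit
};

// Top-level function: validates input arguments and dispatches based on `mode`...
function first( str, options ) {
	var mode = 'grapheme'; // assumed default
	if ( typeof str !== 'string' ) {
		throw new TypeError( 'invalid argument. First argument must be a string.' );
	}
	if ( arguments.length > 1 ) {
		if ( options === null || typeof options !== 'object' ) {
			throw new TypeError( 'invalid argument. Options argument must be an object.' );
		}
		if ( Object.prototype.hasOwnProperty.call( options, 'mode' ) ) {
			mode = options.mode;
		}
	}
	if ( !Object.prototype.hasOwnProperty.call( MODES, mode ) ) {
		throw new TypeError( 'invalid option. `mode` option must be a supported mode.' );
	}
	return MODES[ mode ]( str );
}

// Waving hand (U+1F44B) with a medium skin tone modifier (U+1F3FD), followed by text:
var str = '\u{1F44B}\u{1F3FD} hi';

var g = first( str );
// returns '\u{1F44B}\u{1F3FD}'

var cp = first( str, { 'mode': 'code_point' } );
// returns '\u{1F44B}'

var cu = first( str, { 'mode': 'code_unit' } );
// returns '\ud83d'
```

Keeping validation in the top-level function and leaving the stand-ins free of it mirrors the split described in step 1, so callers who already know their input is valid could call a base implementation directly.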
