added function bytes to Utf8 #4

Merged
merged 2 commits into from
Apr 11, 2024
8 changes: 6 additions & 2 deletions src/cjs/browser.cjs
@@ -1,6 +1,6 @@
 "use strict";
 Object.defineProperty(exports, "__esModule", { value: true });
-exports.compare = exports.fromHex = exports.toHex = void 0;
+exports.compare = exports.fromHex = exports.toHex = exports.toUtf8 = void 0;
 const HEX_STRINGS = "0123456789abcdefABCDEF";
 const HEX_CODES = HEX_STRINGS.split("").map((c) => c.codePointAt(0));
 const HEX_CODEPOINTS = Array(256)
@@ -12,7 +12,11 @@ const HEX_CODEPOINTS = Array(256)
     return index < 0 ? undefined : index < 16 ? index : index - 6;
 });
 const ENCODER = new TextEncoder();
-const DECODER = new TextDecoder("ascii");
+const DECODER = new TextDecoder();
+function toUtf8(bytes) {
+    return DECODER.decode(bytes);
+}
+exports.toUtf8 = toUtf8;
 // There are two implementations.
 // One optimizes for length of the bytes, and uses TextDecoder.
 // One optimizes for iteration count, and appends strings.
6 changes: 5 additions & 1 deletion src/cjs/index.cjs
@@ -1,6 +1,10 @@
 "use strict";
 Object.defineProperty(exports, "__esModule", { value: true });
-exports.compare = exports.fromHex = exports.toHex = void 0;
+exports.compare = exports.fromHex = exports.toHex = exports.toUtf8 = void 0;
+function toUtf8(bytes) {
+    return Buffer.from(bytes || []).toString();
+}
+exports.toUtf8 = toUtf8;
 function toHex(bytes) {
     return Buffer.from(bytes || []).toString("hex");
 }
1 change: 1 addition & 0 deletions src/cjs/index.d.ts
@@ -1,3 +1,4 @@
+export declare function toUtf8(bytes: Uint8Array): string;
 export declare function toHex(bytes: Uint8Array): string;
 export declare function fromHex(hexString: string): Uint8Array;
 export declare type CompareResult = -1 | 0 | 1;
5 changes: 4 additions & 1 deletion src/mjs/browser.js
@@ -9,7 +9,10 @@ const HEX_CODEPOINTS = Array(256)
     return index < 0 ? undefined : index < 16 ? index : index - 6;
 });
 const ENCODER = new TextEncoder();
-const DECODER = new TextDecoder("ascii");
+const DECODER = new TextDecoder();
+export function toUtf8(bytes) {
+    return DECODER.decode(bytes);
+}
 // There are two implementations.
 // One optimizes for length of the bytes, and uses TextDecoder.
 // One optimizes for iteration count, and appends strings.
3 changes: 3 additions & 0 deletions src/mjs/index.js
@@ -1,3 +1,6 @@
+export function toUtf8(bytes) {
+    return Buffer.from(bytes || []).toString();
+}
 export function toHex(bytes) {
     return Buffer.from(bytes || []).toString("hex");
 }
6 changes: 5 additions & 1 deletion ts_src/browser.ts
@@ -9,7 +9,11 @@ const HEX_CODEPOINTS: (number | undefined)[] = Array(256)
     return index < 0 ? undefined : index < 16 ? index : index - 6;
   });
 const ENCODER = new TextEncoder();
-const DECODER = new TextDecoder("ascii");
+const DECODER = new TextDecoder();
+
+export function toUtf8(bytes: Uint8Array): string {
+  return DECODER.decode(bytes);

Member

The TextDecoder is not returning UTF-8 strings.

Uint8Array.from([ 227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175 ])

This should equal "こんにちは" but it doesn't. It returns "ã\x81“ã‚“ã\x81«ã\x81¡ã\x81¯"

Can you figure out why?

Contributor Author

Okay, I'll look into it.
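
A likely explanation, judging from the diff in this PR: the decoder was originally constructed as `new TextDecoder("ascii")`. In the WHATWG Encoding Standard, `ascii` is a label for windows-1252, so every byte is decoded as its own character, which gives the mojibake quoted in the comment. Constructing the decoder with no label selects UTF-8. The sketch below only illustrates that difference; the names `helloBytes`, `legacyDecoder`, and `utf8Decoder` are hypothetical and not part of the PR.

```typescript
// The bytes from the review comment: the UTF-8 encoding of "こんにちは".
const helloBytes = Uint8Array.from([
  227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175,
]);

// Assumption: this mirrors the decoder before the fix. Per the WHATWG
// Encoding Standard, the label "ascii" resolves to windows-1252.
const legacyDecoder = new TextDecoder("ascii");

// No label: the constructor defaults to UTF-8.
const utf8Decoder = new TextDecoder();

// windows-1252 is a single-byte encoding, so 15 bytes become 15 characters.
const legacyText: string = legacyDecoder.decode(helloBytes);

// UTF-8 combines each group of three bytes into one kana character.
const utf8Text: string = utf8Decoder.decode(helloBytes);

console.log(utf8Decoder.encoding); // "utf-8"
console.log(legacyText.length, utf8Text.length);
console.log(utf8Text === "こんにちは");
```

If this reading is right, the change from `new TextDecoder("ascii")` to `new TextDecoder()` shown in the diff is the fix for the behaviour described in the comment.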

+}
 
 // There are two implementations.
 // One optimizes for length of the bytes, and uses TextDecoder.
4 changes: 4 additions & 0 deletions ts_src/index.ts
@@ -1,3 +1,7 @@
+export function toUtf8(bytes: Uint8Array): string {
+  return Buffer.from(bytes || []).toString();
+}
+
 export function toHex(bytes: Uint8Array): string {
   return Buffer.from(bytes || []).toString("hex");
 }
14 changes: 14 additions & 0 deletions ts_src/tests.spec.ts
@@ -14,6 +14,14 @@ const bytes2Larger = f([0xff, 0x01, 0x00]);
 const bytes2LargerLeft = f([0x00, 0xff, 0x01]);
 const longBytes = new Uint8Array(513).fill(0xfa);
 const longHex = "fa".repeat(513);
+const bytes3 = f([0x21, 0x7e]);
+const utf8 = "!~";
+const longBytes2 = new Uint8Array(513).fill(0x61);
+const longUtf8 = "a".repeat(513);
+const testBytes = f([
+  227, 129, 147, 227, 130, 147, 227, 129, 171, 227, 129, 161, 227, 129, 175,
+]);
+const str = "こんにちは";
 
 const brokenHexes = [
   [" ff00", f([]), "leading space"],
@@ -39,6 +47,12 @@ describe(`Uint8Array tools`, () => {
     expect(tools.toHex(longBytes)).toEqual(longHex);
     expect((tools.toHex as any)()).toEqual("");
   });
+  it(`should output utf8 with toUtf8`, () => {
+    expect(tools.toUtf8(bytes3)).toEqual(utf8);
+    expect(tools.toUtf8(testBytes)).toEqual(str);
+    expect(tools.toUtf8(longBytes2)).toEqual(longUtf8);
+    expect((tools.toUtf8 as any)()).toEqual("");
+  });
   it(`should compare Uint8Arrays`, () => {
     expect(tools.compare(bytes, bytes2)).toBe(-1);
     expect(tools.compare(bytes, bytes)).toBe(0);