make acceptedLanguages order the header by q values descending #262

Open
wants to merge 9 commits into
base: main
25 changes: 21 additions & 4 deletions fluent-langneg/src/accepted_languages.js
@@ -1,7 +1,24 @@
-export default function acceptedLanguages(string = "") {
-  if (typeof string !== "string") {
+export default function acceptedLanguages(acceptLanguageHeader = "") {
+  if (typeof acceptLanguageHeader !== "string") {
     throw new TypeError("Argument must be a string");
   }
-  const tokens = string.split(",").map(t => t.trim());
-  return tokens.filter(t => t !== "").map(t => t.split(";")[0]);
+  const tokens = acceptLanguageHeader.split(",").map(t => t.trim());
+  const langsWithQ = [];
+  tokens.filter(t => t !== "").forEach((t, index) => {
Collaborator

Can you please move the filter to line 5 together with trimming?

Contributor Author

done
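
For reference, a minimal sketch of what trimming and filtering in one chain could look like. The helper name `tokenizeHeader` is hypothetical and only used for illustration; in the PR this logic stays inline in `acceptedLanguages`.

```javascript
// Hypothetical helper (not part of the PR): split the header on commas,
// trim each entry, and drop empty entries in a single chain.
function tokenizeHeader(acceptLanguageHeader) {
  return acceptLanguageHeader
    .split(",")
    .map(t => t.trim())
    .filter(t => t !== "");
}

console.log(tokenizeHeader("en;q=0.8,,, fr;q=0.9,, de"));
// → [ 'en;q=0.8', 'fr;q=0.9', 'de' ]
```

With the filter applied up front, the later loop only ever sees non-empty entries.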

+    const langWithQ = t.split(";").map(u => u.trim());
+    if (langWithQ[0].length > 0) {
Collaborator

String.prototype.split always returns at least one element, so no need for this check I think?

Contributor Author

yes, that's true -- removed
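
A quick illustration of the `split` behavior discussed above, as a sketch rather than code from the PR. With a non-empty separator such as `";"`, the result always has at least one element. That first element can still be an empty string, though, for an entry such as `;q=0.5`.

```javascript
// Splitting an empty string on a non-empty separator yields one empty string.
const emptyEntry = "".split(";");
console.log(emptyEntry.length); // → 1
console.log(emptyEntry[0] === ""); // → true

// An entry with no language tag still produces a first element, but it is "".
const noTagEntry = ";q=0.5".split(";").map(u => u.trim());
console.log(noTagEntry); // → [ '', 'q=0.5' ]
```

The exception is an empty separator: `"".split("")` returns an empty array, which does not apply here because the code splits on `";"`.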

+      let q = 1.0;
+      if (langWithQ.length > 1) {
+        const qVal = langWithQ[1].split("=").map(u => u.trim());
+        if (qVal.length === 2 && qVal[0].toLowerCase() === "q") {
+          const qn = Number(qVal[1]);
+          q = !isNaN(qn) ? qn : q;
Collaborator

That means any non-numeric q value gets replaced with 1.0. Is that what we should do? It seems like bogus input should result in low priority, not high.

Contributor Author

Seems fair -- I've replaced it with 0.0. (I was using q=1 here because it's the default value in the specification)
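
A minimal sketch of the weight parsing with the agreed change, assuming a non-numeric `q` maps to 0.0 while a missing or unrecognized parameter keeps the default of 1.0. The helper name `parseQ` is hypothetical; the PR keeps this logic inline.

```javascript
// Hypothetical helper (not part of the PR): parse one "q=<value>" parameter.
function parseQ(param) {
  const qVal = param.split("=").map(u => u.trim());
  if (qVal.length === 2 && qVal[0].toLowerCase() === "q") {
    const qn = Number(qVal[1]);
    // A non-numeric q is treated as lowest priority instead of the default.
    return isNaN(qn) ? 0.0 : qn;
  }
  // Not a well-formed q parameter: keep the default weight.
  return 1.0;
}

console.log(parseQ("q=0.7")); // → 0.7
console.log(parseQ("q=no")); // → 0
console.log(parseQ("z=0.9")); // → 1
console.log(parseQ("q=a=0.1")); // → 1
```

Under this sketch an entry such as `en;q=no` gets weight 0 and sorts after every entry with a positive weight, whereas parameters that are not recognized as `q` at all still fall back to 1.0.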

+        }
+      }
+      langsWithQ.push({ index: index, lang: langWithQ[0], q });
+    }
+  });
+  // order by q descending, keeping the header order for equal weights
+  langsWithQ.sort((a, b) => a.q === b.q ? a.index - b.index : b.q - a.q);
Collaborator

Non-blocking cleanup suggestion:

function parseAcceptLanguageEntry(entry) {
  const langWithQ = entry.split(";").map(u => u.trim());
  let q = 1.0;
  if (langWithQ.length > 1) {
    const qVal = langWithQ[1].split("=").map(u => u.trim());
    if (qVal.length === 2 && qVal[0].toLowerCase() === "q") {
      const qn = Number(qVal[1]);
      q = isNaN(qn) ? 0.0 : qn;
    }
  }
  return { lang: langWithQ[0], q };
}

const langsWithQ = Array.from(tokens.map(parseAcceptLanguageEntry).entries());
// order by q descending, keeping the header order for equal weights
langsWithQ.sort(([aidx, aval], [bidx, bval]) => aval.q === bval.q ? aidx - bidx : bval.q - aval.q);
return langsWithQ.map(([idx, val]) => val.lang);

I wrote this by hand, so it may not run as-is, but I hope the snippet conveys the intention. I'm trying to save us from reallocating the array in the loop. Alternatively, you could likely preallocate with const langsWithQ = Array(tokens.length); which I guess is simpler.
The other thing I wanted to improve is to not carry the index around: langsWithQ preserves the order of tokens, so there's no value in storing the index on the langWithQ struct.

@blushingpenguin - do you like the proposed changes or would you prefer to land as-is?

Contributor Author

Nicer your way I think -- I wasn't aware of Array.entries. I have updated the code and retested it.
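
For readers following the thread, here is a self-contained sketch of how the suggestions could fit together. It assumes the filter is moved next to the trim and that a non-numeric `q` becomes 0.0. The name `acceptedLanguagesSketch` is hypothetical, and this is not necessarily the code that landed.

```javascript
// Parse one header entry such as "fr;q=0.9" into { lang, q }.
function parseAcceptLanguageEntry(entry) {
  const langWithQ = entry.split(";").map(u => u.trim());
  let q = 1.0;
  if (langWithQ.length > 1) {
    const qVal = langWithQ[1].split("=").map(u => u.trim());
    if (qVal.length === 2 && qVal[0].toLowerCase() === "q") {
      const qn = Number(qVal[1]);
      q = isNaN(qn) ? 0.0 : qn;
    }
  }
  return { lang: langWithQ[0], q };
}

// Hypothetical name; sketches the function under review.
function acceptedLanguagesSketch(acceptLanguageHeader = "") {
  if (typeof acceptLanguageHeader !== "string") {
    throw new TypeError("Argument must be a string");
  }
  const tokens = acceptLanguageHeader
    .split(",")
    .map(t => t.trim())
    .filter(t => t !== "");
  // entries() pairs each parsed entry with its position in the header.
  const langsWithQ = Array.from(tokens.map(parseAcceptLanguageEntry).entries());
  // Order by q descending, keeping the header order for equal weights.
  langsWithQ.sort(([aidx, aval], [bidx, bval]) =>
    aval.q === bval.q ? aidx - bidx : bval.q - aval.q
  );
  return langsWithQ.map(([, val]) => val.lang);
}

console.log(acceptedLanguagesSketch("en;q=0.8, fr;q=0.9, de;q=0.7, *;q=0.5, fr-CH"));
// → [ 'fr-CH', 'fr', 'en', 'de', '*' ]
```

The index comes from `entries()` rather than from a field on each parsed object, which is the point of the suggestion above.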

+  return langsWithQ.map(t => t.lang);
 }
47 changes: 47 additions & 0 deletions fluent-langneg/test/headers_test.js
@@ -29,6 +29,53 @@ suite('parse headers', () => {
     );
   });
 
+  test('with out of order quality values', () => {
+    assert.deepStrictEqual(
+      acceptedLanguages('en;q=0.8, fr;q=0.9, de;q=0.7, *;q=0.5, fr-CH'), [
+        'fr-CH',
+        'fr',
+        'en',
+        'de',
+        '*'
+      ]
+    );
+  });
+
+  test('with equal q values', () => {
+    assert.deepStrictEqual(
+      acceptedLanguages('en;q=0.1, fr;q=0.1, de;q=0.1, *;q=0.1'), [
+        'en',
+        'fr',
+        'de',
+        '*'
+      ]
+    );
+  });
+
+  test('with duff q values', () => {
+    assert.deepStrictEqual(
+      acceptedLanguages('en;q=no, fr;z=0.9, de;q=0.7;q=9, *;q=0.5, fr-CH;q=a=0.1'), [
+        'en',
+        'fr',
+        'fr-CH',
+        'de',
+        '*'
+      ]
+    );
+  });
+
+  test('with empty entries', () => {
+    assert.deepStrictEqual(
+      acceptedLanguages('en;q=0.8,,, fr;q=0.9,, de;q=0.7, *;q=0.5, fr-CH'), [
+        'fr-CH',
+        'fr',
+        'en',
+        'de',
+        '*'
+      ]
+    );
+  });
+
   test('edge cases', () => {
     const args = [
       null,