Support a type representing any literal string, a la Python's LiteralString type #51513

Open
@ethanresnick

🔍 Search Terms

literal string, xss, sql injection, security, user input handling

✅ Viability Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
    • Mostly satisfied: some tiny number of existing programs might break depending on the name chosen for the type.
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
  • This feature would agree with the rest of TypeScript's Design Goals.

⭐ Suggestion + Motivating Example

The idea is to add a built-in type called LiteralString, which would be the supertype of all literal string types. I.e., LiteralString is inhabited by all the subtypes of string, excluding string itself and template literal types that contain string. In addition to introducing this type, TS would be more careful about tracking whether a string has a literal type (e.g., when two strings with literal types are concatenated with +, the result would remain a literal type, rather than becoming string).
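To make the tracking point concrete, here is a small sketch of how TypeScript treats these expressions today, as far as I understand it. The names (selectPrefix, tableName, IsWideString) are made up for illustration and are not part of the proposal.

```typescript
// Both constants get literal types because they are `const` bindings.
const selectPrefix = "SELECT * FROM "; // type: "SELECT * FROM "
const tableName = "users";             // type: "users"

// Current behavior: `+` widens the result to `string`,
// even though both operands have literal types.
const viaPlus = selectPrefix + tableName;

// A template expression with `as const` keeps a literal type today.
const viaTemplate = `${selectPrefix}${tableName}` as const;

// Hypothetical helper type: `true` only when T is exactly `string`.
type IsWideString<T extends string> = string extends T ? true : false;

// These assignments compile only if the inferred types are as described above.
const plusIsWide: IsWideString<typeof viaPlus> = true;
const templateIsWide: IsWideString<typeof viaTemplate> = false;
```

Under the proposal, viaPlus would presumably stay a literal type (or at least be assignable to LiteralString) instead of widening to string.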

The motivation here is to allow the type system to check that certain security-sensitive strings haven't been unsafely manipulated by user-controlled input. For example, one could write a function like queryDb(query: LiteralString, params?: unknown[]): Promise<Results> to enforce that the query string does not have any values interpolated into it that could've been user-controlled and created SQL injection vulnerabilities. The idea is that the value from user input would’ve had to be typed as string, which can’t be mixed into a LiteralString without producing a string, which would then not be an acceptable input to queryDb:

// `id` is type `string`, so the type of 
// this argument is `string`, so the call is not allowed 
queryDb(`SELECT * from a where id = ${id}`)

// however, this type checks, as the first argument is
// inferred as either a literal type (matching its value) or, 
// through contextual typing, as LiteralString
queryDb('SELECT * from a where id = ?', [id])
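For comparison, the closest thing I know of in today's TypeScript is a generic parameter with a conditional type that rejects arguments typed as plain string. This is only a rough approximation under stated assumptions, not the proposed feature: NotWideString, PreparedQuery, and prepareQuery are hypothetical names, and the function just packages its arguments instead of talking to a database.

```typescript
// Hypothetical stand-in for the proposed type:
// resolves to `never` when T is exactly `string`, otherwise to T.
type NotWideString<T extends string> = string extends T ? never : T;

// Hypothetical result shape, used only so the sketch returns something.
interface PreparedQuery {
  text: string;
  params: unknown[];
}

// Accepts the query only if its inferred type is narrower than `string`.
function prepareQuery<T extends string>(
  query: NotWideString<T>,
  params: unknown[] = []
): PreparedQuery {
  // Copy into a plain string; no real database call is made here.
  return { text: String(query), params };
}

// Stands in for user-controlled input; the annotation keeps it typed as `string`.
const userId: string = "42";

// Type checks: the first argument is inferred as a literal type.
const prepared = prepareQuery("SELECT * from a where id = ?", [userId]);

// Does not type check, because `string` is not assignable to `never`:
// prepareQuery(userId);
```

The important limitation: this only rejects arguments whose type is exactly string. A template with an interpolated string, like the first queryDb call above, may be inferred as a template literal type (something like `... ${string}`) and slip through, and + concatenation of two literals is rejected even though it is safe. Both gaps are part of why a built-in type with proper tracking seems necessary.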

There is a bunch of prior art for such a type, with identical motivation, including the LiteralString type in Python. There was also a proposal to have JS engines track whether a string was created entirely from literals, which would've been used to allow DOM APIs like innerHTML to treat literal strings as safe, as part of a broader strategy to protect against XSS. (Of course, this TS proposal is compile-time only, but the motivation is the same.) Additionally, there was/is an analogous type in Google's Closure Compiler, with the same motivation. Finally, Scala has an analogous type, Singleton, which is inhabited by all literal types.

Potentially, the built-in type could be called Literal, rather than LiteralString, and could also include other kinds of literals (numbers, bigints, etc.); APIs which need a string would then do Literal & string, or TS could provide LiteralString as a built-in alias.

I guess there's an argument that tracking all literal values in the same way, and having a unified Literal type, is more elegant, and perhaps there are some use cases outside of security for which such a type would be valuable. For the security use case, though, if an API takes a non-string, and you pass user input to that API (or some value derived from user input), it seems almost certain that you intended to let the user control the API with their input. In these non-string cases, there's nothing analogous to the "you intended to allow the user to provide some data, but they tricked the system into interpreting that data as code" problem that's at the heart of SQL injection, XSS, and related vulnerabilities.

Given all that, I guess I'd propose starting with only LiteralString, as that's presumably less effort to implement and adds less overhead to compile times. If legitimate use cases for a more general Literal type arise, then it's easy to implement that later and redefine LiteralString as Literal & string.

Labels: Awaiting More Feedback (this means we'd like to hear from more people who would be helped by this feature), Suggestion (an idea for TypeScript)