Description
JavaScript module systems in TypeScript
This brief write-up attempts to capture how RequireJS and CommonJS module loading in JavaScript files may be modeled when powered by the TypeScript language service. It outlines some common usages as reference points, then a high-level algorithm to provide typing for the examples given. A description of how this algorithm may be implemented within the TypeScript codebase then follows.
(See parent issue #4789 for an overview)
Example RequireJS usage
The examples below outline common RequireJS usage patterns, with inline comments explaining the resulting types.
Canonical RequireJS style with a dependency array
define(['./foo', 'bar'],
    /**
     * @param {module: ./foo} foo
     * @param {module: bar} bar
     * @return {Object}
     */
    function(foo, bar){
        return {
            // Shape of this object defines the module
        };
    }
);
Using the CommonJS style
define(
    // Note: No dependency array provided
    /**
     * The order and name of the parameters is important
     * @param {RequireJS.require} require
     * @param {Object} [exports]
     * @param {RequireJS.module} [module]
     * @return {typeof exports}
     */
    function(require, exports, module){
        // Note that inside this function, the code is the same as for the CommonJS module system
        // fs is the imported module as per: import * as fs from 'fs'
        var fs = require('fs');
        // The "exports" object is the resulting module
        exports.prop = "test";
    }
);
Combined dependency array and CommonJS style usage
define(
    // The CommonJS wrapper names may also be provided as dependencies
    ['vs/editor', 'exports'],
    /**
     * @param {module: vs/editor} ed
     * @param {Object} exports
     * @return {typeof exports}
     */
    function(ed, exports){
        // Shape of the resulting "exports" object is the shape of the module
        exports.prop = "test";
    }
);
RequireJS plugins
RequireJS allows plugins to be provided as modules. A plugin dependency is written as the plugin module name, followed by a bang (!), followed by a 'resource name', e.g.
define(["i18n!my/nls/colors"],
    /**
     * @param {module: i18n} colors
     */
    function(colors) {
        return {
            testMessage: "The name for red in this locale is: " + colors.red
        };
    }
);
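As a small illustration of the plugin syntax above, the sketch below splits a dependency string at the first bang into a plugin name and a resource name. The function name parsePluginDependency and the shape of the returned object are assumptions made for illustration; they are not part of RequireJS or TypeScript.

```javascript
// Hypothetical helper: splits "plugin!resource" at the first "!".
// Dependencies without a bang are treated as ordinary module names.
function parsePluginDependency(dep) {
    const bang = dep.indexOf('!');
    if (bang === -1) {
        return { plugin: null, resource: dep };
    }
    return { plugin: dep.slice(0, bang), resource: dep.slice(bang + 1) };
}

console.log(parsePluginDependency('i18n!my/nls/colors'));
console.log(parsePluginDependency('vs/editor'));
```

A type resolver could then resolve the plugin module separately from the resource, although how the resource should be typed is not settled by this write-up.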
Modules defined as simple object literals
At its simplest, a call to "define" can just contain an object literal for the resulting module.
define(
    // If the final param to "define" is just an object rather than a function, that is the module
    {
        "root": {
            "red": "red",
            "blue": "blue",
            "green": "green"
        }
    }
);
Type inference for AMD modules
Note: The following is an abstract description of the algorithm. An implementation-specific description, based on the current TypeScript codebase, follows.
To process AMD modules, the file must be a JavaScript file (i.e. have an extension of either 'js' or 'jsx') and the module type must be set to 'amd'. If these conditions are met, 'define' calls are processed in the following manner:
1. Let 'args' be the array of the arguments passed to the 'define' function, and let 'M' represent the external module type to be constructed.
2. If the 1st element of 'args' is a string, register the module 'M' with this name (as for a `declare module "name" {...}` TypeScript statement) and perform a 'shift' operation on 'args' (i.e. remove the 1st element, make the 2nd the 1st, etc.). Otherwise, register the module 'M' based on the current file path (as for a TypeScript script with top-level import/export statements).
3. If the 1st element of 'args' is an expression resulting in an object type (e.g. an object literal), then assign the type of the expression to the module 'M' and skip the remaining steps.
4. If the 1st element of 'args' is an array of strings, process as follows and then perform a 'shift' on 'args':
    i. Place the string names in an ordered list of `{'name': string, 'type': any}` tuples. Call this list 'imports'.
    ii. For each tuple in 'imports':
        a. If 'name' is 'require', assign to 'type' the type 'RequireJS.require' (this "known" type will be used below).
        b. If 'name' is 'exports', assign to 'type' the type 'RequireJS.exports' (this "known" type will be used below).
        c. If 'name' is 'module', assign to 'type' the type 'RequireJS.module' (this "known" type will be used below).
        d. For any other name, resolve the module as for an ES6 external module import (e.g. `import * as x from 'name';`), and assign to 'type' the resulting type of `x`.
5. If the 1st element of 'args' is a function, process as follows:
    i. If any of the parameters to the function have a JSDoc annotation which specifies a type, then assign the parameter this type and exclude that parameter from the remaining steps.
    ii. If the 'imports' list is undefined, check the parameter list to see if it matches the list `['require', 'exports', 'module']`, where the 2nd and 3rd are optional. If so, assign to the 'imports' list the corresponding 3 types as per steps 4.ii.a to 4.ii.c.
    iii. Assign the function parameters the types from the 'imports' list in order. Error if there are more parameters than entries in 'imports'. (To be error tolerant, just assign the 'any' type rather than error).
    iv. If the function body contains an expression of type 'RequireJS.require' which is called as a function with a string argument, then:
        a. Let 'name' be the string value.
        b. Let 'module' be the value `x` as evaluated in the ES6 import `import * as x from 'name'`.
        c. Assign the type of 'module' as the type of the function call expression (i.e. the "required" module type).
    v. If the function body contains `return` statements that return a value, then the type of the module 'M' is the best common type of the return statements; skip the remaining steps.
    vi. If the function body contains an expression of type 'RequireJS.module', and the property `exports` is assigned to, then the type of the module 'M' is the best common type of any assignments to the `exports` property; skip the remaining steps.
    vii. If the function body contains expressions of type 'RequireJS.exports', then:
        a. For each property assignment onto the type with either a constant string indexer (e.g. `exports["foo-bar"] = 42;`) or a valid identifier (e.g. `exports.foo = new Date();`), create a property on the module type 'M' of that name as would be done for an ES6 export (e.g. `export var foo = new Date();`).
        b. For other assignments (e.g. computed expressions such as `exports[myVar] = true;`), ignore.
        c. Skip the remaining steps.
6. If this step is reached, then error on the arguments provided to 'define'.
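The top-level argument handling of the algorithm above (optional name, object literal, dependency array, factory function) can be sketched on plain runtime values, which stand in for the syntax nodes a real implementation would inspect. The names processDefineArgs and KNOWN_TYPES, the result shape, and the 'module:' placeholder strings are all hypothetical; real module resolution is deliberately left out.

```javascript
// Hypothetical sketch: classifies the arguments of a 'define' call.
// Plain runtime values stand in for syntax nodes.
const KNOWN_TYPES = new Map([
    ['require', 'RequireJS.require'],
    ['exports', 'RequireJS.exports'],
    ['module', 'RequireJS.module']
]);

function processDefineArgs(defineArgs, filePath) {
    const args = defineArgs.slice(); // work on a copy so 'shift' is safe
    const result = { name: filePath, imports: undefined, literal: undefined, factory: undefined };

    // An explicit module name wins; otherwise the file path names the module.
    if (typeof args[0] === 'string') {
        result.name = args.shift();
    }
    // A non-array object is itself the module.
    if (args[0] !== null && typeof args[0] === 'object' && !Array.isArray(args[0])) {
        result.literal = args[0];
        return result;
    }
    // Dependency names become an ordered list of { name, type } entries.
    if (Array.isArray(args[0]) && args[0].every(d => typeof d === 'string')) {
        result.imports = args.shift().map(name => ({
            name,
            // Placeholder string in place of real module resolution.
            type: KNOWN_TYPES.get(name) ?? ('module:' + name)
        }));
    }
    // The factory function is only recorded here, not analysed.
    if (typeof args[0] === 'function') {
        result.factory = args[0];
        return result;
    }
    // Nothing matched: report an error on the arguments.
    throw new TypeError("Unsupported arguments to 'define'");
}

console.log(processDefineArgs([['require', './foo'], function (require, foo) {}], './src/a.js'));
```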
Type inference for CommonJS modules
Note that this is a subset of the AMD scenarios, and behaves identically to code in an AMD `define` call when using the CommonJS pattern.
Other RequireJS API calls
The other RequireJS calls that need special handling are invocations of the identifiers 'require' or 'requirejs' as functions. These point to the same function, so treatment is identical; only 'require' is discussed below for clarity.
Calls to 'require' are typically used in the data-main entry point to load the first app module and run some code on load. The difference between 'require' and 'define' is therefore minimal, and the usage pattern is very similar to the first 'canonical' example of RequireJS usage at the start of this Gist. The processing of 'require' calls differs from that of 'define' calls only in that step #2 of the algorithm above is skipped. That is, you cannot provide a string as the first argument to define a module name, and no external module is registered.
Notes
- If multiple calls to 'define' without a string as the first argument are present, the last call will define the module. (Could error here, but designed to be tolerant).
- To determine the module type, the algorithm prioritizes return statement values over 'module.exports' assignments, and those over setting properties on the 'exports' object. This may or may not be accurate depending on code flow, but is a simplification for the common case (where mixing of these methods in a module definition is unusual, and behavior in the presence of multiple methods is not documented by RequireJS).
Implementation in TypeScript
Scripts compiled with the module type 'amd' are handled differently than other types of module systems, in that they contain code in global scope, yet function expressions within a 'define' call declare an external module.
The process goes through the usual pipeline:
- Parsing
- Binding
- Type checking
Parsing
In the parsing phase, declarations are marked. For scripts compiled with module type 'amd', there is special handling whenever a call to `define` is encountered. If the call has a string as the first parameter, it is a declaration for the module named by the string. If not, it is a declaration for a module named after the script being parsed (e.g. `./src/foo.js`). The function expression (or object literal) which is the last parameter to `define` is marked as the module declaration.
Binding
In the binding phase, identifiers are bound to their declarations. This is handled specially for function expressions within `require` or `define` calls.
1. If the function parameters are named `require`, `exports`, `module`, in that order (with only the first required), and the prior argument to `define` or `require` was not an array, then the parameters are automatically bound to the `RequireJS.require`, `RequireJS.exports`, and `RequireJS.module` declarations.
2. Otherwise, if there are function parameters, then the prior argument must be a string array. Each parameter in the function expression is bound to the module name given by the corresponding element in the array (the module names `require`, `exports`, and `module` are already declared internally and map to types as outlined above).
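The two binding rules above can be sketched as a function over parameter names and an optional dependency array. This is a minimal illustration only: bindFactoryParams, COMMONJS_NAMES and the { param, boundTo } result shape are hypothetical, and strings stand in for real declarations. The fallback to 'any' for surplus parameters follows the error-tolerance remark in the abstract algorithm.

```javascript
// Hypothetical sketch of the two binding rules for a factory's parameters.
const COMMONJS_NAMES = ['require', 'exports', 'module'];

function bindFactoryParams(paramNames, deps) {
    if (!Array.isArray(deps)) {
        if (paramNames.length === 0) {
            return []; // nothing to bind
        }
        // No dependency array: the parameters must be a prefix of
        // require, exports, module (only the first is required).
        const isCommonJs = paramNames.length <= COMMONJS_NAMES.length &&
            paramNames.every((p, i) => p === COMMONJS_NAMES[i]);
        if (!isCommonJs) {
            throw new TypeError('Parameters without a dependency array must be require, exports, module');
        }
        return paramNames.map(p => ({ param: p, boundTo: 'RequireJS.' + p }));
    }
    // Dependency array: bind each parameter to the module at the same index.
    return paramNames.map((p, i) => {
        const dep = deps[i];
        if (dep === undefined) {
            return { param: p, boundTo: 'any' }; // tolerant of surplus parameters
        }
        const known = COMMONJS_NAMES.includes(dep);
        return { param: p, boundTo: known ? 'RequireJS.' + dep : dep };
    });
}

console.log(bindFactoryParams(['require', 'exports'], undefined));
console.log(bindFactoryParams(['ed', 'exports'], ['vs/editor', 'exports']));
```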
Handling calls to 'require'
Calls to `require` are commonly used in two places: inside module definitions using the CommonJS pattern, and outside module definitions to initiate module loading. These usages have distinct signatures.
In order for type checking to handle the CommonJS usage pattern (where the `var x = require("modulename")` expression is used), binding of this expression must be handled specially. Specifically, when an expression of type `RequireJS.require` is invoked as a function with a single string literal argument, the call expression should be bound as for an imported module of that name (as per step #2 above). (OPEN: This might be better handled at the type checking phase when the type of `x` is pulled).
In order for type checking to handle the module loading usage pattern, the binding is as per the `define` binding when the first argument is either an array or a function expression. A major difference is that calls to `require` do not define a module. (Note also that calls to the `requirejs` function are equivalent to calls to `require`).
TODO: Calls to `require` can contain a `config` object as the first param. Per implementation, valid signature usage appears to be:
require(moduleName: string); // CommonJS style
require(deps: string[], callback?: Function, errback?: Function); // Loader call
// Note: Usage of config object as first param seems like a deprecated pattern. May not be needed.
require(config: Object, deps?: string[], callback?: Function, errback?: Function); // Config object
require(config: Object, callback?: Function, errback?: Function); // Dependencies in config object
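To show how the four signatures listed above might be told apart, here is a minimal sketch that inspects runtime argument values in place of syntax nodes. The function name classifyRequireCall and the returned labels are hypothetical, and the sketch covers only the listed signatures; anything else is reported as 'unknown'.

```javascript
// Hypothetical sketch: decides which of the listed signatures a call matches.
function classifyRequireCall(args) {
    const [first, second] = args;
    if (typeof first === 'string') {
        return 'commonjs'; // require(moduleName)
    }
    if (Array.isArray(first)) {
        return 'loader'; // require(deps, callback?, errback?)
    }
    if (first !== null && typeof first === 'object') {
        // A config object, optionally followed by a dependency array.
        return Array.isArray(second) ? 'config-with-deps' : 'config-only';
    }
    return 'unknown';
}

console.log(classifyRequireCall(['fs']));
console.log(classifyRequireCall([['app/main'], function (main) {}]));
console.log(classifyRequireCall([{ baseUrl: 'js' }, ['app/main']]));
```

Only the first form produces a value with a module type; the other forms would be bound like a `define` call that registers no module.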
Type checking
When a module declaration is pulled for type checking, the following process occurs:
- If the declaration is an object literal, then the type of the module declaration is evaluated as for any other object literal expression. Else the declaration must be a function expression.
- If the function expression contains `return <expression>;` statements, then the type of the module is the best common type of the return expressions.
- Else if the function expression contains a parameter of type `RequireJS.module`, then the function body is searched for any assignments to the `exports` property of the parameter. If there are 1 or more, then the type of the module is the best common type of the types of the expressions assigned to the `exports` property.
- Else if the function expression contains a parameter of type `RequireJS.exports`, then the function body is searched for any assignments to properties on this parameter (where property names must be assigned with string constants or IdentifierName tokens). A type is constructed, starting with an empty object, by adding each property assigned to `exports` as a member (of the type of the expression assigned to the property). The type of the module is the final type of the constructed object.
- Else the module type is `Object`.
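The priority order above (return values, then module.exports assignments, then properties set on exports, then Object) can be sketched over a pre-computed summary of a factory body. Everything here is an assumption for illustration: the summary shape, the names inferModuleType and bestCommonType, and the use of strings for types. In particular, bestCommonType is a crude stand-in that joins distinct candidates into a union string, which is not how the TypeScript checker computes a best common type.

```javascript
// Hypothetical stand-in for "best common type": the sole type if all
// candidates agree, otherwise the distinct candidates joined as a union.
function bestCommonType(types) {
    const unique = [...new Set(types)];
    return unique.length === 1 ? unique[0] : unique.join(' | ');
}

// 'summary' is a hypothetical description of a factory function body:
// { returnTypes, moduleExportsTypes, hasExportsParam, exportsAssignments }
function inferModuleType(summary) {
    if (summary.returnTypes.length > 0) {
        return bestCommonType(summary.returnTypes);
    }
    if (summary.moduleExportsTypes.length > 0) {
        return bestCommonType(summary.moduleExportsTypes);
    }
    if (summary.hasExportsParam) {
        const members = new Map(); // later assignments to a name overwrite earlier ones
        for (const a of summary.exportsAssignments) {
            if (a.computed) continue; // e.g. exports[myVar] = true is ignored
            members.set(a.name, a.type);
        }
        const parts = [...members].map(([name, type]) => `${name}: ${type}`);
        return parts.length > 0 ? `{ ${parts.join('; ')} }` : '{}';
    }
    return 'Object';
}

console.log(inferModuleType({
    returnTypes: [],
    moduleExportsTypes: [],
    hasExportsParam: true,
    exportsAssignments: [
        { name: 'prop', computed: false, type: 'string' },
        { name: 'myVar', computed: true, type: 'boolean' }
    ]
}));
```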
OPEN
- Currently TypeScript maps an external module implementation to a script file. This will need to change so that one or more external modules can be declared within a script (which is treated as global outside the 'define' calls).
- Check if the implementation as it stands allows for the parameters in the function expressions passed to `define` and `require` to be bound to the special 'RequireJS.*' declarations.
TODO
- What to do about important settings for processing, such as `baseUrl`? (Current thoughts: add a tsconfig.json setting for it rather than try to infer it from code, which is difficult as you may not come across the file setting the baseUrl until you have already processed some module files).
- What to do about other config settings, such as `shim`, `paths`, `map`, etc.? (Current thoughts: nothing unless there is strong demand).