Background
We are trying out a new form of our weekly sharing, which is interest group-based.
I am hosting the "Building a simplified webpack clone" interest group, which runs for 8 weeks. Every week, we cover 1 concept of webpack, followed by an assignment to implement that concept ourselves.
Prior Art
- Tobias Koppers - bundling live by hand - https://youtube.com/watch?v=UNMkLHzofQI
- Ronen Amiel - build your own webpack - https://youtube.com/watch?v=Gc9-7PBqOC8
- Adam Kelly - https://freecodecamp.org/news/lets-learn-how-module-bundlers-work-and-then-write-one-ourselves-b2e3fe6c88ae/
Week 1 - Resolving
Why a module bundler?
We love writing small modular JS files, but that shouldn't impact the users.
Traditionally, with limits on the number of concurrent request connections and slow internet speeds, we want to combine all the code into 1 file -> 1 network request.
The traditional approach is to concatenate the source files into 1 big output file.
But that raises some questions:
- what should be the order of concatenation (files may depend on each other)?
- what if there's a variable naming conflict across files?
- what if there's an unused file?
That's why we need a module system to define the relationships among the JS modules.
So now, let's take a look at how we can start building a module dependency graph.
1. We start from an entry file.
This is the starting point of the application
2. We read the file and determine what is being imported into this file
import calculate from './calculate';
import { measure, UNITS } from '../measurements';
import formula from 'formulas';
const oneCm = measure(1, UNITS.CM);
const result = calculate(formula, oneCm);
In the example above, the following is imported:
'./calculate'
'../measurements'
'formulas'
We can spot the imports with our human eyes, but how can a computer do that for us?
A computer can parse the code string into an Abstract Syntax Tree (AST), a representation of the code that it can understand.
In the AST, an import statement is represented by a node with:
- type = "ImportDeclaration"
- source.value = the filename it's trying to import
There are various JavaScript parsers out there. Here are some of them:
- babel
- acorn
- esprima
- es-module-lexer
// babel
const babel = require('@babel/core');
babel.parseSync(code);
// acorn
const acorn = require('acorn');
acorn.parse(code, { ecmaVersion: 2020, sourceType: 'module' });
// esprima (parseModule, because parseScript rejects import / export statements)
const esprima = require('esprima');
esprima.parseModule(code);
// if you just need the imports & exports,
// es-module-lexer is very fast: it is written in C and loaded through WebAssembly,
// and it is what powers Vite's dependency parsing
const { init, parse } = require('es-module-lexer');
await init;
const [imports, exports] = parse(code);
...and if you forgot your tree-traversal algorithms, here are some libraries that can help you out:
- babel-traverse
- acorn-walk
- estree-walker
// babel
const traverse = require('@babel/traverse').default;
traverse(ast, {
  // babel visitors receive a path, the node itself is path.node
  ImportDeclaration(path) {},
});
// acorn
const walk = require('acorn-walk');
walk.simple(ast, {
  ImportDeclaration(node) {},
});
// estree-walker
const { walk } = require('estree-walker');
walk(ast, {
  enter(node) {},
  leave(node) {},
});
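If you would rather see what such a traversal does under the hood, here is a minimal sketch. It assumes a hand-written, ESTree-shaped AST instead of real parser output, and the helper names `walk` and `collectImports` are made up for this illustration, not taken from any of the libraries above.

```javascript
// A hand-written ESTree-shaped AST (an assumption, not real parser output) for:
//   import calculate from './calculate';
//   import formula from 'formulas';
//   calculate(formula);
const ast = {
  type: 'Program',
  body: [
    {
      type: 'ImportDeclaration',
      specifiers: [{ type: 'ImportDefaultSpecifier', local: { type: 'Identifier', name: 'calculate' } }],
      source: { type: 'Literal', value: './calculate' },
    },
    {
      type: 'ImportDeclaration',
      specifiers: [{ type: 'ImportDefaultSpecifier', local: { type: 'Identifier', name: 'formula' } }],
      source: { type: 'Literal', value: 'formulas' },
    },
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'calculate' },
        arguments: [{ type: 'Identifier', name: 'formula' }],
      },
    },
  ],
};

// Depth-first walk: call visit(node) for every object that has a string `type`.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visit));
    return;
  }
  if (node === null || typeof node !== 'object') return;
  if (typeof node.type === 'string') visit(node);
  for (const key of Object.keys(node)) walk(node[key], visit);
}

// Collect the source of every import statement found in the tree.
function collectImports(tree) {
  const sources = [];
  walk(tree, (node) => {
    if (node.type === 'ImportDeclaration') sources.push(node.source.value);
  });
  return sources;
}

const imports = collectImports(ast);
console.log(imports); // ['./calculate', 'formulas']
```

The libraries listed above do more than this (scope tracking, node replacement, skipping subtrees), so treat this only as a picture of the basic idea.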
Some other useful links
- Inspect your AST
- The JS AST Specification
- Guide on parsing, traversing AST
3. Now that you know the names you are importing from, you need to figure out their actual file paths.
That depends on:
- the current file path
- the name you are importing from
resolve('a/b/app.js', './calculate.js');
// a/b/calculate.js
resolve('a/b/app.js', '../measurements.js');
// a/measurements.js
resolve('a/b/app.js', 'formulas');
// node_modules/formulas/src/index.js
That leads us to the Node.js Module Resolution Algorithm.
It describes the steps taken to resolve the file.
There are 3 scenarios in general:
- load as file
- load as directory
- load as node_modules
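To make the 3 scenarios concrete, here is a heavily simplified sketch. It is not the full Node.js algorithm (no "exports" field, no symlinks, only .js / .json extensions), and it assumes a hypothetical in-memory file system: `files` is a Set of existing file paths and `packageMains` stands in for reading the "main" field of a package.json. The `createResolver` name is made up for this example.

```javascript
const path = require('path').posix;

// `files` and `packageMains` are hypothetical stand-ins for the real file system.
function createResolver(files, packageMains = {}) {
  // 1. load as file: try the exact path, then a couple of extensions
  function loadAsFile(p) {
    return [p, p + '.js', p + '.json'].find((candidate) => files.has(candidate)) || null;
  }

  // 2. load as directory: try the package "main", then fall back to index
  function loadAsDirectory(dir) {
    const main = packageMains[dir];
    const fromMain = main ? loadAsFile(path.join(dir, main)) : null;
    return fromMain || loadAsFile(path.join(dir, 'index'));
  }

  // 3. load as node_modules: look in ./node_modules, then walk up the parent directories
  function loadAsNodeModules(name, startDir) {
    let dir = startDir;
    for (;;) {
      const candidate = path.join(dir, 'node_modules', name);
      const found = loadAsFile(candidate) || loadAsDirectory(candidate);
      if (found) return found;
      const parent = path.dirname(dir);
      if (parent === dir) return null; // reached the top without finding anything
      dir = parent;
    }
  }

  return function resolve(fromFile, request) {
    const fromDir = path.dirname(fromFile);
    if (request.startsWith('./') || request.startsWith('../')) {
      const target = path.join(fromDir, request);
      return loadAsFile(target) || loadAsDirectory(target);
    }
    return loadAsNodeModules(request, fromDir);
  };
}

const resolve = createResolver(
  new Set([
    'a/b/app.js',
    'a/b/calculate.js',
    'a/measurements.js',
    'node_modules/formulas/src/index.js',
  ]),
  { 'node_modules/formulas': 'src/index.js' }
);

console.log(resolve('a/b/app.js', './calculate.js')); // a/b/calculate.js
console.log(resolve('a/b/app.js', '../measurements.js')); // a/measurements.js
console.log(resolve('a/b/app.js', 'formulas')); // node_modules/formulas/src/index.js
```

Passing the file system in as data keeps the sketch easy to test. A real resolver would hit the disk and read package.json instead.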
Some other module resolvers:
- webpack uses enhanced-resolve, which is a highly configurable resolver
- TypeScript implements its own resolver, see how TS resolving works
4. After you have figured out the file paths you are importing from, repeat step 2 for each of those files until there are no more new files to be found.
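Steps 1 to 4 can be sketched as one recursive walk. This is only one possible shape, not the reference solution: `getImports` and `resolve` are hypothetical callbacks standing in for steps 2 and 3, and the module object layout is an assumption for this example. Keeping a map of visited files is what makes the loop stop when nothing new is found.

```javascript
// getImports(filepath) -> list of import names found in that file (step 2)
// resolve(fromFile, request) -> file path of the imported module (step 3)
// Both are hypothetical callbacks supplied by the caller.
function buildGraph(entryFile, getImports, resolve) {
  const modules = new Map(); // filepath -> module object, one instance per file

  function visit(filepath, isEntryFile) {
    if (modules.has(filepath)) return modules.get(filepath);
    const mod = { filepath, isEntryFile, dependencies: [] };
    // Register before recursing, so a circular import finds this instance
    // instead of recursing forever.
    modules.set(filepath, mod);
    for (const request of getImports(filepath)) {
      mod.dependencies.push({ module: visit(resolve(filepath, request), false) });
    }
    return mod;
  }

  return visit(entryFile, true);
}

// A tiny fake project: a.js and b.js import each other, and both import c.js.
const fakeImports = {
  'a.js': ['./b.js', './c.js'],
  'b.js': ['./a.js', './c.js'],
  'c.js': [],
};
const graph = buildGraph(
  'a.js',
  (filepath) => fakeImports[filepath],
  (fromFile, request) => request.slice(2) // all files sit in one folder here
);

const moduleB = graph.dependencies[0].module;
const moduleCFromA = graph.dependencies[1].module;
const moduleCFromB = moduleB.dependencies[1].module;
console.log(moduleCFromA === moduleCFromB); // true
```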
Assignment
For each test case, we provide the entry file, and we expect:
Module
- filepath
- dependencies -> list of Dependencies (see below)
- isEntryFile -> true if it is the entry file, false otherwise
Dependencies
- module (see above)
- exports -> list of variable names you are importing, e.g. "default", "measure", ...
If 2 modules are importing the same module, both should be referring to the same module instance:
moduleCFromModuleA === moduleCFromModuleB;
Be careful with circular dependencies.
Week 2 - Bundling
How do you bundle modules into 1 file?
After studying the 2 most popular bundlers, webpack and rollup, I found that the ways they bundle are very different.
Both of them have come a long way, and I believe each has its own pros and cons.
// circle.js
const PI = 3.141;
export default function area(radius) {
  return PI * radius * radius;
}
// square.js
export default function area(side) {
  return side * side;
}
// app.js
import squareArea from './square';
import circleArea from './circle';
console.log('Area of square: ', squareArea(5));
console.log('Area of circle', circleArea(5));
Observation: Bundle using webpack
- each module is wrapped in a function
- a module map, with the module identifier as key
- runtime glue code to piece the modules together
- the module function is called with 2 parameters: 1 to assign the exports of the module, 1 to "require" other modules
// webpack-bundle.js
const modules = {
  'circle.js': function(__exports, __getModule) {
    const PI = 3.141;
    __exports.default = function area(radius) {
      return PI * radius * radius;
    }
  },
  'square.js': function(__exports, __getModule) {
    __exports.default = function area(side) {
      return side * side;
    }
  },
  'app.js': function(__exports, __getModule) {
    const squareArea = __getModule('square.js').default;
    const circleArea = __getModule('circle.js').default;
    console.log('Area of square: ', squareArea(5))
    console.log('Area of circle', circleArea(5))
  }
}
webpackRuntime({
  modules,
  entry: 'app.js'
});
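The bundle above calls webpackRuntime without showing it. Here is a minimal sketch of what such glue code could look like. The body is an assumption written for this post, not webpack's actual runtime. The key idea is a cache, so that each module function runs at most once no matter how many times it is requested.

```javascript
// A minimal sketch of the runtime glue, not webpack's real implementation.
function webpackRuntime({ modules, entry }) {
  const cache = {}; // module id -> exports object

  function getModule(id) {
    if (cache[id]) return cache[id];
    const moduleExports = {};
    // Cache before running, so circular requests get the (partial) exports
    // instead of running the module again.
    cache[id] = moduleExports;
    modules[id](moduleExports, getModule);
    return moduleExports;
  }

  return getModule(entry);
}

// Usage: c.js is requested by a.js, b.js and main.js, but runs only once.
const evaluated = [];
const entryExports = webpackRuntime({
  modules: {
    'c.js': function (__exports, __getModule) {
      evaluated.push('c.js');
      __exports.default = 'c';
    },
    'a.js': function (__exports, __getModule) {
      __getModule('c.js');
      evaluated.push('a.js');
    },
    'b.js': function (__exports, __getModule) {
      __getModule('c.js');
      evaluated.push('b.js');
    },
    'main.js': function (__exports, __getModule) {
      __getModule('a.js');
      __getModule('b.js');
      evaluated.push('main.js');
      __exports.default = __getModule('c.js').default;
    },
  },
  entry: 'main.js',
});

console.log(evaluated); // ['c.js', 'a.js', 'b.js', 'main.js']
```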
Observation: Bundle using rollup
- much flatter bundle
- modules are concatenated in topological order
- exports and imports are removed by renaming them to the same variable name
- any variable in module scope that may have a naming conflict with other variables is renamed
// rollup-bundle.js
const PI = 3.141;
function circle$area(radius) {
  return PI * radius * radius;
}
function square$area(side) {
  return side * side;
}
console.log('Area of square: ', square$area(5));
console.log('Area of circle', circle$area(5));
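"Topological order" means every module appears after the modules it depends on. One common way to get such an order is a post-order depth-first walk. The sketch below assumes a plain map of module id to dependency ids, and `topologicalOrder` is a made-up helper name. It is not necessarily how rollup does it internally.

```javascript
// dependenciesOf: { moduleId: [ids of the modules it imports] } (assumed shape)
function topologicalOrder(entry, dependenciesOf) {
  const visited = new Set();
  const order = [];

  function visit(id) {
    if (visited.has(id)) return; // already placed, or part of a cycle in progress
    visited.add(id);
    for (const dep of dependenciesOf[id]) visit(dep);
    order.push(id); // pushed only after all of its dependencies
  }

  visit(entry);
  return order;
}

const order = topologicalOrder('app.js', {
  'app.js': ['square.js', 'circle.js'],
  'square.js': [],
  'circle.js': [],
});
console.log(order); // ['square.js', 'circle.js', 'app.js']
```

Any order that keeps dependencies ahead of their dependents is valid, so square.js and circle.js could swap places here without breaking the bundle.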
Output target of bundling
- IIFE (the most common target, we want to execute the script)
- CJS, ESM, UMD, AMD, ... (we want to bundle a library, the exports of the entry file are exported in the selected module format)
https://webpack.js.org/configuration/output/#outputlibrarytarget
Assignment
Here are some of the interesting test cases:
Able to handle re-exports nicely
// a.js
export * as b from './b';
export * from './c';
export { d } from './d';
// main.js
import * as a from './a';
console.log(a);
Importing the same file twice: are you able to make sure it is evaluated only once?
// a.js
import './c';
// b.js
import './c';
// c.js
console.log('c.js');
// main.js
import './a';
import './b';
The dreaded circular dependency: are you able to get the values of a, b, c in all the files?
// a.js
import { b } from './b';
import { c } from './c';
export const a = 'a';
setTimeout(() => {
  console.log(`a.js | b=${b} | c=${c}`);
});
// b.js
import { a } from './a';
import { c } from './c';
export const b = 'b';
setTimeout(() => {
  console.log(`b.js | a=${a} | c=${c}`);
});
// c.js
import { a } from './a';
import { b } from './b';
export const c = 'c';
setTimeout(() => {
  console.log(`c.js | a=${a} | b=${b}`);
});
// main.js
import { a } from './a';
import { b } from './b';
import { c } from './c';
setTimeout(() => {
  console.log(`main.js | a=${a} | b=${b} | c=${c}`);
});
Are you able to export a variable before it is declared? Does the order matter?
// a.js
let a = 'a';
export { a, b };
let b = 'b';
// main.js
import { a, b } from './a';
console.log('a = ' + a);
console.log('b = ' + b);
An imported variable is not a normal variable, it's a live binding of the exported variable. Are you able to make sure that the value of count is always up to date?
// data.js
export let count = 1;
export function increment() {
  count++;
}
// a.js
import { count, increment } from './data';
console.log('count = ' + count);
increment();
console.log('count = ' + count);
// b.js
import { count, increment } from './data';
console.log('count = ' + count);
increment();
console.log('count = ' + count);
// main.js
import './a';
import './b';
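One way to keep a binding live in a webpack-style bundle is to expose the export through a getter instead of copying its value. The sketch below is an illustration under that assumption. The `dataModule` function and the `countSnapshot` property are made up to contrast a live getter with a plain copy.

```javascript
// A hand-written stand-in for a bundled data.js (names are hypothetical).
function dataModule(__exports) {
  let count = 1;
  function increment() {
    count++;
  }
  // Live binding: every read goes through the getter and sees the current value.
  Object.defineProperty(__exports, 'count', { enumerable: true, get: () => count });
  __exports.increment = increment;
  // For contrast: a plain copy is frozen at the value it had when it was assigned.
  __exports.countSnapshot = count;
}

const data = {};
dataModule(data);

const before = data.count;
data.increment();
const after = data.count;
const snapshot = data.countSnapshot;

console.log(before, after, snapshot); // 1 2 1
```

A rollup-style bundle does not need this, because after renaming every module reads and writes the same variable directly.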
Be careful with circular dependencies.
Manipulating AST
- manipulating ast with javascript (generic)
- babel plugin handbook (babel)