Building a simplified webpack clone

October 2, 2020
JavaScriptwebpack

Background

We are trying out a new form of our weekly sharing, which is interest group-based.

I am hosting the "Building a simplified webpack clone" interest group, which lasted 8 weeks, and every week, we will cover 1 concept of webpack and an assignment to implement that concept ourselves.

Prior Art

Week 1 - Resolving

Why module bundler?

We love writing small modular JS files, but that shouldn't impact the users.

Traditionally with limit on number of request connection, ๐ŸŒ slow internet speed, we want to combine all the code into 1 file -> 1 network request

why bundling

๐Ÿ•ฐ Traditionally, we concatenate the source files into 1 big output file.

But that begs the question

  • โ“ what should be the order of concatenation (files may depend on each other) ?
  • โ“ what if there's var naming conflict across files?
  • โ“ what if there's unused file?

๐Ÿ’ก That's why we need a module system to define the relationship among the JS modules

relationship within a bundle

So now, let's take a look how we can start building a module dependency graph

1. We start from an entry file.

This is the starting point of the application

2. We read the file and determine what is being imported into this file

import calculate from './calculate';
import { measure, UNITS } from '../measurements';
import formula from 'formulas';

const oneCm = measure(1, UNITS.CM);
const result = calculate(formula, oneCm);

In the example above, the following is imported:

  • './calculate'
  • '../measurements'
  • 'formulas'

we can spot the import from our human eye ๐Ÿ‘€, but how can computer ๐Ÿค– do that for us?

๐Ÿค– can parse the code in string into Abstract Syntax Tree (AST), something representing the code that ๐Ÿค– can understand.

in AST, import statement is represented by a node with:

  • type = "ImportDeclaration"
  • source.value = the filename it's trying to import

ast explorer

There are various JavaScript parser out there, here are some of them

// babel
const babel = require('@babel/core');
babel.parseSync(code);

// acorn
const acorn = require('acorn');
acorn.parse(code, { ecmaVersion: 2020, sourceType: 'module' });

// esprima
const esprima = require('esprima');
esprima.parseScript(code);

// if you just need the import & export
// es-module-lexer is blazing fast, it is written in c, and loaded through web-assembly
// is what powers vite for parsing dependencies
const { init, parse } = require('es-module-lexer');
await init;
const [imports, exports] = parse(code);

...and if you forgot about your tree-traversal algorithm ๐Ÿ˜จ, here are some libraries that can help you out

// babel
const traverse = require('@babel/traverse').default;
traverse(ast, {
  ImportDeclaration(node) {},
});

// acorn
walk.simple(ast, {
  ImportDeclaration(node) {},
});

// estree-walker
const { walk } = require('estree-walker');
walk(ast, {
  enter(node) {},
  leave(node) {},
});

Some other useful links

3. Now knowing what are the names you are importing from, you need to figure out their actual file path

that depends on

  • the current file path
  • the name you are importing from
resolve('a/b/app.js', './calculate.js');
// a/b/calculate.js
resolve('a/b/app.js', '../measurements.js');
// a/measurements.js
resolve('a/b/app.js', 'formulas');
// node_modules/formulas/src/index.js

That leads us to the Node.js Module Resolution Algorithm

It describes the steps taken to resolve the file.

there are 3 scenarios in general:

  • load as file
  • load as directory
  • load as node_modules

node js module resolution algorithm

Some other module resolution:

4๏ธโƒฃ After you figured the file path you're importing from, for each of the file, ๐Ÿ” repeat step 2๏ธโƒฃ until no more new files to be found.

Assignment

Test cases

For each test cases, we provide the entry file, and we expect

๐Ÿ“ Module

  • filepath
  • dependencies -> list of Depedencies (see below ๐Ÿ‘‡)
  • isEntryFile -> true if it is the entry file / false otherwise

๐Ÿ“ Depedencies

  • module (see above โ˜๏ธ)
  • exports -> list of var names you are importing, eg "default", "measure" ..

๐Ÿ“ If 2 module are importing the same module, both should be referring to the same module instance

moduleCFromModuleA === moduleCFromModuleB;

๐Ÿ“ Be careful with circular dependency ๐Ÿ™ˆ

Week 2 - Bundling

๐Ÿค” How do you bundle modules into 1 file?

After studying the 2 most popular bundlers, webpack and rollup, i found that the way they bundle are very different.

Both of them come a long way, I believe both has its own pros and cons

// circle.js
const PI = 3.141;
export default function area(radius) {
  return PI * radius * radius;
}

// square.js
export default function area(side) {
  return side * side;
}

// app.js
import squareArea from './square';
import circleArea from './circle';

console.log('Area of square: ', squareArea(5));
console.log('Area of circle', circleArea(5));

๐Ÿ”ญ Observation: Bundle using webpack

  • ๐Ÿ“ each module wrap in a function
  • ๐Ÿ“ a module map, module identifier as key
  • ๐Ÿ“ a runtime glue code to piece modules together
  • ๐Ÿ“ calling module function, with 2 parameters, 1 to assign the exports of the module, 1 to "require" other modules
// webpack-bundle.js
const modules = {
  'circle.js': function(__exports, __getModule) {
    const PI = 3.141;
    __exports.default = function area(radius) {
      return PI * radius * radius;
    }
  },
  'square.js': function(__exports, __getModule) {
    __exports.default = function area(side) {
      return side * side;
    }
  },
  'app.js': function(__exports, __getModule) {
    const squareArea = __getModule('square.js').default;
    const circleArea = __getModule('circle.js').default;
    console.log('Area of square: ', squareArea(5))
    console.log('Area of circle', circleArea(5))
  }
}
webpackRuntime({
  modules,
  entry: 'app.js'
});

๐Ÿ”ญ Observation: Bundle using rollup

  • ๐Ÿ“ much flatter bundle
  • ๐Ÿ“ module are concatenated in topological order
  • ๐Ÿ“ exports and imports are removed by renaming them to the same variable name
  • ๐Ÿ“ any variable in module scope that may have naming conflict with other variables are renamed
// rollup-bundle.js
const PI = 3.141;

function circle$area(radius) {
    return PI * radius * radius;
}
function square$area(side) {
    return side * side;
}

console.log('Area of square: ', square$area(5));
console.log('Area of circle', circle$area(5));

๐Ÿ“ค Output target of bundling

Assignment

Test cases

Here are some of the the interesting test cases:

๐Ÿงช Able to handle re-export nicely

// a.js
export * as b from './b';
export * from './c';
export { d } from './d';

// main.js
import * as a from './a';

console.log(a);

๐Ÿงช Importing the same file twice, but are you able to make sure it's gonna be evaluated only once?

// a.js
import './c';

// b.js
import './c';

// c.js
console.log('c.js');

// main.js
import './a';
import './b';

๐Ÿงช The dreaded circular dependency, are you able to make sure to get the value of a, b, c in all the files?

// a.js
import { b } from './b';
import { c } from './c';

export const a = 'a';

setTimeout(() => {
  console.log(`a.js | b=${b} | c=${c}`);
});

// b.js
import { a } from './a';
import { c } from './c';

export const b = 'b';

setTimeout(() => {
  console.log(`b.js | a=${a} | c=${c}`);
});

// c.js
import { a } from './a';
import { b } from './b';

export const c = 'c';

setTimeout(() => {
  console.log(`c.js | a=${a} | b=${b}`);
});

// main.js
import { a } from './a';
import { b } from './b';
import { c } from './c';

setTimeout(() => {
  console.log(`main.js | a=${a} | b=${b} | c=${c}`);
});

๐Ÿงช Are you able to export a variable before it is declared? Does the order matter?

// a.js
let a = 'a';

export { a, b };

let b = 'b';

// main.js
import { a, b } from './a';

console.log('a = ' + a);
console.log('b = ' + b);

๐Ÿงช imported variables is not a normal variable, it's a live binding of the exported variable. Are you able to make sure that the value of count is always up to date?

// data.js
export let count = 1;

export function increment() {
  count++;
}

// a.js
import { count, increment } from './data';

console.log('count = ' + count);
increment();
console.log('count = ' + count);

// b.js
import { count, increment } from './data';

console.log('count = ' + count);
increment();
console.log('count = ' + count);

// main.js
import './a';
import './b';

๐Ÿ“ Be careful with circular dependency ๐Ÿ™ˆ

๐Ÿ”จ Manipulating AST

๐Ÿ“– manipulating ast with javascript (generic) ๐Ÿ“– babel plugin handbook (babel)