Writing a Custom TypeScript AST Transformer

Introduction

In my first blog post I would like to walk through a problem I solved recently using TypeScript’s compiler API. I am certain that I would not have been able to get something working without the help of various blogs and StackOverflow answers, so it felt quite selfish to not write about my learnings around a powerful yet lightly documented set of tools.

Topics touched on

TypeScript compiler API basics (parser terminology, transformer API, layered architecture), ASTs, visitor pattern, code generation.

Prerequisite

Vaidehi Joshi has a great article on ASTs that I would suggest reading if you are unfamiliar with the concept. Her basecs series is wonderful and you should check it out.

Problem I am solving

We are using GraphQL at Avero and have been wanting to add some type-safety around resolvers. I came across graphqlgen, whose concept of models solved a lot of the problems I was having. I don’t want to dive too deeply into this topic in this blog post, but I hope to write something in the future that digs into GQL. The tl;dr is that models represent the return value of your query resolvers (which may differ from your GQL schema), and in graphqlgen you associate these models with interfaces using some sort of configuration (a YAML or TypeScript file with type declarations).

At work we run gRPC microservices, and GQL mostly serves as a nice fanout layer for our UI consumers. We already publish TypeScript interfaces that match our proto contracts and I wanted to consume these types in graphqlgen to serve as our models, but ran into some issues due to type export support and the way our TypeScript interfaces are published (heavily namespaced, lots of references).

Like any good open-source citizen, my first approach was to leverage the work already done in the graphqlgen repo and attempt to add a meaningful contribution. To do its type introspection, graphqlgen uses @babel/parser to read the TypeScript (in my case) file and collect information about interface names and declarations (the fields of the interface).

Anytime I want to do anything with ASTs, I immediately pull up astexplorer.net and start playing around. This tool allows us to explore the ASTs generated by many different parsers, including both @babel/parser and the TypeScript compiler parser. It gives us a great way to visualize the data structures we will be working with and to familiarize ourselves with the types of AST nodes for a given parser.

Let’s take a look at an example input file and corresponding AST using babel-parser:

// user.ts
import { protos } from "my_company_protos";

export type User = protos.user.User;
{
  "type": "Program",
  "start": 0,
  "end": 80,
  "loc": {
    "start": {
      "line": 1,
      "column": 0
    },
    "end": {
      "line": 3,
      "column": 36
    }
  },
  "comments": [],
  "range": [0, 80],
  "sourceType": "module",
  "body": [
    {
      "type": "ImportDeclaration",
      "start": 0,
      "end": 42,
      "loc": {
        "start": {
          "line": 1,
          "column": 0
        },
        "end": {
          "line": 1,
          "column": 42
        }
      },
      "specifiers": [
        {
          "type": "ImportSpecifier",
          "start": 9,
          "end": 15,
          "loc": {
            "start": {
              "line": 1,
              "column": 9
            },
            "end": {
              "line": 1,
              "column": 15
            }
          },
          "imported": {
            "type": "Identifier",
            "start": 9,
            "end": 15,
            "loc": {
              "start": {
                "line": 1,
                "column": 9
              },
              "end": {
                "line": 1,
                "column": 15
              },
              "identifierName": "protos"
            },
            "name": "protos",
            "range": [9, 15],
            "_babelType": "Identifier"
          },
          "importKind": null,
          "local": {
            "type": "Identifier",
            "start": 9,
            "end": 15,
            "loc": {
              "start": {
                "line": 1,
                "column": 9
              },
              "end": {
                "line": 1,
                "column": 15
              },
              "identifierName": "protos"
            },
            "name": "protos",
            "range": [9, 15],
            "_babelType": "Identifier"
          },
          "range": [9, 15],
          "_babelType": "ImportSpecifier"
        }
      ],
      "importKind": "value",
      "source": {
        "type": "Literal",
        "start": 23,
        "end": 42,
        "loc": {
          "start": {
            "line": 1,
            "column": 23
          },
          "end": {
            "line": 1,
            "column": 42
          }
        },
        "extra": {
          "rawValue": "my_company_protos",
          "raw": "'my_company_protos'"
        },
        "value": "my_company_protos",
        "range": [23, 42],
        "_babelType": "StringLiteral",
        "raw": "'my_company_protos'"
      },
      "range": [0, 42],
      "_babelType": "ImportDeclaration"
    },
    {
      "type": "ExportNamedDeclaration",
      "start": 44,
      "end": 80,
      "loc": {
        "start": {
          "line": 3,
          "column": 0
        },
        "end": {
          "line": 3,
          "column": 36
        }
      },
      "specifiers": [],
      "source": null,
      "exportKind": "type",
      "declaration": {
        "type": "TypeAlias",
        "start": 51,
        "end": 80,
        "loc": {
          "start": {
            "line": 3,
            "column": 7
          },
          "end": {
            "line": 3,
            "column": 36
          }
        },
        "id": {
          "type": "Identifier",
          "start": 56,
          "end": 60,
          "loc": {
            "start": {
              "line": 3,
              "column": 12
            },
            "end": {
              "line": 3,
              "column": 16
            },
            "identifierName": "User"
          },
          "name": "User",
          "range": [56, 60],
          "_babelType": "Identifier"
        },
        "typeParameters": null,
        "right": {
          "type": "GenericTypeAnnotation",
          "start": 63,
          "end": 79,
          "loc": {
            "start": {
              "line": 3,
              "column": 19
            },
            "end": {
              "line": 3,
              "column": 35
            }
          },
          "typeParameters": null,
          "id": {
            "type": "QualifiedTypeIdentifier",
            "start": 63,
            "end": 79,
            "loc": {
              "start": {
                "line": 3,
                "column": 19
              },
              "end": {
                "line": 3,
                "column": 35
              }
            },
            "qualification": {
              "type": "QualifiedTypeIdentifier",
              "start": 63,
              "end": 74,
              "loc": {
                "start": {
                  "line": 3,
                  "column": 19
                },
                "end": {
                  "line": 3,
                  "column": 30
                }
              },
              "qualification": {
                "type": "Identifier",
                "start": 63,
                "end": 69,
                "loc": {
                  "start": {
                    "line": 3,
                    "column": 19
                  },
                  "end": {
                    "line": 3,
                    "column": 25
                  },
                  "identifierName": "protos"
                },
                "name": "protos",
                "range": [63, 69],
                "_babelType": "Identifier"
              },
              "range": [63, 74],
              "_babelType": "QualifiedTypeIdentifier"
            },
            "range": [63, 79],
            "_babelType": "QualifiedTypeIdentifier"
          },
          "range": [63, 79],
          "_babelType": "GenericTypeAnnotation"
        },
        "range": [51, 80],
        "_babelType": "TypeAlias"
      },
      "range": [44, 80],
      "_babelType": "ExportNamedDeclaration"
    }
  ]
}

The root of our AST (node type of Program) has two statements in its body, an ImportDeclaration and an ExportNamedDeclaration.

Looking at our ImportDeclaration first, there are two properties we are interested in: source and specifiers. These nodes only include information about the source text. For example, the source value is my_company_protos. This gives me no information about whether it is a relative file path or a reference to an external module, so that’s one thing I’d have to solve using the parser approach.

Similarly, in our ExportNamedDeclaration we’re only given basic information about the source text. Namespaces complicate this structure: they can be nested arbitrarily deep, and each level adds another QualifiedTypeIdentifier. This would be another awkward situation we’d need to solve if we continue down this path.
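To make the nesting concrete, here is a minimal sketch of what untangling those QualifiedTypeIdentifiers by hand might look like. The node interfaces are hypothetical and only modeled on the dump above; in particular, the `id` field is an assumption, since the dump only shows `qualification`. It is not graphqlgen’s code.

```typescript
// Hypothetical node shapes modeled on the AST dump above. The `id` field on
// QualifiedTypeIdentifier is an assumption; the dump only shows `qualification`.
interface Identifier {
  type: "Identifier";
  name: string;
}

interface QualifiedTypeIdentifier {
  type: "QualifiedTypeIdentifier";
  qualification: Identifier | QualifiedTypeIdentifier;
  id: Identifier;
}

type TypeName = Identifier | QualifiedTypeIdentifier;

// Walk the left-leaning chain of qualifications and collect each segment.
function flattenQualifiedName(node: TypeName): string[] {
  if (node.type === "Identifier") {
    return [node.name];
  }
  return [...flattenQualifiedName(node.qualification), node.id.name];
}

// protos.user.User written out as the nested structure a parser would hand us
const userReference: TypeName = {
  type: "QualifiedTypeIdentifier",
  qualification: {
    type: "QualifiedTypeIdentifier",
    qualification: { type: "Identifier", name: "protos" },
    id: { type: "Identifier", name: "user" },
  },
  id: { type: "Identifier", name: "User" },
};

console.log(flattenQualifiedName(userReference).join("."));
// prints protos.user.User
```

Even with the name flattened, all we have is a string. We still know nothing about what protos.user.User actually contains.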

I haven’t even gotten to resolving types from imports yet! Given that a parser and its AST are (by design) limited to information in the source text, we’d need to parse any imported files to have that information available in our final AST. But those imports could have their own imports!

It seems like a parser is pretty limited here in solving our problem without a lot of code, so let’s take a step back and think about the problem again.

We don’t want to deal with imports, we don’t want to care about file structure. We want to be able to resolve all of the properties of protos.user.User and inline them instead of relying on imports. How can we get at this type information to begin building this file?

Introduction to TypeScript’s TypeChecker

Since we’ve decided a parser is insufficient to gather type information for imported interfaces, let’s review how the TypeScript compilation process works to see if we can infer where to look next.

One part immediately stands out here:

From a Program instance a TypeChecker can be created. TypeChecker is the core of the TypeScript type system. It is the part responsible for figuring out relationships between Symbols from different files, assigning Types to Symbols, and generating any semantic Diagnostics (i.e. errors).

The first thing a TypeChecker will do is to consolidate all the Symbols from different SourceFiles into a single view, and build a single Symbol Table by “merging” any common Symbols (e.g. namespaces spanning multiple files).

After initializing the original state, the TypeChecker is ready to answer any questions about the program. Such “questions” might be:

What is the Symbol for this Node?

What is the Type of this Symbol?

What Symbols are visible in this portion of the AST?

What are the available Signatures for a function declaration?

What errors should be reported for a file?

The TypeChecker sounds like exactly what we need! We want access to the underlying symbol table and API so we can answer those first two questions: What is the Symbol for this Node? And What is the Type of this Symbol? It even mentions dealing with merging common symbols, so it addresses our namespace problem we talked about earlier!

Soooo, how do we get at this API?

This is one of the few examples I could find online, but it’s enough to get us started. We can see that the checker can be accessed from a method on our Program instance. Looking at the usage in that example, we can see methods such as checker.getSymbolAtLocation and checker.getTypeOfSymbolAtLocation, which seem to be at least some variation of what we need.
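Before moving on, here is a small sketch of those two checker methods answering the two questions for a single variable. The temp directory, file name, and file contents are made up for illustration, and I am assuming a reasonably recent TypeScript version.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";
import * as ts from "typescript";

// Write a throwaway file so createProgram can read it from disk.
// The directory, file name, and contents are made up for this sketch.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "checker-demo-"));
const demoFile = path.join(demoDir, "demo.ts");
fs.writeFileSync(demoFile, "export const total = 1 + 2;\n");

// `types: []` stops the compiler from pulling in unrelated @types packages.
const demoProgram = ts.createProgram([demoFile], { types: [] });
const demoChecker = demoProgram.getTypeChecker();
const demoSource = demoProgram.getSourceFile(demoFile);

const answers: string[] = [];

if (demoSource) {
  ts.forEachChild(demoSource, (node) => {
    if (!ts.isVariableStatement(node)) return;
    for (const declaration of node.declarationList.declarations) {
      // What is the Symbol for this Node?
      const symbol = demoChecker.getSymbolAtLocation(declaration.name);
      if (!symbol) continue;
      // What is the Type of this Symbol?
      const type = demoChecker.getTypeOfSymbolAtLocation(symbol, declaration);
      answers.push(`${symbol.getName()}: ${demoChecker.typeToString(type)}`);
    }
  });
}

console.log(answers);
// prints [ 'total: number' ]
```

Note that getSymbolAtLocation can return undefined, which is why the sketch checks the symbol before asking for its type.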

Let’s start writing our program.

// src/models.ts
import { protos } from "./my_company_protos";

export type User = protos.user.User;

// src/my_company_protos.ts
export namespace protos {
  export namespace user {
    export interface User {
      username: string;
      info: protos.Info.User;
    }
  }
  export namespace Info {
    export interface User {
      name: protos.Info.Name;
    }
    export interface Name {
      firstName: string;
      lastName: string;
    }
  }
}

// src/ts-alias.ts
import ts from "typescript";

// hardcode our input file
const filePath = "./src/models.ts";

// create a program instance, which is a collection of source files
// in this case we only have one source file
const program = ts.createProgram([filePath], {});

// pull off the typechecker instance from our program
const checker = program.getTypeChecker();

// get our models.ts source file AST
const source = program.getSourceFile(filePath);

// create TS printer instance which gives us utilities to pretty print our final AST
const printer = ts.createPrinter();

// helper to give us Node string type given kind
const syntaxToKind = (kind: ts.Node["kind"]) => {
  return ts.SyntaxKind[kind];
};
// visit each direct child of the root node and log its kind
ts.forEachChild(source, (node) => {
  console.log(syntaxToKind(node.kind));
});
ts-node ./src/ts-alias.ts
# prints:
# ImportDeclaration
# TypeAliasDeclaration
# EndOfFileToken
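ts.forEachChild only walks the direct children of the node it is given, which is why only three kinds are printed above. To see the whole tree, the callback has to recurse. Here is a rough sketch that parses a snippet in memory and builds an indented outline; the outlineNode helper and the snippet.ts file name are my own inventions for illustration.

```typescript
import * as ts from "typescript";

// Parse a snippet in memory; no Program or TypeChecker is needed to walk syntax.
const snippet = ts.createSourceFile(
  "snippet.ts",
  "export type User = protos.user.User;",
  ts.ScriptTarget.Latest,
  true // setParentNodes
);

// Collect one line per node, indented by its depth in the tree.
function outlineNode(node: ts.Node, depth = 0, lines: string[] = []): string[] {
  lines.push(`${"  ".repeat(depth)}${ts.SyntaxKind[node.kind]}`);
  ts.forEachChild(node, (child) => {
    // The braces matter: forEachChild stops early if the callback returns
    // a truthy value, so we deliberately return nothing here.
    outlineNode(child, depth + 1, lines);
  });
  return lines;
}

const outline = outlineNode(snippet);
console.log(outline.join("\n"));
```

One caveat with the ts.SyntaxKind[kind] reverse lookup: the enum contains marker members such as FirstNode that share a value with a real kind, so some nodes (QualifiedName, for example) may print under the marker name instead.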

We are concerned with the type alias declaration here, so let’s update our code to focus on that.

ts.forEachChild(source, (node) => {
  if (ts.isTypeAliasDeclaration(node)) {
    console.log(syntaxToKind(node.kind));
  }
});
// prints TypeAliasDeclaration

Now that we have our node, we want to go back to answering the two questions we mentioned earlier: What is the Symbol for this Node? And What is the Type of this Symbol?

ts.forEachChild(source, (node) => {
  if (ts.isTypeAliasDeclaration(node)) {
    // What is the Symbol for this Node?
    const symbol = checker.getSymbolAtLocation(node.name);
    // What is the Type of this Symbol?
    const type = checker.getDeclaredTypeOfSymbol(symbol);
    // each property of the resolved type is itself a Symbol
    const properties = checker.getPropertiesOfType(type);
    properties.forEach((property) => {
      console.log(property.name);
      // prints username, info
    });
  }
});

So we’ve gotten at the property names of the interface behind our type alias by interacting with the TypeChecker’s symbol table. We still have a little ways to go, but this is a great starting point for the introspection side of things.
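As a sketch of the next step, the checker can also tell us the type of each property, even though the interface lives in a different file than the alias. The two files below are trimmed-down, made-up stand-ins for our protos and models, written to a temp directory so the example is self-contained.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";
import * as ts from "typescript";

// Two made-up files: a namespaced interface and an alias that imports it.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "alias-demo-"));
fs.writeFileSync(
  path.join(workDir, "protos.ts"),
  `export namespace protos {
  export namespace user {
    export interface User {
      username: string;
      age: number;
    }
  }
}
`
);
const modelsPath = path.join(workDir, "models.ts");
fs.writeFileSync(
  modelsPath,
  `import { protos } from "./protos";
export type User = protos.user.User;
`
);

// Only models.ts is a root file; the program follows the import on its own.
const program = ts.createProgram([modelsPath], { types: [] });
const checker = program.getTypeChecker();
const models = program.getSourceFile(modelsPath);

// property name -> printed type
const resolved: Record<string, string> = {};

if (models) {
  ts.forEachChild(models, (node) => {
    if (!ts.isTypeAliasDeclaration(node)) return;
    const symbol = checker.getSymbolAtLocation(node.name);
    if (!symbol) return;
    const type = checker.getDeclaredTypeOfSymbol(symbol);
    for (const property of checker.getPropertiesOfType(type)) {
      const propertyType = checker.getTypeOfSymbolAtLocation(property, node);
      resolved[property.getName()] = checker.typeToString(propertyType);
    }
  });
}

console.log(resolved);
// prints { username: 'string', age: 'number' }
```

We never parsed protos.ts ourselves or walked a QualifiedName; the checker followed the import and the namespaces for us.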

Let’s think about generation.

Transformation API

We showed our goal earlier: given a TypeScript file, parse it, introspect it, and create a new TypeScript file. The function signature of AST -> AST is common enough in programming that the TypeScript team released a transformation API so you can write your own!

Let’s write a simple custom transformer before we dive into our initial problem. Thank you to James Garbutt for giving me the boilerplate to start with.

Our first basic transformer will change numeric literals into string literals.

const source = `
  const two = 2;
  const four = 4;
`;

function numberTransformer<T extends ts.Node>(): ts.TransformerFactory<T> {
  return (context) => {
    const visit: ts.Visitor = (node) => {
      if (ts.isNumericLiteral(node)) {
        return ts.createStringLiteral(node.text);
      }
      return ts.visitEachChild(node, (child) => visit(child), context);
    };

    return (node) => ts.visitNode(node, visit);
  };
}

let result = ts.transpileModule(source, {
  compilerOptions: { module: ts.ModuleKind.CommonJS },
  transformers: { before: [numberTransformer()] },
});

console.log(result.outputText);

/*
  var two = "2";
  var four = "4";
*/

The most important interfaces to worry about here are Visitor and VisitResult:

type Visitor = (node: Node) => VisitResult<Node>;
type VisitResult<T extends Node> = T | T[] | undefined;

Our goal as the author of a custom transformer is to write this Visitor. We are recursively visiting each node in our AST and returning a VisitResult, which may be one, many, or zero AST nodes. We can target specific nodes to modify while leaving the rest alone.
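To illustrate the “one, many, or zero” part, here is a sketch of a visitor that does all three in one pass. It is a toy example of my own, not part of the final solution, and it assumes TypeScript 4.0 or newer because it uses context.factory rather than the older ts.create* functions.

```typescript
import * as ts from "typescript";

const input = ts.createSourceFile(
  "input.ts",
  "debugger;\nconst a = 1, b = 2;\n",
  ts.ScriptTarget.Latest,
  true // setParentNodes
);

const oneManyZero: ts.TransformerFactory<ts.SourceFile> = (context) => {
  const { factory } = context;

  const visit: ts.Visitor = (node) => {
    // zero nodes: returning undefined removes the statement
    if (ts.isDebuggerStatement(node)) {
      return undefined;
    }
    // one node: swap a numeric literal for a string literal
    if (ts.isNumericLiteral(node)) {
      return factory.createStringLiteral(node.text);
    }
    const visited = ts.visitEachChild(node, visit, context);
    // many nodes: split `const a = 1, b = 2;` into one statement per declaration
    if (
      ts.isVariableStatement(visited) &&
      visited.declarationList.declarations.length > 1
    ) {
      // keep only the let/const bits of the original flags
      const flags = visited.declarationList.flags & ts.NodeFlags.BlockScoped;
      return visited.declarationList.declarations.map((declaration) =>
        factory.createVariableStatement(
          undefined,
          factory.createVariableDeclarationList([declaration], flags)
        )
      );
    }
    return visited;
  };

  return (sourceFile) => ts.visitEachChild(sourceFile, visit, context);
};

const transformResult = ts.transform(input, [oneManyZero]);
const output = ts.createPrinter().printFile(transformResult.transformed[0]);
transformResult.dispose();

console.log(output);
```

The debugger statement disappears, each literal becomes a string, and the single declaration list comes back as two separate const statements.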

// input
export namespace protos {
  // ModuleDeclaration
  export namespace user {
    // ModuleDeclaration
    // Module Block
    export interface User {
      // InterfaceDeclaration
      username: string; // username: string is PropertySignature
      info: protos.Info.User; // TypeReference
    }
  }
  export namespace Info {
    export interface User {
      name: protos.Info.Name; // TypeReference
    }
    export interface Name {
      firstName: string;
      lastName: string;
    }
  }
}

// this line is a TypeAliasDeclaration
export type User = protos.user.User; // protos.user.User is a TypeReference

// output
export interface User {
  username: string;
  info: {
    // info: { .. } is a TypeLiteral
    name: {
      // name: { .. } is a TypeLiteral
      firstName: string;
      lastName: string;
    };
  };
}

The comments in the input and output above label the AST nodes we will be working with.

Our visitor needs to handle two primary cases:

  1. Replace TypeAliasDeclarations with InterfaceDeclarations
  2. Resolve TypeReferences to TypeLiterals
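Both cases end with us creating nodes that were never in the source text. As a rough sketch of that code generation step on its own, here is an InterfaceDeclaration with a nested TypeLiteral built from scratch and printed. I am assuming TypeScript 4.8 or newer here, where ts.factory.createInterfaceDeclaration no longer takes a leading decorators argument; the property helper and the generated.ts file name are hypothetical.

```typescript
import * as ts from "typescript";

const { factory } = ts;

// `string` as a type node
const stringType = () =>
  factory.createKeywordTypeNode(ts.SyntaxKind.StringKeyword);

// hypothetical helper: builds the member `name: type;`
const property = (name: string, type: ts.TypeNode) =>
  factory.createPropertySignature(undefined, name, undefined, type);

// export interface User { username: string; info: { name: string } }
const userInterface = factory.createInterfaceDeclaration(
  [factory.createModifier(ts.SyntaxKind.ExportKeyword)],
  "User",
  undefined, // type parameters
  undefined, // heritage clauses
  [
    property("username", stringType()),
    property(
      "info",
      factory.createTypeLiteralNode([property("name", stringType())])
    ),
  ]
);

// The printer wants a source file for context; an empty one is enough
// because every node here is synthesized.
const scratchFile = ts.createSourceFile("generated.ts", "", ts.ScriptTarget.Latest);
const generated = ts
  .createPrinter({ newLine: ts.NewLineKind.LineFeed })
  .printNode(ts.EmitHint.Unspecified, userInterface, scratchFile);

console.log(generated);
```

In the real transformer the members will not be written by hand like this; they come from the declarations the TypeChecker resolves for us.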

Solution

Here is what that visitor code looks like with a minimal CLI:

import path from "path";
import ts from "typescript";
import _ from "lodash";
import fs from "fs";

const filePath = path.resolve(_.first(process.argv.slice(2)));

const program = ts.createProgram([filePath], {});
const checker = program.getTypeChecker();
const source = program.getSourceFile(filePath);
const printer = ts.createPrinter();

const typeAliasToInterfaceTransformer: ts.TransformerFactory<ts.SourceFile> = (
  context
) => {
  const visit: ts.Visitor = (node) => {
    node = ts.visitEachChild(node, visit, context);
    /*
      Convert type references to type literals
        interface IUser {
          username: string
        }
        type User = IUser <--- IUser is a type reference
        interface Context {
          user: User <--- User is a type reference
        }
      In both cases we want to convert the type reference to
      its primitive literals. We want:
        interface IUser {
          username: string
        }
        type User = {
          username: string
        }
        interface Context {
          user: {
            username: string
          }
        }
    */
    if (ts.isTypeReferenceNode(node)) {
      const symbol = checker.getSymbolAtLocation(node.typeName);
      const type = checker.getDeclaredTypeOfSymbol(symbol);
      const declarations = _.flatMap(
        checker.getPropertiesOfType(type),
        (property) => {
          /*
          The declarations behind a type reference may themselves contain type
          references, so we need to resolve those to literals as well
        */
          return _.map(property.declarations, visit);
        }
      );
      return ts.createTypeLiteralNode(declarations.filter(ts.isTypeElement));
    }

    /* 
      Convert type alias to interface declaration
        interface IUser {
          username: string
        }
        type User = IUser
    
      We want to remove all type aliases
        interface IUser {
          username: string
        }
        interface User {
          username: string  <-- Also need to resolve IUser
        }
    
    */

    if (ts.isTypeAliasDeclaration(node)) {
      const symbol = checker.getSymbolAtLocation(node.name);
      const type = checker.getDeclaredTypeOfSymbol(symbol);
      const declarations = _.flatMap(
        checker.getPropertiesOfType(type),
        (property) => {
          // Resolve type alias to its literals
          return _.map(property.declarations, visit);
        }
      );

      // Create interface with fully resolved types
      return ts.createInterfaceDeclaration(
        [],
        [ts.createToken(ts.SyntaxKind.ExportKeyword)],
        node.name.getText(),
        [],
        [],
        declarations.filter(ts.isTypeElement)
      );
    }
    // Remove all import declarations
    if (ts.isImportDeclaration(node)) {
      return undefined;
    }

    return node;
  };

  return (node) => ts.visitNode(node, visit);
};

// Run source file through our transformer
const result = ts.transform(source, [typeAliasToInterfaceTransformer]);

// Create our output folder
const outputDir = path.resolve(__dirname, "../generated");
if (!fs.existsSync(outputDir)) {
  fs.mkdirSync(outputDir);
}

// Write pretty printed transformed typescript to output directory
fs.writeFileSync(
  path.resolve(__dirname, "../generated/models.ts"),
  printer.printFile(_.first(result.transformed))
);

I was really happy with how my solution turned out. It goes to show the power of good abstractions, intelligent compiler design, great developer tooling (VSCode autocomplete, AST explorer, etc.) and a bit of outsourcing from other intelligent people’s experiences. The fully updated source code can be found here. I am not sure how useful this will be for anyone outside my narrow use case, but I mostly wanted to show off the power of the TypeScript compiler toolchain, as well as document my thought process on a kind of problem I had not really solved before.

I hope this is helpful to anyone trying to do similar things. If you are intimidated by topics such as ASTs, compilers, and transforms, I hope I gave you enough boilerplate and links to other resources to get started. The code here is my final output after sitting down for extended periods of time learning. Thanks to GitHub private repos, my first attempts at this, including all 45 // @ts-ignores and ! assertions in a 150-line file, can hide in the shadows of shame.