🧩 Parsing `.re` files

After getting the feel of Re4 right - just writing count++ without needing useState, .value, or setCount - I had to answer a bigger question:

How do I actually parse this thing?

I didn’t want wrappers like signal() or $state() - I wanted syntax like:

component Counter {
  state count = 0
  computed doubled = count * 2
}

Something the compiler could track easily and we could write naturally.

Which meant I needed to parse .re files with new keywords (state, component, computed, etc.) while still fully supporting JavaScript, TypeScript, and JSX.

First Attempt: Loose Tokenizing

I started off lazy. I didn’t write a full parser - just a loose tokenizer. It scanned for important keywords like state, component, etc. using an enum:

typescript

export enum TokenKind {
  Name,
  Component,
  State,
  Computed,
  Eq,
  LCurly,
  RCurly,
  JsKeyword,
  Eos,
  Unknown,
}

Whenever I saw one of my keywords, I paused and let a real JS parser take over:

component Counter {
  state count = 1
  prop name = "counter"
  return <div>{name} count: {1}</div>
}

When I hit state, I’d parse it manually:

typescript

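function parseState(): Stmt[] {
  expect(TokenKind.State); // consume the `state` keyword

  // Grab the raw source up to the next Re4 keyword, e.g. `count = 1`
  const jsCode = eatJs();

  // Turn it into plain JS so a real parser can handle it: `let count = 1`
  const src = `let ${jsCode}`;
  const program = parse(src, { jsx: true });

  const [decl, ...rest] = program.body;
  if (!isVariableDeclaration(decl)) raiseError();

  return [decl, ...rest];
}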

The eatJs() function would just consume everything until the next keyword or end of block:

// Rough idea: consume JS tokens until we hit another Re4 keyword.
// Assumes curToken is the token just consumed (here, `state`) and token is the lookahead.
function eatJs() {
  const start = curToken.span.end;
  let last = curToken;
  while (isInComponentBlockScope() && !isRe4Keyword(token)) {
    last = nextToken();
  }
  return source.slice(start, last.span.end);
}

It worked! For a while.

JSX Broke Everything

Then JSX showed up. This broke everything:

tsx

<div state="1">hello</div>

My tokenizer saw state and thought it was a keyword. But it was just an attribute.

Worse:

<div>That's bad</div>

The ' was treated as a JS string start — but it was just JSX text.

Even:

tsx

<component></component>

Would trigger my component block parser, even though it was just a tag.

I kept patching:

Marking some tokens as Unknown
Skipping others
Trying to guess if I was inside JSX or JS

But it was clear: Loose tokenizing wasn’t going to scale.
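
To make the failure mode concrete, here is a minimal sketch of a context-free keyword scan. The `findKeywordHits` helper and the `RE4_KEYWORDS` set are hypothetical stand-ins, not the tokenizer from this post; they only show why matching bare words cannot tell a real `state` declaration from a JSX attribute or tag name.

```typescript
// Hypothetical stand-in for the loose tokenizer: it flags every bare word
// that matches a Re4 keyword and has no idea whether it is inside JSX.
const RE4_KEYWORDS = new Set(["component", "state", "prop", "computed"]);

function findKeywordHits(source: string): string[] {
  // Split the source into identifier-like words, ignoring all surrounding syntax.
  const words: string[] = source.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) || [];
  return words.filter((word) => RE4_KEYWORDS.has(word));
}

// A genuine declaration: one hit, as intended.
const realKeyword = findKeywordHits("state count = 1");

// A JSX attribute: still reported as a keyword hit.
const jsxAttribute = findKeywordHits('<div state="1">hello</div>');

// A JSX tag: reported twice, once for the opening and once for the closing tag.
const jsxTag = findKeywordHits("<component></component>");

console.log(realKeyword, jsxAttribute, jsxTag);
```

All three inputs produce hits, and nothing in the word stream says which ones are real. Telling them apart needs the surrounding grammar, which is exactly what a parser tracks.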

Why I Thought It’d Be Easy

Honestly, it felt like it should be. component {} looks like try {} or function {} - I figured I could treat it like a block and move on.

But we need something more context-aware.

For example:

tsx

<div class="flex">hello</div>

Even though class is a keyword, here it is just an attribute name, and that is totally valid. It is similar to await, which is only treated as a keyword inside async functions.

Loose parsing couldn’t tell when keywords were actually keywords.

I wasn’t building a tokenizer anymore. I was faking a parser. So I stopped.

Exploring Real Parsers

I looked into:

Oxc: Fast, great TS+JSX support, Rust-based
SWC: Similar, Rust-based
Babel: Heavy
Acorn: Tiny, readable, plugin-friendly

I loved Oxc, but I didn’t want to maintain a Rust fork just to parse a few keywords. So I circled back to Acorn.

🧠 Enter Acorn

Acorn is small, simple, and has a TypeScript plugin.

So my checklist:

JS ✅
TS ✅
JSX ✅

Now I just needed to support my syntax. So I wrote a plugin.

It added support for:

component, state, prop, computed
effect, mount, unmount

🔌 Acorn Plugins

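An Acorn plugin is just a function that takes the current parser class and returns a class extending it:

typescript

type Plugin = (BaseParser: typeof acorn.Parser) => typeof acorn.Parser;

function re4Plugin(BaseParser: typeof acorn.Parser) {
  // Extend the class we were given, not acorn.Parser directly,
  // so plugins earlier in the chain are preserved.
  return class extends BaseParser {
    // add logic here
  };
}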

You can chain multiple plugins like this:

const MyParser = Parser.extend(tsPlugin, jsxPlugin, re4Plugin);
const ast = MyParser.parse('code', {
  /* options */
});

Plugin Composition Pain

Acorn plugins are chained like:

Parser.extend(tsPlugin, re4Plugin);

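But internally each plugin subclasses the result of the previous one. When two plugins override the same method, like parseStatement(), the last plugin's version is the one that runs, and the earlier ones only run if it calls super.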

Also, acorn-typescript overrides readWord() to detect TS keywords. If I override it for Re4, I lose TS support.
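
Here is a small sketch of that composition behavior. `MiniParser`, `classify`, and the three plugins are hypothetical stand-ins so the example runs without Acorn installed; only the `extend` loop is meant to mirror how `Parser.extend` wraps each plugin around the class built so far.

```typescript
// MiniParser is a hypothetical stand-in for acorn's Parser.
class MiniParser {
  classify(_word: string): string {
    return "name";
  }

  // Same shape as Parser.extend: each plugin wraps the class built so far.
  static extend(...plugins: MiniPlugin[]): typeof MiniParser {
    let cls: typeof MiniParser = this;
    for (const plugin of plugins) cls = plugin(cls);
    return cls;
  }
}

type MiniPlugin = (Base: typeof MiniParser) => typeof MiniParser;

// Plays the role of the TypeScript plugin.
const tsLike: MiniPlugin = (Base) =>
  class extends Base {
    classify(word: string): string {
      return word === "interface" ? "ts-keyword" : super.classify(word);
    }
  };

// Overrides classify() without calling super, so tsLike never runs.
const greedyRe4: MiniPlugin = (Base) =>
  class extends Base {
    classify(word: string): string {
      return word === "state" ? "re4-keyword" : "name";
    }
  };

// Handles its own keyword, then defers to the rest of the chain.
const cooperativeRe4: MiniPlugin = (Base) =>
  class extends Base {
    classify(word: string): string {
      return word === "state" ? "re4-keyword" : super.classify(word);
    }
  };

const Greedy = MiniParser.extend(tsLike, greedyRe4);
const Cooperative = MiniParser.extend(tsLike, cooperativeRe4);

const greedyResult = new Greedy().classify("interface");
const cooperativeResult = new Cooperative().classify("interface");
console.log(greedyResult, cooperativeResult);
```

The greedy chain loses the TS keyword, while the cooperative one keeps it. In a real tokenizer method, deferring to super is not always this simple, which is why the approach below changes the TS plugin itself instead.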

So I forked acorn-typescript. And added a hook system:

class TsParser extends Parser {
  readWordHooks: ((word: string) => TokenType | undefined)[] = [];

  readWord() {
    const word = this.readWord1();

    // Give hooks the first chance to classify the word
    for (const hook of this.readWordHooks) {
      const type = hook(word);
      if (type) {
        return this.finishToken(type, word);
      }
    }

    // .. original code
    let type = tt.name;

    if (this.keywords.test(word)) {
      type = jsTokens[word];
    } else if (new RegExp(tsKeywordsRegex).test(word)) {
      type = tsTokens[word];
    }

    return this.finishToken(type, word);
  }
}

Now I can inject my keywords without breaking TS:

// Extends the forked TsParser, which is where readWordHooks lives
class Re4Parser extends TsParser {
  constructor(...args: any[]) {
    super(...args);
    this.readWordHooks.push(readWordHook);
  }
}

function readWordHook(word: string) {
  if (re4Keywords.has(word)) return re4KeywordTokenTypes[word];
  return undefined;
}

Done. Acorn recognizes our tokens 🎉
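
The hook pattern can be tried in isolation. This is a minimal sketch with hypothetical stand-ins (`HookedTokenizer`, `Re4Tokenizer`, string token kinds, and a made-up TS keyword list) rather than the forked acorn-typescript code; it only models the order in which a word gets classified.

```typescript
type WordHook = (word: string) => string | undefined;

// Hypothetical stand-in for the forked TS parser: only word classification is modelled.
class HookedTokenizer {
  readWordHooks: WordHook[] = [];
  private tsKeywords = new Set(["interface", "type", "declare"]);

  classifyWord(word: string): string {
    // Hooks run first, so an extension can claim a word before the TS logic sees it.
    for (const hook of this.readWordHooks) {
      const type = hook(word);
      if (type !== undefined) return type;
    }
    // No hook claimed the word: fall back to the original behavior.
    return this.tsKeywords.has(word) ? "ts-keyword" : "name";
  }
}

class Re4Tokenizer extends HookedTokenizer {
  constructor() {
    super();
    const re4Keywords = new Set(["component", "state", "prop", "computed"]);
    // Register a hook instead of overriding classifyWord().
    this.readWordHooks.push((word) =>
      re4Keywords.has(word) ? `re4:${word}` : undefined
    );
  }
}

const tokenizer = new Re4Tokenizer();
const kinds = ["state", "interface", "count"].map((w) => tokenizer.classifyWord(w));
console.log(kinds);
```

Because the subclass only appends to a list, the TS classification stays untouched and other extensions could register their own hooks the same way.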

Parsing Component Blocks

The heart of Re4 is the component block.

So I override parseStatement() to support it only at the top level:

typescript

if (isTopLevel() && isComponentKeyword()) {
  return this._parseComponent(this.startNode());
}

Inside a component block, parseStatement() then routes state, prop, computed, mount, unmount, and effect to the Re4 handlers:

typescript

// Method on the Re4 parser class
parseStatement(...args) {
  if (isInComponentRootLevel()) {
    // same impl for prop and computed
    if (isContextual(token, re4Tokens.state)) {
      const node = this.startNode() as Re4VariableDeclaration;
      node.reKind = 'state';
      // parseVarStatement consumes the keyword token itself; Acorn handles the rest
      return this.parseVarStatement(node, 'const');
    }

    // Handle lifecycle blocks
    if (isLifeCycleBlockToken(token)) {
      return this._parseLifecycleBlock(this.startNode());
    }
  }

  return super.parseStatement(...args); // allow Acorn to handle the rest
}
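
The routing idea above can be shown with a toy parser. This is a sketch under heavy assumptions: the token stream is pre-split strings, and `Stmt`, `LIFECYCLE`, and `parseStatements` are hypothetical names, not part of Acorn or Re4. It shows only that `state` is treated as a keyword at the component root and as an ordinary word anywhere deeper.

```typescript
type Stmt =
  | { type: "StateDeclaration"; name: string }
  | { type: "LifecycleBlock"; kind: string; body: Stmt[] }
  | { type: "Other"; text: string };

const LIFECYCLE = new Set(["effect", "mount", "unmount"]);

function parseStatements(
  tokens: string[],
  cursor: { pos: number },
  atComponentRoot: boolean
): Stmt[] {
  const out: Stmt[] = [];
  while (cursor.pos < tokens.length && tokens[cursor.pos] !== "}") {
    const tok = tokens[cursor.pos++];
    if (atComponentRoot && tok === "state") {
      // Re4 keyword at the component root: the next token is the variable name.
      out.push({ type: "StateDeclaration", name: tokens[cursor.pos++] });
    } else if (atComponentRoot && LIFECYCLE.has(tok)) {
      cursor.pos++; // skip "{"
      // Nested blocks are no longer at the root, so Re4 keywords are switched off.
      const body = parseStatements(tokens, cursor, false);
      cursor.pos++; // skip "}"
      out.push({ type: "LifecycleBlock", kind: tok, body });
    } else {
      // Everything else falls through, like the super.parseStatement() call.
      out.push({ type: "Other", text: tok });
    }
  }
  return out;
}

const componentBody = parseStatements(
  "state count effect { state }".split(" "),
  { pos: 0 },
  true
);
console.log(JSON.stringify(componentBody, null, 2));
```

The first `state` becomes a declaration, while the one inside `effect { ... }` is left as an ordinary word for the fallback branch to handle.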

Parse Component

typescript

// Method on the Re4 parser class, called as this._parseComponent(...)
_parseComponent(node: ComponentStatement) {
  this.next(); // consume 'component'
  node.id = this.parseIdent();

  // allow return keyword inside component blocks
  this.enterScope(AcornScopes.SCOPE_FUNCTION);

  this.context.push(componentContext);

  // Parse the component body
  node.body = this.parseBlock() as Re4BlockStatement;

  // Pop component context after parsing
  this.context.pop();
  this.exitScope();
  return this.finishNode(node, 'ComponentDeclaration');
}

Parse Lifecycle Blocks

For effect, mount, and unmount, I use the same trick:

typescript

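// Method on the Re4 parser class; node comes from this.startNode() at the call site
_parseLifecycleBlock(node) {
  // Read the kind while the current token is still effect, mount, or unmount
  node.kind = getLifeCycleNodeKind(this.type);
  this.next(); // consume the keyword
  node.body = this.parseBlock();
  return this.finishNode(node, 'LifecycleBlock');
}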

Again, Acorn handles everything - I just route the keywords to the right behavior.

I also overrode two export hooks so components can be exported:

typescript

// allow export component {}
shouldParseExportStatement() {
  return this.type === re4KwTokenTypes.component || super.shouldParseExportStatement();
}
// allow export default component {}
parseExportDefaultDeclaration() {
  if (this.type === re4KwTokenTypes.component) {
    return this._parseComponent(this.startNode());
  }
  return super.parseExportDefaultDeclaration();
}

And that was it. Fully working parser 🎉

Why Acorn Worked

No manual tokenization
Full control over scopes, blocks, and keywords
Clean AST for compilation
TS + JSX work thanks to acorn-typescript
No guessing, no edge cases, no hacks

TL;DR:

I built a parser for Re4 using Acorn with JSX + TS support. Tried a loose parsing strategy, failed with edge cases, then forked acorn-typescript and built a plugin to parse Re4 syntax.

🚀 Up Next

Parsing was step one. Next: the compiler, where count++ becomes a tracked signal and DOM updates happen without boilerplate.

No .value. No setCount. No boilerplate. Just JavaScript — supercharged.

If you're into:

Compilers
Framework internals
UI reactivity experiments

Feel free to follow along. I'll be posting updates as things evolve.

— Aadi (Follow On X)
