
Chevrotain

Parser Building Toolkit for JavaScript


Top Related Projects

  • Parsimmon: A monadic LL(infinity) parser combinator library for JavaScript
  • PEG.js: Parser generator for JavaScript
  • ANTLR (ANother Tool for Language Recognition): A powerful parser generator for reading, processing, executing, or translating structured text or binary files
  • Ohm: A library and language for building parsers, interpreters, compilers, etc.

Quick Overview

Chevrotain is a powerful parsing toolkit for JavaScript that allows developers to build efficient and feature-rich parsers. It provides a unique approach to parser construction, combining the ease of use of parser generators with the flexibility of hand-built recursive descent parsers.

Pros

  • High performance due to its optimized runtime
  • Excellent error recovery capabilities for robust parsing
  • Extensive documentation and examples for easy learning
  • Supports both ECMAScript and TypeScript

Cons

  • Steeper learning curve compared to some simpler parsing libraries
  • May be overkill for very simple parsing tasks
  • Limited support for left-recursive grammars
  • Requires manual lexer definition, which can be verbose for complex grammars

Code Examples

  1. Defining a simple lexer:
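const { createToken, Lexer } = require('chevrotain');

const Integer = createToken({ name: "Integer", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
// Whitespace is matched but skipped, so inputs such as "1 + 2" lex without errors.
const WhiteSpace = createToken({ name: "WhiteSpace", pattern: /\s+/, group: Lexer.SKIPPED });

const allTokens = [WhiteSpace, Integer, Plus];
const lexer = new Lexer(allTokens);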
  2. Creating a basic parser:
const { CstParser } = require('chevrotain');

class Calculator extends CstParser {
    constructor() {
        super(allTokens);
        
        this.RULE("expression", () => {
            this.CONSUME(Integer);
            this.MANY(() => {
                this.CONSUME(Plus);
                // A repeated CONSUME of the same token in one rule needs a distinct index suffix.
                this.CONSUME2(Integer);
            });
        });

        this.performSelfAnalysis();
    }
}
  3. Parsing input:
const parser = new Calculator();
const lexingResult = lexer.tokenize("1 + 2 + 3");
parser.input = lexingResult.tokens;
const cst = parser.expression();

console.log(parser.errors.length === 0 ? "Parsing succeeded" : "Parsing failed");

Getting Started

To start using Chevrotain, first install it via npm:

npm install chevrotain

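Then, in your JavaScript file (the examples here use CommonJS require; on ESM-only releases of Chevrotain, use the equivalent import { ... } from 'chevrotain'):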

const { createToken, Lexer, CstParser } = require('chevrotain');

// Define your tokens
const MyToken = createToken({ name: "MyToken", pattern: /[a-z]+/ });

// Create a lexer
const lexer = new Lexer([MyToken]);

// Define your parser
class MyParser extends CstParser {
    constructor() {
        super([MyToken]);
        this.RULE("myRule", () => {
            this.CONSUME(MyToken);
        });
        this.performSelfAnalysis();
    }
}

// Use the parser
const parser = new MyParser();
const lexingResult = lexer.tokenize("hello");
parser.input = lexingResult.tokens;
const cst = parser.myRule();

This basic setup allows you to start parsing simple inputs with Chevrotain.

Competitor Comparisons

A monadic LL(infinity) parser combinator library for javascript

Pros of Parsimmon

  • Simpler API and easier to learn for beginners
  • More lightweight and focused on parser combinators
  • Better suited for small to medium-sized parsing tasks

Cons of Parsimmon

  • Less performant for complex grammars or large inputs
  • Fewer advanced features and optimizations
  • Limited error reporting and recovery capabilities

Code Comparison

Parsimmon example:

const P = require('parsimmon');

const parser = P.string('hello')
  .then(P.string(' '))
  .then(P.string('world'));

console.log(parser.parse('hello world'));

Chevrotain example:

const { createToken, Lexer, CstParser } = require('chevrotain');

const Hello = createToken({ name: 'Hello', pattern: /hello/ });
const World = createToken({ name: 'World', pattern: /world/ });
const WhiteSpace = createToken({ name: 'WhiteSpace', pattern: /\s+/, group: Lexer.SKIPPED });

const allTokens = [Hello, World, WhiteSpace];
const lexer = new Lexer(allTokens);

class HelloWorldParser extends CstParser {
  constructor() {
    super(allTokens);
    this.RULE('expression', () => {
      this.CONSUME(Hello);
      this.CONSUME(World);
    });
    this.performSelfAnalysis();
  }
}

const parser = new HelloWorldParser();
const lexResult = lexer.tokenize('hello world');
// Assign the token vector to the parser, then invoke the start rule.
parser.input = lexResult.tokens;
parser.expression();

Parsimmon is more concise for simple tasks, while Chevrotain offers more control and structure for complex grammars.


PEG.js: Parser generator for JavaScript

Pros of PEG.js

  • Simpler syntax for grammar definition
  • Built-in support for generating parser code in JavaScript
  • Easier to get started for beginners

Cons of PEG.js

  • Less flexible and customizable than Chevrotain
  • Limited support for error recovery and reporting
  • Slower parsing performance for complex grammars

Code Comparison

PEG.js grammar example:

start
  = additive

additive
  = left:multiplicative "+" right:additive { return left + right; }
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative { return left * right; }
  / primary

primary
  = integer
  / "(" additive:additive ")" { return additive; }

integer "integer"
  = digits:[0-9]+ { return parseInt(digits.join(""), 10); }

Chevrotain grammar example:

const { createToken, Lexer, CstParser } = require("chevrotain")

const Integer = createToken({ name: "Integer", pattern: /[0-9]+/ })
const Plus = createToken({ name: "Plus", pattern: /\+/ })
const Multiply = createToken({ name: "Multiply", pattern: /\*/ })
const LParen = createToken({ name: "LParen", pattern: /\(/ })
const RParen = createToken({ name: "RParen", pattern: /\)/ })

class Calculator extends CstParser {
  constructor() {
    super([Integer, Plus, Multiply, LParen, RParen])
    this.RULE("expression", () => {
      this.SUBRULE(this.additive)
    })
    // ... more rules defined here
    this.performSelfAnalysis() // required once, after all rules are defined
  }
}

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Pros of ANTLR

  • Mature and widely adopted parser generator with extensive documentation
  • Supports multiple target languages (Java, C#, Python, JavaScript, etc.)
  • Powerful grammar notation with built-in support for left-recursion

Cons of ANTLR

  • Steeper learning curve, especially for complex grammars
  • Generated parsers can be slower compared to hand-written ones
  • Less flexibility in customizing the parsing process

Code Comparison

ANTLR grammar example:

grammar Expression;
expr: term (('+' | '-') term)*;
term: factor (('*' | '/') factor)*;
factor: NUMBER | '(' expr ')';
NUMBER: [0-9]+;
WS: [ \t\r\n]+ -> skip;

Chevrotain grammar example:

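const { createToken, Lexer, CstParser } = require("chevrotain");

// Keep a reference to each token type so the parser rules can CONSUME them.
const NumberLiteral = createToken({ name: "Number", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
const Minus = createToken({ name: "Minus", pattern: /-/ });
const LParen = createToken({ name: "LParen", pattern: /\(/ });
const RParen = createToken({ name: "RParen", pattern: /\)/ });

const allTokens = [NumberLiteral, Plus, Minus, LParen, RParen];
const lexer = new Lexer(allTokens);

class ExpressionParser extends CstParser {
  constructor() {
    // The parser constructor takes the token vocabulary, not a lexing result.
    super(allTokens);
    this.RULE("expression", () => {
      this.SUBRULE(this.term);
      this.MANY(() => {
        this.OR([
          { ALT: () => this.CONSUME(Plus) },
          { ALT: () => this.CONSUME(Minus) },
        ]);
        this.SUBRULE2(this.term);
      });
    });
    // ... more rules
  }
}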

A library and language for building parsers, interpreters, compilers, etc.

Pros of Ohm

  • More declarative grammar syntax, making it easier to read and maintain
  • Built-in support for incremental parsing, which can improve performance for large inputs
  • Stronger emphasis on language design and prototyping

Cons of Ohm

  • Steeper learning curve due to its unique approach to grammar definition
  • Less flexibility in terms of lexer customization compared to Chevrotain
  • Smaller community and ecosystem

Code Comparison

Ohm grammar example:

Arithmetic {
  Exp     = AddExp
  AddExp  = AddExp "+" MulExp  -- plus
          | AddExp "-" MulExp  -- minus
          | MulExp
  MulExp  = MulExp "*" PriExp  -- times
          | MulExp "/" PriExp  -- divide
          | PriExp
  PriExp  = "(" Exp ")"        -- paren
          | number
  number  = digit+
}

Chevrotain grammar example:

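const { createToken } = require("chevrotain");
// Tokens for the same arithmetic language; the grammar rules themselves
// would be defined in a CstParser subclass.
const NumberLiteral = createToken({ name: "NumberLiteral", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
const Minus = createToken({ name: "Minus", pattern: /-/ });
const Mult = createToken({ name: "Mult", pattern: /\*/ });
const Div = createToken({ name: "Div", pattern: /\// });
const LParen = createToken({ name: "LParen", pattern: /\(/ });
const RParen = createToken({ name: "RParen", pattern: /\)/ });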

Ohm and Chevrotain are both capable parsing tools, but they cater to different use cases and preferences. Ohm takes a more declarative approach with its own grammar language, while Chevrotain, whose grammars are plain JavaScript with no code generation step, offers more flexibility and control over the parsing process.

Parser Building Toolkit for JavaScript

Pros of Chevrotain

  • Fast parsing performance
  • Extensive documentation and examples
  • Active community and regular updates

Cons of Chevrotain

  • Steeper learning curve for complex grammars
  • Error recovery is opt-in and has to be enabled in the parser configuration

Code Comparison

This entry is Chevrotain itself, so there is nothing to compare. Here is a sample of Chevrotain usage:

const { createToken, Lexer, CstParser } = require("chevrotain");

const Integer = createToken({ name: "Integer", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });

const allTokens = [Integer, Plus];
const CalculatorLexer = new Lexer(allTokens);

class CalculatorParser extends CstParser {
  constructor() {
    super(allTokens);
    this.RULE("expression", () => {
      this.CONSUME(Integer);
      this.MANY(() => {
        this.CONSUME(Plus);
        // A repeated CONSUME of the same token in one rule needs a distinct index suffix.
        this.CONSUME2(Integer);
      });
    });
    this.performSelfAnalysis();
  }
}

This code demonstrates how to create tokens, define a lexer, and implement a simple parser using Chevrotain.


README


Chevrotain

TLDR

Introduction

Chevrotain is a blazing fast and feature rich Parser Building Toolkit for JavaScript with built-in support for LL(K) grammars and a 3rd party plugin for LL(*) grammars. It can be used to build parsers/compilers/interpreters for various use cases ranging from simple configuration files to full-fledged programming languages.

Grammars are written as pure JavaScript sources without a code generation phase.

A more in-depth review of Chevrotain can be found in this great article: Parsing in JavaScript: Tools and Libraries.

Installation

  • npm: npm install chevrotain
  • Browser ESM bundled versions: These can be downloaded directly via UNPKG or other NPM cdn services, e.g.:
    • Latest:
      • https://unpkg.com/chevrotain/lib/chevrotain.mjs
      • https://unpkg.com/chevrotain/lib/chevrotain.min.mjs
    • Explicit version number:
      • https://unpkg.com/chevrotain@11.0.3/lib/chevrotain.mjs
      • https://unpkg.com/chevrotain@11.0.3/lib/chevrotain.min.mjs

Documentation & Resources

Compatibility

Chevrotain will run on any modern JavaScript ES2015 runtime. That includes the Node.js maintenance/active/current versions and modern major browsers, but not legacy ES5.1 runtimes such as IE11.

Contributions

Contributions are greatly appreciated. See CONTRIBUTING.md for details.

Where used

A small curated list:

  1. HyperFormula

    • HyperFormula is an open source, spreadsheet-like calculation engine
    • source
  2. Langium

    • Langium is a language engineering tool with built-in support for the Language Server Protocol.
  3. Prettier-Java

    • A Prettier Plugin for Java
    • source
  4. JHipster Domain Language

    • The JDL is a JHipster-specific domain language where you can describe all your applications, deployments, entities and their relationships in a single file (or more than one) with a user-friendly syntax.
    • source
  5. Argdown

    • Argdown is a simple syntax for analyzing complex argumentation.
    • source
