Top Related Projects
- Parsimmon: A monadic LL(infinity) parser combinator library for JavaScript
- PEG.js: Parser generator for JavaScript
- ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
- Ohm: A library and language for building parsers, interpreters, compilers, etc.
- Chevrotain: Parser Building Toolkit for JavaScript
Quick Overview
Chevrotain is a parser building toolkit for JavaScript that allows developers to build efficient and feature-rich parsers. Grammars are written as plain JavaScript code with no code generation phase, which combines the ease of use of parser generators with the flexibility of hand-built recursive descent parsers.
Pros
- High performance due to its optimized runtime
- Excellent error recovery capabilities for robust parsing
- Extensive documentation and examples for easy learning
- Supports both ECMAScript and TypeScript
Cons
- Steeper learning curve compared to some simpler parsing libraries
- May be overkill for very simple parsing tasks
- No support for left-recursive grammars, which must be rewritten (typically as repetitions) before use
- Requires manual lexer definition, which can be verbose for complex grammars
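The left-recursion limitation is a general property of top-down parsing rather than something specific to one library. As an illustration, here is a small sketch in plain JavaScript, independent of Chevrotain's API, of how a left-recursive rule is usually rewritten as a loop. The `parseSubtraction` function and its grammar are hypothetical; in Chevrotain the same loop would typically be expressed with `MANY`.

```javascript
// Left-recursive form (would recurse forever in a recursive descent parser):
//   subtraction : subtraction "-" NUMBER | NUMBER
// Equivalent iterative form used below:
//   subtraction : NUMBER ("-" NUMBER)*
function parseSubtraction(text) {
  // Very small tokenizer: keeps numbers and "-" and ignores everything else.
  const tokens = text.match(/[0-9]+|-/g) || [];
  let pos = 0;

  // Consume one NUMBER token or fail.
  function number() {
    const tok = tokens[pos];
    if (tok === undefined || tok === "-") {
      throw new SyntaxError(`expected a number at token ${pos}`);
    }
    pos += 1;
    return parseInt(tok, 10);
  }

  // Fold from left to right so that "10 - 3 - 2" means (10 - 3) - 2,
  // which is the associativity the left-recursive rule was expressing.
  let value = number();
  while (tokens[pos] === "-") {
    pos += 1;
    value -= number();
  }
  if (pos < tokens.length) {
    throw new SyntaxError(`unexpected token at ${pos}`);
  }
  return value;
}

console.log(parseSubtraction("10 - 3 - 2")); // prints 5
```

The key design point is that the loop accumulates its result as it goes, so the rewrite keeps left associativity even though the grammar itself no longer recurses on the left.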
Code Examples
- Defining a simple lexer:
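const { createToken, Lexer } = require('chevrotain');
const Integer = createToken({ name: "Integer", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
// Whitespace is matched but skipped, so inputs such as "1 + 2" lex without errors
const WhiteSpace = createToken({ name: "WhiteSpace", pattern: /\s+/, group: Lexer.SKIPPED });
const allTokens = [WhiteSpace, Integer, Plus];
const lexer = new Lexer(allTokens);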
- Creating a basic parser:
const { CstParser } = require('chevrotain');
class Calculator extends CstParser {
constructor() {
super(allTokens);
this.RULE("expression", () => {
this.CONSUME(Integer);
this.MANY(() => {
this.CONSUME(Plus);
// A numeric suffix is required when the same token type is consumed twice in one rule
this.CONSUME2(Integer);
});
});
this.performSelfAnalysis();
}
}
- Parsing input:
const parser = new Calculator();
const lexingResult = lexer.tokenize("1 + 2 + 3");
parser.input = lexingResult.tokens;
const cst = parser.expression();
console.log(parser.errors.length === 0 ? "Parsing succeeded" : "Parsing failed");
Getting Started
To start using Chevrotain, first install it via npm:
npm install chevrotain
Then, in your JavaScript file:
const { createToken, Lexer, CstParser } = require('chevrotain');
// Define your tokens
const MyToken = createToken({ name: "MyToken", pattern: /[a-z]+/ });
// Create a lexer
const lexer = new Lexer([MyToken]);
// Define your parser
class MyParser extends CstParser {
constructor() {
super([MyToken]);
this.RULE("myRule", () => {
this.CONSUME(MyToken);
});
this.performSelfAnalysis();
}
}
// Use the parser
const parser = new MyParser();
const lexingResult = lexer.tokenize("hello");
parser.input = lexingResult.tokens;
const cst = parser.myRule();
This basic setup allows you to start parsing simple inputs with Chevrotain.
Competitor Comparisons
A monadic LL(infinity) parser combinator library for javascript
Pros of Parsimmon
- Simpler API and easier to learn for beginners
- More lightweight and focused on parser combinators
- Better suited for small to medium-sized parsing tasks
Cons of Parsimmon
- Less performant for complex grammars or large inputs
- Fewer advanced features and optimizations
- Limited error reporting and recovery capabilities
Code Comparison
Parsimmon example:
const P = require('parsimmon');
const parser = P.string('hello')
.then(P.string(' '))
.then(P.string('world'));
console.log(parser.parse('hello world'));
Chevrotain example:
const { createToken, Lexer, CstParser } = require('chevrotain');
const Hello = createToken({ name: 'Hello', pattern: /hello/ });
const World = createToken({ name: 'World', pattern: /world/ });
const WhiteSpace = createToken({ name: 'WhiteSpace', pattern: /\s+/, group: Lexer.SKIPPED });
const allTokens = [Hello, World, WhiteSpace];
const lexer = new Lexer(allTokens);
class HelloWorldParser extends CstParser {
constructor() {
super(allTokens);
this.RULE('expression', () => {
this.CONSUME(Hello);
this.CONSUME(World);
});
this.performSelfAnalysis();
}
}
const parser = new HelloWorldParser();
const lexResult = lexer.tokenize('hello world');
// Tokens are assigned to the input property; rules are invoked without arguments
parser.input = lexResult.tokens;
parser.expression();
Parsimmon is more concise for simple tasks, while Chevrotain offers more control and structure for complex grammars.
PEG.js: Parser generator for JavaScript
Pros of PEG.js
- Simpler syntax for grammar definition
- Built-in support for generating parser code in JavaScript
- Easier to get started for beginners
Cons of PEG.js
- Less flexible and customizable than Chevrotain
- Limited support for error recovery and reporting
- Slower parsing performance for complex grammars
Code Comparison
PEG.js grammar example:
start
= additive
additive
= left:multiplicative "+" right:additive { return left + right; }
/ multiplicative
multiplicative
= left:primary "*" right:multiplicative { return left * right; }
/ primary
primary
= integer
/ "(" additive:additive ")" { return additive; }
integer "integer"
= digits:[0-9]+ { return parseInt(digits.join(""), 10); }
Chevrotain grammar example:
const { createToken, Lexer, CstParser } = require("chevrotain")
const Integer = createToken({ name: "Integer", pattern: /[0-9]+/ })
const Plus = createToken({ name: "Plus", pattern: /\+/ })
const Multiply = createToken({ name: "Multiply", pattern: /\*/ })
const LParen = createToken({ name: "LParen", pattern: /\(/ })
const RParen = createToken({ name: "RParen", pattern: /\)/ })
class Calculator extends CstParser {
constructor() {
super([Integer, Plus, Multiply, LParen, RParen])
this.RULE("expression", () => {
this.SUBRULE(this.additive)
})
// ... more rules defined here
}
}
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
Pros of ANTLR
- Mature and widely adopted parser generator with extensive documentation
- Supports multiple target languages (Java, C#, Python, JavaScript, etc.)
- Powerful grammar notation with built-in support for left-recursion
Cons of ANTLR
- Steeper learning curve, especially for complex grammars
- Generated parsers can be slower compared to hand-written ones
- Less flexibility in customizing the parsing process
Code Comparison
ANTLR grammar example:
grammar Expression;
expr: term (('+' | '-') term)*;
term: factor (('*' | '/') factor)*;
factor: NUMBER | '(' expr ')';
NUMBER: [0-9]+;
WS: [ \t\r\n]+ -> skip;
Chevrotain grammar example:
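const { createToken, Lexer, CstParser } = require("chevrotain");
// Token types are kept in named constants so the grammar rules can reference them
const NumberLiteral = createToken({ name: "Number", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
const Minus = createToken({ name: "Minus", pattern: /-/ });
const LParen = createToken({ name: "LParen", pattern: /\(/ });
const RParen = createToken({ name: "RParen", pattern: /\)/ });
const allTokens = [NumberLiteral, Plus, Minus, LParen, RParen];
const lexer = new Lexer(allTokens);
class ExpressionParser extends CstParser {
constructor() {
// The parser constructor takes the token vocabulary, not a lexing result
super(allTokens);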
this.RULE("expression", () => {
this.SUBRULE(this.term);
this.MANY(() => {
this.OR([
{ ALT: () => this.CONSUME(Plus) },
{ ALT: () => this.CONSUME(Minus) },
]);
this.SUBRULE2(this.term);
});
});
// ... more rules
}
}
A library and language for building parsers, interpreters, compilers, etc.
Pros of Ohm
- More declarative grammar syntax, making it easier to read and maintain
- Built-in support for incremental parsing, which can improve performance for large inputs
- Stronger emphasis on language design and prototyping
Cons of Ohm
- Steeper learning curve due to its unique approach to grammar definition
- Less flexibility in terms of lexer customization compared to Chevrotain
- Smaller community and ecosystem
Code Comparison
Ohm grammar example:
Arithmetic {
Exp = AddExp
AddExp = AddExp "+" MulExp -- plus
| AddExp "-" MulExp -- minus
| MulExp
MulExp = MulExp "*" PriExp -- times
| MulExp "/" PriExp -- divide
| PriExp
PriExp = "(" Exp ")" -- paren
| number
number = digit+
}
Chevrotain grammar example:
const Integer = createToken({ name: "Integer", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
const Minus = createToken({ name: "Minus", pattern: /-/ });
const Mult = createToken({ name: "Mult", pattern: /\*/ });
const Div = createToken({ name: "Div", pattern: /\// });
Both Ohm and Chevrotain are powerful parsing tools, but they cater to different use cases and preferences. Ohm focuses on a more declarative approach with grammars written in its own language, while Chevrotain grammars are plain JavaScript, which offers more flexibility and control over the parsing process.
Parser Building Toolkit for JavaScript
Pros of Chevrotain
- Fast parsing performance
- Extensive documentation and examples
- Active community and regular updates
Cons of Chevrotain
- Steeper learning curve for complex grammars
- Error recovery is opt-in and must be enabled explicitly
Code Comparison
Both repositories contain the same codebase, as they are the same project. Here's a sample of Chevrotain usage:
const { createToken, Lexer, CstParser } = require("chevrotain");
const Integer = createToken({ name: "Integer", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
const allTokens = [Integer, Plus];
const CalculatorLexer = new Lexer(allTokens);
class CalculatorParser extends CstParser {
constructor() {
super(allTokens);
this.RULE("expression", () => {
this.CONSUME(Integer);
this.MANY(() => {
this.CONSUME(Plus);
// A numeric suffix is required when the same token type is consumed twice in one rule
this.CONSUME2(Integer);
});
});
this.performSelfAnalysis();
}
}
This code demonstrates how to create tokens, define a lexer, and implement a simple parser using Chevrotain.
README
Chevrotain
TLDR
- Online Playground
- Getting Started Tutorial
- YouTube Video: Introduction to Lexers, Parsers and Interpreters with Chevrotain
- Performance benchmark
Introduction
Chevrotain is a blazing fast and feature-rich Parser Building Toolkit for JavaScript with built-in support for LL(K) grammars and a 3rd party plugin for LL(*) grammars. It can be used to build parsers/compilers/interpreters for various use cases ranging from simple configuration files to full-fledged programming languages.
Grammars are written as pure JavaScript sources without a code generation phase.
A more in-depth review of Chevrotain can be found in this article: Parsing in JavaScript: Tools and Libraries.
Installation
- npm:
npm install chevrotain
- Browser ESM bundled versions:
These can be downloaded directly via UNPKG or other NPM cdn services, e.g.:
- Latest:
https://unpkg.com/chevrotain/lib/chevrotain.mjs
https://unpkg.com/chevrotain/lib/chevrotain.min.mjs
- Explicit version number:
https://unpkg.com/chevrotain@11.0.3/lib/chevrotain.mjs
https://unpkg.com/chevrotain@11.0.3/lib/chevrotain.min.mjs
Documentation & Resources
- FAQ
Compatibility
Chevrotain will run on any modern JavaScript ES2015 runtime. That includes Node.js maintenance/active/current versions and modern major browsers, but not legacy ES5.1 runtimes such as IE11.
Contributions
Contributions are greatly appreciated. See CONTRIBUTING.md for details.
Where used
A small curated list:
- HyperFormula is an open source, spreadsheet-like calculation engine
- Langium is a language engineering tool with built-in support for the Language Server Protocol.
- A Prettier Plugin for Java
- The JDL is a JHipster-specific domain language where you can describe all your applications, deployments, entities and their relationships in a single file (or more than one) with a user-friendly syntax.
- Argdown is a simple syntax for analyzing complex argumentation.