Top Related Projects
Parsimmon: A monadic LL(infinity) parser combinator library for javascript
PEG.js: Parser generator for JavaScript
Ohm: A library and language for building parsers, interpreters, compilers, etc.
Chevrotain: Parser Building Toolkit for JavaScript
Jison: Bison in JavaScript.
Quick Overview
Nearley is a parsing toolkit for JavaScript built on the Earley algorithm. You describe a language in a BNF-like grammar file, and Nearley compiles it into a JavaScript module that drives the parser. Because Earley parsing accepts any grammar you can define in BNF, Nearley handles left-recursive and ambiguous grammars, and for ambiguous input it returns every possible parse.
Pros
- Easy-to-learn grammar syntax based on the Backus-Naur Form (BNF)
- Supports both browser and Node.js environments
- Generates fast and efficient parsers
- Handles ambiguous grammars and provides all possible parse trees
Cons
- Learning curve for complex grammars and advanced features
- Limited built-in error reporting and recovery mechanisms
- May be overkill for simple parsing tasks
- Documentation could be more comprehensive for advanced use cases
Code Examples
- Defining a simple arithmetic grammar:
@{%
const moo = require("moo");
// Every character the grammar matches needs a token, including parentheses.
const lexer = moo.compile({
  number: /[0-9]+/,
  plus: "+",
  minus: "-",
  times: "*",
  divide: "/",
  lparen: "(",
  rparen: ")",
  ws: /[ \t]+/
});
%}
@lexer lexer
expression -> term (_ addop _ term):*
term -> factor (_ mulop _ factor):*
factor -> number | "(" _ expression _ ")"
addop -> "+" | "-"
mulop -> "*" | "/"
number -> %number {% id %}
_ -> %ws:*
- Parsing a string using the generated parser:
const nearley = require("nearley");
const grammar = require("./arithmetic.js");
const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
parser.feed("3 + 4 * (2 - 1)");
console.log(parser.results);
- Using a postprocessor to simplify the parse tree:
@{%
const postprocessor = {
  // Folds "first (op term)*" into a left-associative binary tree.
  expression: ([first, rest]) => {
    return rest.reduce((acc, [, op, , term]) => ({
      type: "binary",
      operator: op,
      left: acc,
      right: term
    }), first);
  },
  // Converts a number token into an AST leaf (explicit base 10).
  number: ([n]) => ({ type: "number", value: parseInt(n.value, 10) })
};
%}
expression -> term (_ addop _ term):* {% postprocessor.expression %}
term -> factor (_ mulop _ factor):* {% postprocessor.expression %}
factor -> number {% id %} | "(" _ expression _ ")" {% ([,, expr]) => expr %}
number -> %number {% postprocessor.number %}
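The fold inside postprocessor.expression can be tried on its own, without compiling a grammar. The sketch below is a minimal illustration under stated assumptions: foldBinary and num are hypothetical names, the input array is hand-built in the shape Nearley passes for a rule like term (_ addop _ term):*, whitespace matches are shown as null, and operators are simplified to plain strings.

```javascript
// Same reduction as postprocessor.expression, as a standalone function.
function foldBinary([first, rest]) {
  return rest.reduce(
    (acc, [, op, , term]) => ({ type: "binary", operator: op, left: acc, right: term }),
    first
  );
}

// Hypothetical helper that builds a number leaf.
const num = (value) => ({ type: "number", value });

// Hand-built match data standing for "1 - 2 + 3":
// [first, [[ws, op, ws, term], ...]] with whitespace shown as null.
const data = [num(1), [[null, "-", null, num(2)], [null, "+", null, num(3)]]];

const tree = foldBinary(data);
console.log(JSON.stringify(tree, null, 2));
```

Because reduce walks the repetitions from left to right, the earlier operator ends up deeper in the tree, so "1 - 2 + 3" groups as (1 - 2) + 3.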
Getting Started
- Install Nearley:
    npm install nearley
- Create a grammar file (e.g., mygrammar.ne) and define your grammar rules.
- Compile the grammar:
    npx nearleyc mygrammar.ne -o mygrammar.js
- Use the generated parser in your JavaScript code:
    const nearley = require("nearley");
    const grammar = require("./mygrammar.js");
    const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
    parser.feed("your input string");
    console.log(parser.results);
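When reading parser.results, it helps to remember that it is an array: it is empty while the input is still incomplete, holds one entry for an unambiguous parse, and holds several entries when the grammar is ambiguous. In addition, feed throws when it meets input that no rule can accept. A minimal sketch of handling these cases follows; parseAll and makeStubParser are hypothetical names, and the stub only stands in for a real nearley.Parser so that the snippet runs without a compiled grammar.

```javascript
// Hypothetical helper: feeds `input` to a nearley-style parser and
// classifies the outcome. `parser` only needs feed() and results.
function parseAll(parser, input) {
  try {
    parser.feed(input);
  } catch (err) {
    // feed() throws when it meets input that no rule can accept.
    return { status: "error", message: err.message, results: [] };
  }
  const results = parser.results;
  if (results.length === 0) return { status: "incomplete", results };
  if (results.length > 1) return { status: "ambiguous", results };
  return { status: "ok", results };
}

// Stand-in for `new nearley.Parser(...)`, used only to keep the sketch runnable.
function makeStubParser(resultsFor) {
  return {
    results: [],
    feed(chunk) {
      const outcome = resultsFor(chunk);
      if (outcome instanceof Error) throw outcome;
      this.results = outcome;
    },
  };
}

const stub = makeStubParser((chunk) =>
  chunk === "1 + 2" ? [["1", "+", "2"]] : new Error("Unexpected input")
);
console.log(parseAll(stub, "1 + 2").status); // prints "ok"
```

With a real parser, the same helper would be called as parseAll(parser, "your input string").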
Competitor Comparisons
Parsimmon: A monadic LL(infinity) parser combinator library for javascript
Pros of Parsimmon
- More flexible and expressive, allowing for easier creation of complex parsers
- Better performance for certain types of grammars
- Easier to integrate with existing JavaScript code and workflows
Cons of Parsimmon
- Steeper learning curve for those unfamiliar with parser combinators
- Less suitable for generating parsers from formal grammar specifications
- May require more manual optimization for large-scale parsing tasks
Code Comparison
Nearley example:
main -> "hello" ws "world" {% function(d) { return d.join(""); } %}
ws -> [ \t\n\v\f]:+ {% function(d) { return d[0].join(""); } %}
Parsimmon example:
const Parsimmon = require("parsimmon");
const parser = Parsimmon.seq(
  Parsimmon.string("hello"),
  Parsimmon.regexp(/[ \t\n\v\f]+/),
  Parsimmon.string("world")
).map(parts => parts.join(""));
Both Nearley and Parsimmon are popular parsing libraries for JavaScript, but they take different approaches. Nearley uses a more traditional grammar-based approach, while Parsimmon uses parser combinators. The choice between them depends on the specific requirements of your project, your familiarity with different parsing techniques, and the complexity of the language you're parsing.
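To make the parser-combinator idea concrete, here is a minimal sketch in plain JavaScript. It is not Parsimmon's implementation or API: the functions string, regexp, seq and map are hypothetical stand-ins that only mirror the names used in the comparison, and each parser is modelled as a function from (input, pos) to a result object.

```javascript
// A parser is a function (input, pos) -> { ok, value, pos }.
const string = (s) => (input, pos) =>
  input.startsWith(s, pos)
    ? { ok: true, value: s, pos: pos + s.length }
    : { ok: false, pos };

const regexp = (re) => {
  // The sticky flag makes exec() match only at lastIndex.
  const sticky = new RegExp(re.source, "y");
  return (input, pos) => {
    sticky.lastIndex = pos;
    const m = sticky.exec(input);
    return m ? { ok: true, value: m[0], pos: pos + m[0].length } : { ok: false, pos };
  };
};

// Runs parsers one after another and collects their values.
const seq = (...parsers) => (input, pos) => {
  const values = [];
  for (const p of parsers) {
    const r = p(input, pos);
    if (!r.ok) return r;
    values.push(r.value);
    pos = r.pos;
  }
  return { ok: true, value: values, pos };
};

// Transforms the value of a successful parse.
const map = (parser, fn) => (input, pos) => {
  const r = parser(input, pos);
  return r.ok ? { ok: true, value: fn(r.value), pos: r.pos } : r;
};

const hello = map(
  seq(string("hello"), regexp(/[ \t\n\v\f]+/), string("world")),
  (parts) => parts.join("")
);
console.log(hello("hello   world", 0));
```

The grammar lives in ordinary function calls rather than in a separate grammar file, which is the main practical difference from the Nearley version.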
PEG.js: Parser generator for JavaScript
Pros of PEG.js
- More mature and widely adopted project with a larger community
- Supports both browser and Node.js environments out of the box
- Offers an online playground for quick testing and experimentation
Cons of PEG.js
- Limited to PEG (Parsing Expression Grammar) syntax, which may be less flexible for certain use cases
- Can be slower for parsing large inputs compared to some alternatives
Code Comparison
PEG.js grammar example:
start
  = additive
additive
  = left:multiplicative "+" right:additive { return left + right; }
  / multiplicative
multiplicative
  = left:primary "*" right:multiplicative { return left * right; }
  / primary
primary
  = integer
  / "(" additive:additive ")" { return additive; }
integer "integer"
  = digits:[0-9]+ { return parseInt(digits.join(""), 10); }
Nearley grammar example:
main -> AS {% id %}
AS -> AS "+" MD {% ([a, _, b]) => a + b %}
    | MD {% id %}
MD -> MD "*" P {% ([a, _, b]) => a * b %}
    | P {% id %}
P -> "(" AS ")" {% ([, a]) => a %}
    | N {% id %}
N -> [0-9]:+ {% ([digits]) => parseInt(digits.join(""), 10) %}
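Both examples parse arithmetic expressions, and they differ in more than syntax. The PEG.js grammar is right-recursive (additive appears to the right of "+"), because PEG.js cannot handle left-recursive rules, while the Nearley grammar is left-recursive (AS -> AS "+" MD), which the Earley algorithm accepts directly.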
Ohm: A library and language for building parsers, interpreters, compilers, etc.
Pros of Ohm
- More flexible and expressive grammar syntax
- Better support for incremental parsing and error recovery
- Extensive documentation and examples
Cons of Ohm
- Steeper learning curve for beginners
- Slightly slower parsing performance for some use cases
Code Comparison
Nearley grammar example:
main -> "hello" ws noun ws "!" {% ([,, noun]) => `Hello, ${noun}!` %}
noun -> "world" | "universe"
ws -> [ \t\n\v\f\r]:*
Ohm grammar example:
Main {
  greeting = "hello" ws noun ws "!"
  noun = "world" | "universe"
  ws = (" " | "\t" | "\n")*
}
Both Nearley and Ohm are powerful parser generators, but they have different approaches to grammar definition and parsing. Nearley uses a more traditional BNF-like syntax, while Ohm employs a custom grammar language that resembles extended BNF.
Ohm offers more flexibility in grammar definition and better support for incremental parsing, making it suitable for complex language processing tasks. However, this flexibility comes at the cost of a steeper learning curve for newcomers.
Nearley, on the other hand, is generally easier to get started with and may offer better performance for simpler grammars. It also provides built-in support for ambiguous grammars, which can be useful in certain scenarios.
Ultimately, the choice between Nearley and Ohm depends on the specific requirements of your project and your familiarity with parser generators.
Chevrotain: Parser Building Toolkit for JavaScript
Pros of Chevrotain
- Better performance, especially for large inputs
- More flexible and customizable parsing options
- Extensive documentation and examples
Cons of Chevrotain
- Steeper learning curve
- More verbose syntax for grammar definition
- Requires more boilerplate code
Code Comparison
Nearley grammar example:
expression -> number "+" number {% ([a, _, b]) => a + b %}
number -> [0-9]:+ {% ([digits]) => parseInt(digits.join(""), 10) %}
Chevrotain grammar example:
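const { createToken, Lexer, CstParser } = require("chevrotain");
const Expression = createToken({ name: "Expression", pattern: Lexer.NA });
const Plus = createToken({ name: "Plus", pattern: /\+/ });
// Named NumberLiteral so it does not shadow the global Number.
const NumberLiteral = createToken({ name: "NumberLiteral", pattern: /[0-9]+/ });
class Calculator extends CstParser {
  constructor() {
    super([Expression, Plus, NumberLiteral]);
    this.RULE("expression", () => {
      this.CONSUME(NumberLiteral);
      this.CONSUME(Plus);
      // A repeated token within one rule needs a numbered CONSUME.
      this.CONSUME2(NumberLiteral);
    });
    // Must be called once all rules have been defined.
    this.performSelfAnalysis();
  }
}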
Nearley uses a more concise syntax for grammar definition, while Chevrotain requires more setup but offers greater flexibility and control over the parsing process. Chevrotain's approach may be more familiar to developers used to object-oriented programming, while Nearley's syntax is closer to traditional BNF notation.
Jison: Bison in JavaScript.
Pros of Jison
- More mature and widely adopted project with a larger community
- Supports both LR and LALR parsing algorithms
- Offers a command-line interface for easier integration into build processes
Cons of Jison
- Less active development in recent years
- Steeper learning curve for beginners
- Limited support for streaming input
Code Comparison
Jison:
%lex
%%
\s+ /* skip whitespace */
[0-9]+("."[0-9]+)? return 'NUMBER'
"+" return '+'
"-" return '-'
"*" return '*'
"/" return '/'
Nearley:
@{%
const moo = require("moo");
const lexer = moo.compile({
  ws: /[ \t]+/,
  number: /[0-9]+(?:\.[0-9]+)?/,
  plus: '+',
  minus: '-',
  times: '*',
  divide: '/'
});
%}
@lexer lexer
Both Jison and Nearley are popular parser generators for JavaScript. Jison is more established and offers multiple parsing algorithms, while Nearley is newer and focuses on ease of use. Jison's syntax is more similar to Bison, while Nearley uses a custom grammar format with JavaScript integration. The code examples show how lexical rules are defined in each system, with Jison using a more compact syntax and Nearley leveraging the Moo lexer library for tokenization.
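To show what such lexical rules do at run time, here is a minimal tokenizer sketch in plain JavaScript. It is not how Moo or Jison are implemented: the rules table and the tokenize function are hypothetical, and the sketch simply tries each rule in order at the current position and takes the first one that matches.

```javascript
// Hypothetical token rules mirroring the lexer definitions above.
// The sticky flag ("y") anchors each match at lastIndex.
const rules = [
  ["ws", /[ \t]+/y],
  ["number", /[0-9]+(?:\.[0-9]+)?/y],
  ["plus", /\+/y],
  ["minus", /-/y],
  ["times", /\*/y],
  ["divide", /\//y],
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, re] of rules) {
      re.lastIndex = pos;
      const m = re.exec(input);
      if (m) {
        tokens.push({ type, value: m[0], offset: pos });
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`Unexpected character at offset ${pos}`);
  }
  return tokens;
}

console.log(tokenize("3.5 + 4").map((t) => t.type).join(" "));
// prints "number ws plus ws number"
```

Note one difference that the two snippets above imply: the Jison rule discards whitespace inside the lexer, while this sketch, like the Moo definition, emits ws tokens, so the grammar has to account for them.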
README
nearley ↗️
nearley is a simple, fast and powerful parsing toolkit. It consists of:
- A powerful, modular DSL for describing languages
- An efficient, lightweight Earley parser
- Loads of tools, editor plug-ins, and other goodies!
nearley is a streaming parser with support for catching errors gracefully and providing all parsings for ambiguous grammars. It is compatible with a variety of lexers (we recommend moo). It comes with tools for creating tests, railroad diagrams and fuzzers from your grammars, and has support for a variety of editors and platforms. It works in both node and the browser.
Unlike most other parser generators, nearley can handle any grammar you can define in BNF (and more!). In particular, while most existing JS parsers such as PEGjs and Jison choke on certain grammars (e.g. left recursive ones), nearley handles them easily and efficiently by using the Earley parsing algorithm.
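To illustrate why left recursion is not a problem for the Earley algorithm, here is a compact recognizer sketch in plain JavaScript. It is not nearley's implementation: it only answers yes or no instead of building parse trees, it assumes the grammar has no empty rules, and the grammar format (an object mapping rule names to arrays of right-hand sides, with single characters as terminals) is an assumption made for this example.

```javascript
// Minimal Earley recognizer. Assumes no rule has an empty right-hand side.
function recognize(grammar, start, input) {
  // chart[i] holds items { name, rhs, dot, origin } valid after i characters.
  const chart = Array.from({ length: input.length + 1 }, () => []);
  const add = (i, item) => {
    const dup = chart[i].some(
      (x) => x.name === item.name && x.rhs === item.rhs &&
             x.dot === item.dot && x.origin === item.origin
    );
    if (!dup) chart[i].push(item);
  };
  for (const rhs of grammar[start]) add(0, { name: start, rhs, dot: 0, origin: 0 });

  for (let i = 0; i <= input.length; i++) {
    // chart[i] grows while it is scanned, hence the index-based loop.
    for (let j = 0; j < chart[i].length; j++) {
      const item = chart[i][j];
      const next = item.rhs[item.dot];
      if (next === undefined) {
        // Completer: advance every item that was waiting for item.name.
        for (const parent of chart[item.origin]) {
          if (parent.rhs[parent.dot] === item.name) add(i, { ...parent, dot: parent.dot + 1 });
        }
      } else if (Object.hasOwn(grammar, next)) {
        // Predictor: the duplicate check in add() stops left recursion from looping.
        for (const rhs of grammar[next]) add(i, { name: next, rhs, dot: 0, origin: i });
      } else if (input[i] === next) {
        // Scanner: consume one matching character.
        add(i + 1, { ...item, dot: item.dot + 1 });
      }
    }
  }
  return chart[input.length].some(
    (x) => x.name === start && x.dot === x.rhs.length && x.origin === 0
  );
}

// Left-recursive grammar: E -> E "+" T | T, T -> "n"
const grammar = { E: [["E", "+", "T"], ["T"]], T: [["n"]] };
console.log(recognize(grammar, "E", "n+n+n")); // prints true
console.log(recognize(grammar, "E", "n+"));    // prints false
```

A recursive-descent parser would call E from inside E forever on this grammar; here, predicting E a second time at the same position adds nothing new to the chart, so the loop terminates.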
nearley is used by a wide variety of projects:
- artificial intelligence and computational linguistics classes at universities;
- file format parsers;
- data-driven markup languages;
- compilers for real-world programming languages;
- and nearley itself! The nearley compiler is bootstrapped.
nearley is an npm staff pick.
Documentation
Please visit our website https://nearley.js.org to get started! You will find a tutorial, detailed reference documents, and links to several real-world examples to get inspired.
Contributing
Please read this document before working on nearley. If you are interested in contributing but unsure where to start, take a look at the issues labeled "up for grabs" on the issue tracker, or message a maintainer (@kach or @tjvr on GitHub).
nearley is MIT licensed.
A big thanks to Nathan Dinsmore for teaching me how to Earley, Aria Stewart for helping structure nearley into a mature module, and Robin Windels for bootstrapping the grammar. Additionally, Jacob Edelman wrote an experimental JavaScript parser with nearley and contributed ideas for EBNF support. Joshua T. Corbin refactored the compiler to be much, much prettier. Bojidar Marinov implemented postprocessors-in-other-languages. Shachar Itzhaky fixed a subtle bug with nullables.
Citing nearley
If you are citing nearley in academic work, please use the following BibTeX entry.
@misc{nearley,
author = "Kartik Chandra and Tim Radvan",
title = "{nearley}: a parsing toolkit for {JavaScript}",
year = {2014},
doi = {10.5281/zenodo.3897993},
url = {https://github.com/kach/nearley}
}