nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.

3,696


Top Related Projects

parsimmon

1,256

A monadic LL(infinity) parser combinator library for javascript

pegjs

4,885

PEG.js: Parser generator for JavaScript

ohm

5,324

A library and language for building parsers, interpreters, compilers, etc.

chevrotain

2,632

Parser Building Toolkit for JavaScript

Quick Overview

Nearley is a parsing toolkit for JavaScript built on the Earley parsing algorithm. You describe a language in a BNF-like grammar file, compile it with the nearleyc command, and load the resulting JavaScript module into a nearley parser. Because the Earley algorithm is not restricted to a subset of grammars, Nearley accepts left-recursive and ambiguous grammars, and for ambiguous input it returns every valid parse.

Pros

Easy-to-learn grammar syntax based on the Backus-Naur Form (BNF)
Supports both browser and Node.js environments
Generates fast and efficient parsers
Handles ambiguous grammars and provides all possible parse trees
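
To make the last point concrete: under an ambiguous grammar such as E -> E "-" E | number, the input 1-2-3 has two parse trees. The sketch below is plain JavaScript and does not use nearley; the function name allTrees is made up for illustration. It enumerates every grouping of a flat token list, which is the set of results one would expect an ambiguity-preserving parser to report for that grammar.

```javascript
// Enumerate every parse tree for the ambiguous grammar
//   E -> E "-" E | number
// over a flat token list such as [1, "-", 2, "-", 3].
// Illustration only: this is not how nearley works internally.
function allTrees(tokens) {
  if (tokens.length === 1) return [tokens[0]];
  const trees = [];
  // Try each operator position (odd indices) as the root of the tree.
  for (let i = 1; i < tokens.length; i += 2) {
    for (const left of allTrees(tokens.slice(0, i))) {
      for (const right of allTrees(tokens.slice(i + 1))) {
        trees.push([left, tokens[i], right]);
      }
    }
  }
  return trees;
}

const trees = allTrees([1, "-", 2, "-", 3]);
console.log(trees.length);
console.log(JSON.stringify(trees));
```

The two trees group the input as 1-(2-3) and (1-2)-3. When only one grouping is wanted, the usual fix is to rewrite the grammar so that it is unambiguous.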

Cons

Learning curve for complex grammars and advanced features
Limited built-in error reporting and recovery mechanisms
May be overkill for simple parsing tasks
Documentation could be more comprehensive for advanced use cases

Code Examples

Defining a simple arithmetic grammar:

@{%
const moo = require("moo");

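const lexer = moo.compile({
  number: /[0-9]+/,
  plus: "+",
  minus: "-",
  times: "*",
  divide: "/",
  // The grammar below matches "(" and ")", so the lexer must emit them too.
  lparen: "(",
  rparen: ")",
  ws: /[ \t]+/
});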
%}

@lexer lexer

expression -> term (_ addop _ term):*
term -> factor (_ mulop _ factor):*
factor -> number | "(" _ expression _ ")"

addop -> "+" | "-"
mulop -> "*" | "/"

number -> %number {% id %}
_ -> %ws:*

Parsing a string using the generated parser:

const nearley = require("nearley");
const grammar = require("./arithmetic.js");

const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));

parser.feed("3 + 4 * (2 - 1)");
console.log(parser.results);
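
parser.results is an array with one entry per possible parsing, and it is empty when the input fed so far is incomplete. A caller therefore usually wants to check its length before using it. The helper below is a minimal sketch of that check; the name singleResult is hypothetical and not part of nearley.

```javascript
// Hypothetical helper: reduce a nearley `parser.results` array to one value.
function singleResult(results) {
  if (results.length === 0) {
    // No complete parse yet: the input ended too early.
    throw new Error("Incomplete input: no complete parse");
  }
  if (results.length > 1) {
    // More than one entry means the grammar is ambiguous for this input.
    throw new Error(`Ambiguous input: ${results.length} parses found`);
  }
  return results[0];
}

// With a real parser this would be called as singleResult(parser.results).
console.log(singleResult([{ type: "number", value: 3 }]));
```

Throwing on ambiguity is a design choice for this sketch; a tool that wants to inspect every parse would return the whole array instead.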

Using a postprocessor to simplify the parse tree:

@{%
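const postprocessor = {
  // Fold [first, [[_, op, _, term], ...]] into a left-nested binary tree.
  expression: ([first, rest]) => {
    return rest.reduce((acc, [, op, , term]) => ({
      type: "binary",
      // addop and mulop have no postprocessor, so `op` is a one-element
      // array holding the matched token; unwrap it to the operator text.
      operator: op[0].value,
      left: acc,
      right: term
    }), first);
  },
  number: ([n]) => ({ type: "number", value: parseInt(n.value, 10) })
};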
%}

expression -> term (_ addop _ term):* {% postprocessor.expression %}
term -> factor (_ mulop _ factor):* {% postprocessor.expression %}
factor -> number {% id %} | "(" _ expression _ ")" {% ([,, expr]) => expr %}

number -> %number {% postprocessor.number %}
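
Once the postprocessors produce a tree of number and binary nodes, evaluating it is a short recursive walk. The following sketch assumes that each binary node stores its operator as a plain string such as "+" (an assumption about how the operator token is unwrapped); the function name evaluate and the hand-built tree are illustrative only.

```javascript
// Evaluate a tree of { type: "number", value } and
// { type: "binary", operator, left, right } nodes.
function evaluate(node) {
  if (node.type === "number") return node.value;
  const left = evaluate(node.left);
  const right = evaluate(node.right);
  switch (node.operator) {
    case "+": return left + right;
    case "-": return left - right;
    case "*": return left * right;
    case "/": return left / right;
    default: throw new Error(`Unknown operator: ${node.operator}`);
  }
}

// Hand-built tree for 3 + 4 * 2, shaped like the postprocessor output.
const ast = {
  type: "binary",
  operator: "+",
  left: { type: "number", value: 3 },
  right: {
    type: "binary",
    operator: "*",
    left: { type: "number", value: 4 },
    right: { type: "number", value: 2 }
  }
};

console.log(evaluate(ast));
```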

Getting Started

Install Nearley:
```
npm install nearley
```
Create a grammar file (e.g., mygrammar.ne) and define your grammar rules.

Compile the grammar:

npx nearleyc mygrammar.ne -o mygrammar.js

Use the generated parser in your JavaScript code:

const nearley = require("nearley");
const grammar = require("./mygrammar.js");

const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
parser.feed("your input string");
console.log(parser.results);

Competitor Comparisons

parsimmon

1,256

A monadic LL(infinity) parser combinator library for javascript

Pros of Parsimmon

More flexible and expressive, allowing for easier creation of complex parsers
Better performance for certain types of grammars
Easier to integrate with existing JavaScript code and workflows

Cons of Parsimmon

Steeper learning curve for those unfamiliar with parser combinators
Less suitable for generating parsers from formal grammar specifications
May require more manual optimization for large-scale parsing tasks

Code Comparison

Nearley example:

main -> "hello" ws "world" {% function(d) { return d.join(""); } %}
ws -> [ \t\n\v\f]:+ {% function(d) { return d[0].join(""); } %}

Parsimmon example:

const parser = Parsimmon.seq(
  Parsimmon.string("hello"),
  Parsimmon.regexp(/[ \t\n\v\f]+/),
  Parsimmon.string("world")
).map(parts => parts.join(""));

Both Nearley and Parsimmon are popular parsing libraries for JavaScript, but they take different approaches. Nearley uses a more traditional grammar-based approach, while Parsimmon uses parser combinators. The choice between them depends on the specific requirements of your project, your familiarity with different parsing techniques, and the complexity of the language you're parsing.
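
For readers who have not met parser combinators before, the idea can be sketched in a few lines of plain JavaScript: a parser is an ordinary function, and larger parsers are built by passing smaller ones to helper functions. The helpers below (string, regexp, seq, map) are written from scratch for illustration and are not Parsimmon's implementation; they only mirror the shape of the Parsimmon example above.

```javascript
// A parser is a function (input, pos) => { value, pos }, or null on failure.
const string = (s) => (input, pos) =>
  input.startsWith(s, pos) ? { value: s, pos: pos + s.length } : null;

const regexp = (re) => {
  const sticky = new RegExp(re.source, "y"); // match exactly at `pos`
  return (input, pos) => {
    sticky.lastIndex = pos;
    const m = sticky.exec(input);
    return m ? { value: m[0], pos: pos + m[0].length } : null;
  };
};

// Run parsers one after another, collecting their values.
const seq = (...parsers) => (input, pos) => {
  const values = [];
  for (const p of parsers) {
    const r = p(input, pos);
    if (r === null) return null;
    values.push(r.value);
    pos = r.pos;
  }
  return { value: values, pos };
};

// Transform the value of a successful parse.
const map = (parser, fn) => (input, pos) => {
  const r = parser(input, pos);
  return r === null ? null : { value: fn(r.value), pos: r.pos };
};

const hello = map(
  seq(string("hello"), regexp(/[ \t\n\v\f]+/), string("world")),
  (parts) => parts.join("")
);

console.log(hello("hello   world", 0));
console.log(hello("helloworld", 0));
```

The grammar lives in ordinary JavaScript values here, which is what makes combinators easy to mix with existing code, while a grammar file keeps the language description separate from the host program.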

pegjs

4,885

PEG.js: Parser generator for JavaScript

Pros of PEG.js

More mature and widely adopted project with a larger community
Supports both browser and Node.js environments out of the box
Offers an online playground for quick testing and experimentation

Cons of PEG.js

Limited to PEG (Parsing Expression Grammar) syntax, which may be less flexible for certain use cases
Can be slower for parsing large inputs compared to some alternatives

Code Comparison

PEG.js grammar example:

start
  = additive

additive
  = left:multiplicative "+" right:additive { return left + right; }
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative { return left * right; }
  / primary

primary
  = integer
  / "(" additive:additive ")" { return additive; }

integer "integer"
  = digits:[0-9]+ { return parseInt(digits.join(""), 10); }

Nearley grammar example:

main -> AS {% id %}

AS -> AS "+" MD {% ([a, _, b]) => a + b %}
    | MD {% id %}

MD -> MD "*" P  {% ([a, _, b]) => a * b %}
    | P  {% id %}

P -> "(" AS ")" {% ([, a]) => a %}
   | N {% id %}

N -> [0-9]:+ {% ([digits]) => parseInt(digits.join(""), 10) %}

Both examples demonstrate arithmetic expression parsing, showcasing the syntax differences between PEG.js and Nearley.

ohm

5,324

A library and language for building parsers, interpreters, compilers, etc.

Pros of Ohm

More flexible and expressive grammar syntax
Better support for incremental parsing and error recovery
Extensive documentation and examples

Cons of Ohm

Steeper learning curve for beginners
Slightly slower parsing performance for some use cases

Code Comparison

Nearley grammar example:

main -> "hello" ws noun ws "!" {% ([,, noun]) => `Hello, ${noun}!` %}
noun -> "world" | "universe"
ws -> [ \t\n\v\f\r]:*

Ohm grammar example:

Main {
  greeting = "hello" ws noun ws "!"
  noun = "world" | "universe"
  ws = (" " | "\t" | "\n")*
}

Both Nearley and Ohm are powerful parser generators, but they have different approaches to grammar definition and parsing. Nearley uses a more traditional BNF-like syntax, while Ohm employs a custom grammar language that resembles extended BNF.

Ohm offers more flexibility in grammar definition and better support for incremental parsing, making it suitable for complex language processing tasks. However, this flexibility comes at the cost of a steeper learning curve for newcomers.

Nearley, on the other hand, is generally easier to get started with and may offer better performance for simpler grammars. It also provides built-in support for ambiguous grammars, which can be useful in certain scenarios.

Ultimately, the choice between Nearley and Ohm depends on the specific requirements of your project and your familiarity with parser generators.

chevrotain

2,632

Parser Building Toolkit for JavaScript

Pros of Chevrotain

Better performance, especially for large inputs
More flexible and customizable parsing options
Extensive documentation and examples

Cons of Chevrotain

Steeper learning curve
More verbose syntax for grammar definition
Requires more boilerplate code

Code Comparison

Nearley grammar example:

expression -> number "+" number {% ([a, _, b]) => a + b %}
number -> [0-9]:+ {% ([digits]) => parseInt(digits.join("")) %}

Chevrotain grammar example:

const { createToken, CstParser } = require("chevrotain");

const Plus = createToken({ name: "Plus", pattern: /\+/ });
const NumberLiteral = createToken({ name: "NumberLiteral", pattern: /[0-9]+/ });
const allTokens = [Plus, NumberLiteral];

class Calculator extends CstParser {
  constructor() {
    super(allTokens);
    this.RULE("expression", () => {
      this.CONSUME(NumberLiteral);
      this.CONSUME(Plus);
      // A token consumed twice in one rule needs a distinct numeric suffix.
      this.CONSUME2(NumberLiteral);
    });
    // Must be called once, after all rules have been defined.
    this.performSelfAnalysis();
  }
}

Nearley uses a more concise syntax for grammar definition, while Chevrotain requires more setup but offers greater flexibility and control over the parsing process. Chevrotain's approach may be more familiar to developers used to object-oriented programming, while Nearley's syntax is closer to traditional BNF notation.

jison

4,377

Bison in JavaScript.

Pros of Jison

More mature and widely adopted project with a larger community
Supports both LR and LALR parsing algorithms
Offers a command-line interface for easier integration into build processes

Cons of Jison

Less active development in recent years
Steeper learning curve for beginners
Limited support for streaming input

Code Comparison

Jison:

%lex
%%
\s+                   /* skip whitespace */
[0-9]+("."[0-9]+)?    return 'NUMBER'
"+"                   return '+'
"-"                   return '-'
"*"                   return '*'
"/"                   return '/'

Nearley:

@{%
const moo = require("moo");
const lexer = moo.compile({
  ws:     /[ \t]+/,
  number: /[0-9]+(?:\.[0-9]+)?/,
  plus:   '+',
  minus:  '-',
  times:  '*',
  divide: '/'
});
%}

@lexer lexer

Both Jison and Nearley are popular parser generators for JavaScript. Jison is more established and offers multiple parsing algorithms, while Nearley is newer and focuses on ease of use. Jison's syntax is more similar to Bison, while Nearley uses a custom grammar format with JavaScript integration. The code examples show how lexical rules are defined in each system, with Jison using a more compact syntax and Nearley leveraging the Moo lexer library for tokenization.
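
Both lexer definitions boil down to an ordered list of named patterns that are tried at the current position. The sketch below shows that mechanism in plain JavaScript, without Jison or Moo; the names rules and tokenize are illustrative, and dropping whitespace tokens follows the Jison snippet rather than the Moo one.

```javascript
// Token rules mirroring the lexers above; the first matching rule wins.
// The sticky flag ("y") makes each pattern match exactly at lastIndex.
const rules = [
  ["ws", /[ \t]+/y],
  ["number", /[0-9]+(?:\.[0-9]+)?/y],
  ["plus", /\+/y],
  ["minus", /-/y],
  ["times", /\*/y],
  ["divide", /\//y]
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, re] of rules) {
      re.lastIndex = pos;
      const m = re.exec(input);
      if (m) {
        // Whitespace is skipped rather than emitted as a token.
        if (type !== "ws") tokens.push({ type, value: m[0] });
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`Unexpected character at offset ${pos}`);
  }
  return tokens;
}

console.log(tokenize("3.5 + 4*2"));
```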


README

nearley ↗️

nearley is a simple, fast and powerful parsing toolkit.

nearley is a streaming parser with support for catching errors gracefully and providing all parsings for ambiguous grammars. It is compatible with a variety of lexers (we recommend moo). It comes with tools for creating tests, railroad diagrams and fuzzers from your grammars, and has support for a variety of editors and platforms. It works in both node and the browser.
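
Streaming here means that feed can be called repeatedly with successive chunks, and catching errors means wrapping those calls in try/catch. The sketch below shows one way to combine the two. The helper name feedChunks is hypothetical, and makeFakeParser is a stand-in (not nearley) that exists only so the snippet runs without nearley installed; it imitates the feed/results interface and the offset property nearley attaches to the errors it throws.

```javascript
// Feed chunks to a nearley-style parser, stopping at the first error.
// `parser` is any object with feed(chunk) and a `results` array.
function feedChunks(parser, chunks) {
  try {
    for (const chunk of chunks) parser.feed(chunk);
  } catch (err) {
    // Errors thrown by feed() carry the position of the offending token.
    return { ok: false, offset: err.offset, message: err.message };
  }
  return { ok: true, results: parser.results };
}

// Stand-in parser (NOT nearley): accepts digits only.
function makeFakeParser() {
  return {
    seen: "",
    results: [],
    feed(chunk) {
      for (const ch of chunk) {
        if (ch < "0" || ch > "9") {
          const err = new Error(`Unexpected "${ch}"`);
          err.offset = this.seen.length;
          throw err;
        }
        this.seen += ch;
      }
      this.results = [Number(this.seen)];
    }
  };
}

console.log(feedChunks(makeFakeParser(), ["12", "34"]));
console.log(feedChunks(makeFakeParser(), ["12", "3x"]));
```

With a real grammar, the first argument would be a nearley.Parser instance created from the compiled grammar.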

Unlike most other parser generators, nearley can handle any grammar you can define in BNF (and more!). In particular, while most existing JS parsers such as PEGjs and Jison choke on certain grammars (e.g. left recursive ones), nearley handles them easily and efficiently by using the Earley parsing algorithm.
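
Left recursion matters in practice because it is the natural way to express left-associative operators. As a small worked example in plain JavaScript (no parser involved, variable names illustrative), compare how a left-recursive rule and a right-recursive rule would group 10 - 4 - 3:

```javascript
const nums = [10, 4, 3];

// Left-recursive rule   E -> E "-" N | N   groups as (10 - 4) - 3.
const leftAssoc = nums.reduce((acc, n) => acc - n);

// Right-recursive rule  E -> N "-" E | N   groups as 10 - (4 - 3).
const rightAssoc = nums.reduceRight((acc, n) => n - acc);

console.log(leftAssoc, rightAssoc);
```

Only the left-recursive grouping matches ordinary arithmetic, so a parser that cannot accept left-recursive rules has to obtain that grouping some other way, for example by collecting the operands in a loop and folding them afterwards.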

nearley is used by a wide variety of projects:

artificial intelligence and computational linguistics classes at universities;
file format parsers;
data-driven markup languages;
compilers for real-world programming languages;
and nearley itself! The nearley compiler is bootstrapped.

nearley is an npm staff pick.

Documentation

Please visit our website https://nearley.js.org to get started! You will find a tutorial, detailed reference documents, and links to several real-world examples to get inspired.

Contributing

Please read this document before working on nearley. If you are interested in contributing but unsure where to start, take a look at the issues labeled "up for grabs" on the issue tracker, or message a maintainer (@kach or @tjvr on GitHub).

nearley is MIT licensed.

A big thanks to Nathan Dinsmore for teaching me how to Earley, Aria Stewart for helping structure nearley into a mature module, and Robin Windels for bootstrapping the grammar. Additionally, Jacob Edelman wrote an experimental JavaScript parser with nearley and contributed ideas for EBNF support. Joshua T. Corbin refactored the compiler to be much, much prettier. Bojidar Marinov implemented postprocessors-in-other-languages. Shachar Itzhaky fixed a subtle bug with nullables.

Citing nearley

If you are citing nearley in academic work, please use the following BibTeX entry.

@misc{nearley,
    author = "Kartik Chandra and Tim Radvan",
    title  = "{nearley}: a parsing toolkit for {JavaScript}",
    year   = {2014},
    doi    = {10.5281/zenodo.3897993},
    url    = {https://github.com/kach/nearley}
}
