
kevinmehall/rust-peg

Parsing Expression Grammar (PEG) parser generator for Rust


Top Related Projects

  • pest (4,698): The Elegant Parser
  • LALRPOP (3,106): LR(1) parser generator for Rust
  • nom (9,568): Rust parser combinator framework

Quick Overview

rust-peg is a Parsing Expression Grammar (PEG) parser generator for Rust. It allows developers to define grammars using a simple syntax and generates efficient Rust code for parsing input based on those grammars. This library simplifies the process of creating parsers for complex text formats or domain-specific languages.

Pros

  • Easy-to-use syntax for defining grammars
  • Generates efficient Rust code for parsing
  • Integrates well with Rust's macro system
  • Supports error reporting and recovery

Cons

  • Limited documentation and examples
  • May have a steeper learning curve for those unfamiliar with PEG parsers
  • Performance can be slower compared to hand-written parsers for simple grammars

Code Examples

  1. Basic arithmetic expression parser:
peg::parser!{
    grammar arithmetic_parser() for str {
        // Optional whitespace, hidden from "expected ..." error messages.
        rule ws() = quiet!{ [' ' | '\t']* }

        rule number() -> i32
            = n:$(['0'..='9']+) {? n.parse().or(Err("i32")) }

        // Precedence climbing: later levels bind tighter, and `(@)` on the
        // left makes each operator left-associative, so 8 - 3 - 2 is (8 - 3) - 2.
        pub rule expression() -> i32 = precedence!{
            l:(@) ws() "+" ws() r:@ { l + r }
            l:(@) ws() "-" ws() r:@ { l - r }
            --
            l:(@) ws() "*" ws() r:@ { l * r }
            l:(@) ws() "/" ws() r:@ { l / r }
            --
            n:number() { n }
            "(" ws() e:expression() ws() ")" { e }
        }
    }
}

fn main() {
    println!("{}", arithmetic_parser::expression("2 + 3 * 4").unwrap());
}
  2. Simple JSON parser:
// Simplified: no whitespace skipping and no string escape sequences.
peg::parser!{
    grammar json_parser() for str {
        pub rule value() -> serde_json::Value
            = object()
            / array()
            / s:string() { serde_json::Value::String(s) }
            / number()
            / "true"  { serde_json::Value::Bool(true) }
            / "false" { serde_json::Value::Bool(false) }
            / "null"  { serde_json::Value::Null }

        rule object() -> serde_json::Value
            = "{" members:(member() ** ",") "}" { serde_json::Value::Object(members.into_iter().collect()) }

        rule member() -> (String, serde_json::Value)
            = k:string() ":" v:value() { (k, v) }

        rule array() -> serde_json::Value
            = "[" values:(value() ** ",") "]" { serde_json::Value::Array(values) }

        rule string() -> String
            = "\"" s:$([^ '"' | '\\']*) "\"" { s.to_string() }

        rule number() -> serde_json::Value
            = n:$("-"? ['0'..='9']+ ("." ['0'..='9']+)?) {? n.parse().map(serde_json::Value::Number).or(Err("number")) }
    }
}
  3. Simple CSV parser:
peg::parser!{
    grammar csv_parser() for str {
        pub rule file() -> Vec<Vec<String>>
            = row() ** "\n"

        rule row() -> Vec<String>
            = value() ** ","

        rule value() -> String
            = quoted_value()
            / unquoted_value()

        // Quoted field; `\"` inside the quotes is an escaped double quote.
        rule quoted_value() -> String
            = "\"" v:$(([^ '"' | '\\'] / "\\\"")*) "\"" { v.replace("\\\"", "\"") }

        // Unquoted field: everything up to the next comma or newline (may be empty).
        rule unquoted_value() -> String
            = v:$([^ ',' | '\n']*) { v.to_string() }
    }
}

Getting Started

To use rust-peg in your Rust project, add the following to your Cargo.toml:

[dependencies]
peg = "0.8"

No build script or build dependency is needed, because the grammar is compiled by the peg::parser! procedural macro. Then use the peg::parser! macro in any Rust source file (e.g., src/my_grammar.rs) to define your grammar.

Competitor Comparisons

4,698

The Elegant Parser

Pros of pest

  • Better performance due to its focus on compile-time parsing
  • More extensive documentation and examples
  • Active development and community support

Cons of pest

  • Steeper learning curve for beginners
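  • Action code lives outside the grammar: the grammar goes in a separate .pest file and the parse result is processed in ordinary Rust code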

Code Comparison

pest:

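use pest::Parser;          // trait that provides `MyParser::parse`
use pest_derive::Parser;   // derive macro that generates the parser and `Rule` enum

#[derive(Parser)]
#[grammar = "grammar.pest"]
pub struct MyParser;

fn main() {
    let pairs = MyParser::parse(Rule::expression, "1 + 2 * 3").unwrap();
}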

rust-peg:

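peg::parser!{
    grammar calculator() for str {
        pub rule expression() -> i64
            = sum()
        rule sum() -> i64
            = l:product() "+" r:product() { l + r }
        // Placeholder operand rule so the snippet is complete: a bare integer.
        rule product() -> i64
            = n:$(['0'..='9']+) {? n.parse().or(Err("i64")) }
    }
}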

Both pest and rust-peg are PEG-based parsing libraries for Rust; they differ mainly in where the grammar and the action code live. pest reads the grammar from a separate .pest file and hands back parse results that you process in Rust, while rust-peg embeds both the grammar and the Rust action code in a peg::parser! block. pest has more extensive documentation and active community support, but it may have a steeper learning curve for beginners. rust-peg may have less optimal performance in some cases. The choice between the two depends on specific project requirements and developer preferences.

3,106

LR(1) parser generator for Rust

Pros of LALRPOP

  • Supports LR(1) parsing, allowing for more complex grammars
  • Generates faster parsers compared to PEG-based parsers
  • Provides better error reporting and recovery mechanisms

Cons of LALRPOP

  • Steeper learning curve due to its more complex grammar specification
  • Tokenization is a separate stage (a lexer generated from the literals in the grammar, or a custom one)
  • May produce larger generated code compared to rust-peg

Code Comparison

rust-peg example:

pub rule number() -> i32
    = n:$(['0'..='9']+) { n.parse().unwrap() }

LALRPOP example:

pub Number: i32 = {
    r"[0-9]+" => i32::from_str(<>).unwrap()
};

Both examples parse a number. LALRPOP tokenizes the input in a separate lexer stage (here generated from the r"[0-9]+" regex literal in the grammar), while rust-peg matches characters directly inside the rule.

LALRPOP is better suited for more complex grammars and generates faster parsers, while rust-peg offers a simpler syntax and easier integration for smaller projects. The choice between them depends on the specific requirements of your parsing task and the complexity of the grammar you need to handle.

9,568

Rust parser combinator framework

Pros of nom

  • More flexible and powerful, allowing for complex parsing scenarios
  • Better performance, especially for larger inputs
  • Extensive ecosystem with many pre-built parsers and combinators

Cons of nom

  • Steeper learning curve due to its more complex API
  • More verbose syntax, requiring more code for simple parsing tasks
  • Can be overkill for simpler parsing needs

Code Comparison

nom example:

use nom::{
  IResult,
  bytes::complete::tag,
  sequence::tuple
};

fn parser(input: &str) -> IResult<&str, (&str, &str)> {
  tuple((tag("Hello"), tag(" world!")))(input)
}

rust-peg example:

peg::parser!{
  grammar parser() for str {
    // `pub` makes the rule callable as parser::hello_world(input)
    pub rule hello_world() = "Hello" " world!"
  }
}

The nom example demonstrates its more verbose but flexible approach, while the rust-peg example showcases its concise and declarative syntax. nom requires explicit combination of parsers, whereas rust-peg uses a more grammar-like structure.

Both libraries have their strengths: nom excels in complex parsing scenarios and performance-critical applications, while rust-peg shines in simplicity and ease of use for straightforward parsing tasks. The choice between them depends on the specific requirements of the project and the developer's familiarity with each library's paradigm.


README

Parsing Expression Grammars in Rust

Documentation | Release Notes

rust-peg is a simple yet flexible parser generator that makes it easy to write robust parsers. Based on the Parsing Expression Grammar formalism, it provides a Rust macro that builds a recursive descent parser from a concise definition of the grammar.

Features

  • Parse input from &str, &[u8], &[T] or custom types implementing traits
  • Customizable reporting of parse errors
  • Rules can accept arguments to create reusable rule templates
  • Precedence climbing for prefix/postfix/infix expressions
  • Helpful rustc error messages for errors in the grammar definition or the Rust code embedded within it
  • Rule-level tracing to debug grammars

Example

Parse a comma-separated list of numbers surrounded by brackets into a Vec<u32>:

peg::parser!{
  grammar list_parser() for str {
    rule number() -> u32
      = n:$(['0'..='9']+) {? n.parse().or(Err("u32")) }

    pub rule list() -> Vec<u32>
      = "[" l:(number() ** ",") "]" { l }
  }
}

pub fn main() {
    assert_eq!(list_parser::list("[1,1,2,3,5,8]"), Ok(vec![1, 1, 2, 3, 5, 8]));
}
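For comparison, a hand-written recursive descent parser for the same list syntax might look roughly like the sketch below. It uses only the standard library and is not the code the macro generates; the function names `parse_list`, `expect` and `parse_number` are chosen for illustration, and unlike rust-peg it reports only the byte offset where it gave up rather than a set of expected tokens.

```rust
/// Hand-written counterpart of the `list` rule: "[" (number ** ",") "]".
/// Returns the numbers, or the byte offset where parsing stopped.
fn parse_list(input: &str) -> Result<Vec<u32>, usize> {
    let bytes = input.as_bytes();
    let mut pos = 0;

    expect(bytes, &mut pos, b'[')?;

    // number() ** ",": zero or more numbers separated by commas.
    let mut items = Vec::new();
    if let Some(n) = parse_number(bytes, &mut pos) {
        items.push(n);
        while bytes.get(pos) == Some(&b',') {
            // Try "," number; on failure, backtrack to before the comma.
            let mut p = pos + 1;
            match parse_number(bytes, &mut p) {
                Some(n) => {
                    items.push(n);
                    pos = p;
                }
                None => break,
            }
        }
    }

    expect(bytes, &mut pos, b']')?;

    // A public rule must consume the whole input.
    if pos == bytes.len() { Ok(items) } else { Err(pos) }
}

/// Consumes one specific byte or fails at the current offset.
fn expect(bytes: &[u8], pos: &mut usize, byte: u8) -> Result<(), usize> {
    if bytes.get(*pos) == Some(&byte) {
        *pos += 1;
        Ok(())
    } else {
        Err(*pos)
    }
}

/// Consumes one or more ASCII digits and parses them as a u32.
/// Leaves `pos` untouched when there are no digits or the value overflows.
fn parse_number(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let start = *pos;
    let mut end = start;
    while end < bytes.len() && bytes[end].is_ascii_digit() {
        end += 1;
    }
    if end == start {
        return None;
    }
    let n: u32 = std::str::from_utf8(&bytes[start..end]).ok()?.parse().ok()?;
    *pos = end;
    Some(n)
}

fn main() {
    println!("{:?}", parse_list("[1,1,2,3,5,8]"));
}
```

Each grammar rule maps to one function, and ordered choice and repetition become explicit position bookkeeping, which is the part the macro writes for you.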

See the tests for more examples
Grammar rule syntax reference in rustdoc

Comparison with similar parser generators

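crate   | parser type | action code | integration        | input type         | precedence climbing | parameterized rules | streaming input
--------|-------------|-------------|--------------------|--------------------|---------------------|---------------------|----------------
peg     | PEG         | in grammar  | proc macro (block) | &str, &[T], custom | Yes                 | Yes                 | No
pest    | PEG         | external    | proc macro (file)  | &str               | Yes                 | No                  | No
nom     | combinators | in source   | library            | &[u8], custom      | No                  | Yes                 | Yes
lalrpop | LR(1)       | in grammar  | build script       | &str               | No                  | Yes                 | No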

See also

Development

The rust-peg grammar is written in rust-peg: peg-macros/grammar.rustpeg. To avoid the circular dependency, a precompiled grammar is checked in as peg-macros/grammar.rs. To regenerate this, run the ./bootstrap.sh script.

There is a large test suite which uses trybuild to test both functionality (tests/run-pass) and error messages for incorrect grammars (tests/compile-fail). Because rustc error messages change, the compile-fail tests are only run on the minimum supported Rust version to avoid spurious failures.

Use cargo test to run the entire suite, or cargo test -- trybuild trybuild=lifetimes.rs to test just the indicated file. Add --features trace to trace these tests.