Quick Overview
rust-peg is a Parsing Expression Grammar (PEG) parser generator for Rust. It allows developers to define grammars using a simple syntax and generates efficient Rust code for parsing input based on those grammars. This library simplifies the process of creating parsers for complex text formats or domain-specific languages.
Pros
- Easy-to-use syntax for defining grammars
- Generates efficient Rust code for parsing
- Grammar is written inline in a procedural macro, so no build script or separate grammar file is needed
- Supports customizable reporting of parse errors
Cons
- Limited documentation and examples
- May have a steeper learning curve for those unfamiliar with PEG parsers
- Performance can be slower compared to hand-written parsers for simple grammars
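Part of the PEG learning curve is that the choice operator `/` is ordered: alternatives are tried left to right and the first one that matches wins, so later alternatives are never considered at that position. The std-only sketch below imitates that behavior for plain string literals. The function name `ordered_choice` is hypothetical and this is not the code rust-peg generates.

```rust
/// Try each literal alternative in order and return the first one that
/// matches at the start of `input`, mimicking a PEG choice `a / b / c`.
/// Returns the matched text and the remaining input.
/// (Hypothetical helper for illustration only.)
fn ordered_choice<'a>(input: &'a str, alternatives: &[&str]) -> Option<(&'a str, &'a str)> {
    for alt in alternatives {
        if let Some(rest) = input.strip_prefix(*alt) {
            // First success wins; later alternatives are never tried.
            return Some((&input[..alt.len()], rest));
        }
    }
    None
}
```

With the alternatives `["int", "integer"]`, the input `"integer"` matches `"int"` and leaves `"eger"` unconsumed, so a grammar written this way usually needs the longer alternative listed first.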
Code Examples
- Basic arithmetic expression parser:
    peg::parser!{
        grammar arithmetic_parser() for str {
            // Optional whitespace between tokens.
            rule ws() = [' ' | '\t']*
            rule number() -> i32
                = n:$(['0'..='9']+) {? n.parse().or(Err("i32")) }
            // precedence! keeps "-" and "/" left-associative;
            // levels are listed from loosest to tightest binding.
            pub rule expression() -> i32 = precedence!{
                l:(@) ws() "+" ws() r:@ { l + r }
                l:(@) ws() "-" ws() r:@ { l - r }
                --
                l:(@) ws() "*" ws() r:@ { l * r }
                l:(@) ws() "/" ws() r:@ { l / r }
                --
                n:number() { n }
                "(" ws() e:expression() ws() ")" { e }
            }
        }
    }
    fn main() {
        println!("{}", arithmetic_parser::expression("2 + 3 * 4").unwrap());
    }
- Simple JSON parser (simplified: no whitespace or string-escape handling; assumes the serde_json crate is a dependency):
    peg::parser!{
        grammar json_parser() for str {
            pub rule value() -> serde_json::Value
                = object()
                / array()
                / s:string() { serde_json::Value::String(s) }
                / number()
                / "true" { serde_json::Value::Bool(true) }
                / "false" { serde_json::Value::Bool(false) }
                / "null" { serde_json::Value::Null }
            // Members are separated by commas.
            rule object() -> serde_json::Value
                = "{" members:(member() ** ",") "}" { serde_json::Value::Object(members.into_iter().collect()) }
            rule member() -> (String, serde_json::Value)
                = k:string() ":" v:value() { (k, v) }
            rule array() -> serde_json::Value
                = "[" values:(value() ** ",") "]" { serde_json::Value::Array(values) }
            // Any run of characters other than a quote or a backslash.
            rule string() -> String
                = "\"" s:$([^ '"' | '\\']*) "\"" { s.to_string() }
            rule number() -> serde_json::Value
                = n:$("-"? ['0'..='9']+ ("." ['0'..='9']+)?) {? n.parse().map(serde_json::Value::Number).or(Err("number")) }
        }
    }
- Simple CSV parser:
    peg::parser!{
        grammar csv_parser() for str {
            pub rule file() -> Vec<Vec<String>>
                = row() ** "\n"
            rule row() -> Vec<String>
                = value() ** ","
            rule value() -> String
                = quoted_value()
                / unquoted_value()
            // Quoted values may contain backslash-escaped quotes (\").
            rule quoted_value() -> String
                = "\"" v:$(([^ '"' | '\\'] / "\\\"")*) "\"" { v.replace("\\\"", "\"") }
            rule unquoted_value() -> String
                = v:$([^ ',' | '\n']*) { v.to_string() }
        }
    }
Getting Started
To use rust-peg in your Rust project, add the following to your Cargo.toml:
    [dependencies]
    peg = "0.8"
No [build-dependencies] entry or build script is needed, because the grammar is compiled by the peg::parser! procedural macro. Then use the peg::parser! macro in a Rust source file (e.g., src/my_grammar.rs) to define the grammar.
Competitor Comparisons
The Elegant Parser
Pros of pest
- Parser is generated at compile time from a separate grammar file, keeping the grammar apart from Rust code
- More extensive documentation and examples
- Active development and community support
Cons of pest
- Steeper learning curve for beginners
- No action code or parameterized rules in the grammar; parse results are processed in separate Rust code
Code Comparison
pest:
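    use pest::Parser;          // trait that provides MyParser::parse
    use pest_derive::Parser;   // derive macro that reads grammar.pest
    #[derive(Parser)]
    #[grammar = "grammar.pest"]
    pub struct MyParser;
    fn main() {
        // Returns an iterator of pairs (a parse tree), not a computed value.
        let pairs = MyParser::parse(Rule::expression, "1 + 2 * 3").unwrap();
        println!("{:?}", pairs);
    }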
rust-peg:
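    peg::parser!{
        grammar calculator() for str {
            // Only pub rules can be called from outside the grammar.
            pub rule expression() -> i64
                = sum()
            rule sum() -> i64
                = l:product() "+" r:product() { l + r }
            // rule product() is omitted in this excerpt.
        }
    }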
Both pest and rust-peg are PEG parser generators for Rust; they differ mainly in where the grammar and the action code live. pest reads the grammar from a separate .pest file and returns a tree of pairs that you process in ordinary Rust code, while rust-peg embeds the grammar and its action code in a macro block and returns typed values directly. pest has more extensive documentation and active community support, but it may have a steeper learning curve for beginners. rust-peg keeps rules and actions in one place, which can be more convenient for small grammars. The choice between the two depends on specific project requirements and developer preferences.
LR(1) parser generator for Rust
Pros of LALRPOP
- Supports LR(1) parsing, allowing for more complex grammars
- Generates faster parsers compared to PEG-based parsers
- Provides better error reporting and recovery mechanisms
Cons of LALRPOP
- Steeper learning curve due to its more complex grammar specification
- Grammars are compiled by a build script rather than an inline macro
- May produce larger generated code compared to rust-peg
Code Comparison
rust-peg example:
    pub rule number() -> i32
        = n:$(['0'..='9']+) { n.parse().unwrap() }
LALRPOP example:
pub Number: i32 = {
r"[0-9]+" => i32::from_str(<>).unwrap()
};
Both examples demonstrate parsing a number, and in both the token pattern is written inline: LALRPOP generates a lexer from the regex literal, while rust-peg matches characters directly in the rule.
LALRPOP is better suited for more complex grammars and generates faster parsers, while rust-peg offers a simpler syntax and easier integration for smaller projects. The choice between them depends on the specific requirements of your parsing task and the complexity of the grammar you need to handle.
Rust parser combinator framework
Pros of nom
- More flexible and powerful, allowing for complex parsing scenarios
- Supports streaming input, which helps with large or incrementally arriving inputs
- Extensive ecosystem with many pre-built parsers and combinators
Cons of nom
- Steeper learning curve due to its more complex API
- More verbose syntax, requiring more code for simple parsing tasks
- Can be overkill for simpler parsing needs
Code Comparison
nom example:
    // Written against the nom 7 API.
    use nom::{
        IResult,
        bytes::complete::tag,
        sequence::tuple
    };
    fn parser(input: &str) -> IResult<&str, (&str, &str)> {
        tuple((tag("Hello"), tag(" world!")))(input)
    }
rust-peg example:
    peg::parser!{
        grammar parser() for str {
            pub rule hello_world() = "Hello" " world!"
        }
    }
The nom example demonstrates its more verbose but flexible approach, while the rust-peg example showcases its concise and declarative syntax. nom requires explicit combination of parsers, whereas rust-peg uses a more grammar-like structure.
Both libraries have their strengths: nom excels in complex parsing scenarios and performance-critical applications, while rust-peg shines in simplicity and ease of use for straightforward parsing tasks. The choice between them depends on the specific requirements of the project and the developer's familiarity with each library's paradigm.
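To make "explicit combination of parsers" concrete, here is a minimal std-only sketch of the combinator idea: small functions that return parsers, and a function that glues two parsers into one. The names `ParseResult`, `literal`, and `pair` are hypothetical and only loosely modeled on nom's `tag` and `tuple`; this is not nom's API or implementation.

```rust
/// A parser result: the remaining input plus the parsed value, or None.
type ParseResult<'a, T> = Option<(&'a str, T)>;

/// Build a parser that matches a fixed prefix (similar in spirit to `tag`).
fn literal<'a>(expected: &'static str) -> impl Fn(&'a str) -> ParseResult<'a, &'a str> {
    move |input: &'a str| {
        input
            .strip_prefix(expected)
            .map(|rest| (rest, &input[..expected.len()]))
    }
}

/// Combine two parsers so that the second runs on what the first left over.
fn pair<'a, A, B>(
    first: impl Fn(&'a str) -> ParseResult<'a, A>,
    second: impl Fn(&'a str) -> ParseResult<'a, B>,
) -> impl Fn(&'a str) -> ParseResult<'a, (A, B)> {
    move |input: &'a str| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Some((rest, (a, b)))
    }
}

/// The "Hello world!" parser from the comparison, built from the pieces above.
fn hello_world(input: &str) -> ParseResult<'_, (&str, &str)> {
    pair(literal("Hello"), literal(" world!"))(input)
}
```

Unlike a rust-peg public rule, this sketch does not require the whole input to be consumed; any leftover text is returned to the caller.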
README
Parsing Expression Grammars in Rust
rust-peg is a simple yet flexible parser generator that makes it easy to write robust parsers. Based on the Parsing Expression Grammar formalism, it provides a Rust macro that builds a recursive descent parser from a concise definition of the grammar.
Features
- Parse input from &str, &[u8], &[T] or custom types implementing traits
- Customizable reporting of parse errors
- Rules can accept arguments to create reusable rule templates
- Precedence climbing for prefix/postfix/infix expressions
- Helpful rustc error messages for errors in the grammar definition or the Rust code embedded within it
- Rule-level tracing to debug grammars
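As background for the precedence climbing feature, the following std-only sketch shows the general algorithm for infix operators: parse an atom, then keep absorbing operators whose precedence is at least the current minimum, recursing with a higher minimum for left-associative operators. All names here (`op_info`, `Parser`, `evaluate`) are hypothetical, prefix and postfix operators are left out, and this is not rust-peg's generated code.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Precedence (higher binds tighter) and right-associativity per operator.
fn op_info(op: char) -> Option<(u8, bool)> {
    match op {
        '+' | '-' => Some((1, false)),
        '*' | '/' => Some((2, false)),
        '^' => Some((3, true)),
        _ => None,
    }
}

/// Apply an operator with overflow and division-by-zero checks.
fn apply(op: char, l: i64, r: i64) -> Option<i64> {
    match op {
        '+' => l.checked_add(r),
        '-' => l.checked_sub(r),
        '*' => l.checked_mul(r),
        '/' => l.checked_div(r),
        '^' if r >= 0 && r <= i64::from(u32::MAX) => l.checked_pow(r as u32),
        _ => None,
    }
}

struct Parser<'a> {
    chars: Peekable<Chars<'a>>,
}

impl<'a> Parser<'a> {
    fn skip_ws(&mut self) {
        while self.chars.next_if(|c| c.is_whitespace()).is_some() {}
    }

    /// atom <- number / "(" expr ")"
    fn atom(&mut self) -> Option<i64> {
        self.skip_ws();
        match self.chars.peek().copied()? {
            '(' => {
                self.chars.next();
                let value = self.expr(0)?;
                self.skip_ws();
                self.chars.next_if_eq(&')')?;
                Some(value)
            }
            c if c.is_ascii_digit() => {
                let mut n: i64 = 0;
                while let Some(d) = self.chars.next_if(|c| c.is_ascii_digit()) {
                    n = n.checked_mul(10)?.checked_add(i64::from(d.to_digit(10)?))?;
                }
                Some(n)
            }
            _ => None,
        }
    }

    /// Parse a chain of operators whose precedence is at least `min_prec`.
    fn expr(&mut self, min_prec: u8) -> Option<i64> {
        let mut lhs = self.atom()?;
        loop {
            self.skip_ws();
            let op = match self.chars.peek().copied() {
                Some(c) => c,
                None => break,
            };
            let (prec, right_assoc) = match op_info(op) {
                Some(info) if info.0 >= min_prec => info,
                _ => break, // not an operator, or it binds too loosely
            };
            self.chars.next();
            // Left-associative: the right operand may only contain tighter operators.
            let next_min = if right_assoc { prec } else { prec + 1 };
            let rhs = self.expr(next_min)?;
            lhs = apply(op, lhs, rhs)?;
        }
        Some(lhs)
    }
}

/// Evaluate a whole expression; trailing garbage is an error.
fn evaluate(input: &str) -> Option<i64> {
    let mut parser = Parser { chars: input.chars().peekable() };
    let value = parser.expr(0)?;
    parser.skip_ws();
    if parser.chars.peek().is_some() { None } else { Some(value) }
}
```

In this sketch `evaluate("10 - 4 - 3")` groups as `(10 - 4) - 3`, while `evaluate("2 ^ 3 ^ 2")` groups as `2 ^ (3 ^ 2)` because `^` is marked right-associative.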
Example
Parse a comma-separated list of numbers surrounded by brackets into a Vec<u32>:
    peg::parser!{
        grammar list_parser() for str {
            rule number() -> u32
                = n:$(['0'..='9']+) {? n.parse().or(Err("u32")) }
            pub rule list() -> Vec<u32>
                = "[" l:(number() ** ",") "]" { l }
        }
    }
    pub fn main() {
        assert_eq!(list_parser::list("[1,1,2,3,5,8]"), Ok(vec![1, 1, 2, 3, 5, 8]));
    }
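For comparison, here is roughly what a hand-written recursive descent parser for the same list grammar might look like, using only the standard library. It is a sketch of the kind of code the macro saves you from writing, not the code rust-peg actually generates; the function names `parse_number` and `parse_list` are hypothetical, and errors are reduced to `None` rather than a parse error with a position.

```rust
/// number <- [0-9]+
/// Returns the parsed value and the remaining input.
fn parse_number(input: &str) -> Option<(u32, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None; // at least one digit is required
    }
    // Fails (like the `{? ... }` action) when the digits do not fit in a u32.
    let value: u32 = input[..end].parse().ok()?;
    Some((value, &input[end..]))
}

/// list <- "[" (number ("," number)*)? "]"
fn parse_list(input: &str) -> Option<Vec<u32>> {
    let mut rest = input.strip_prefix('[')?;
    let mut items = Vec::new();
    if let Some((first, after)) = parse_number(rest) {
        items.push(first);
        rest = after;
        // Every further element must be preceded by a comma.
        while let Some(after_comma) = rest.strip_prefix(',') {
            let (n, after) = parse_number(after_comma)?;
            items.push(n);
            rest = after;
        }
    }
    // The closing bracket must be the last thing in the input.
    if rest == "]" { Some(items) } else { None }
}
```

Calling `parse_list("[1,1,2,3,5,8]")` yields the same vector as the grammar above, while inputs such as `"[1,]"` or `"[1,2] "` are rejected.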
See the tests for more examples
Grammar rule syntax reference in rustdoc
Comparison with similar parser generators
| crate | parser type | action code | integration | input type | precedence climbing | parameterized rules | streaming input |
|---|---|---|---|---|---|---|---|
| peg | PEG | in grammar | proc macro (block) | &str, &[T], custom | Yes | Yes | No |
| pest | PEG | external | proc macro (file) | &str | Yes | No | No |
| nom | combinators | in source | library | &[u8], custom | No | Yes | Yes |
| lalrpop | LR(1) | in grammar | build script | &str | No | Yes | No |
See also
- pegviz is a UI for visualizing rust-peg's trace output to debug parsers.
- There exist several crates to format diagnostic messages on source code snippets in the terminal, including chic, annotate-snippets, codespan-reporting, and codemap-diagnostic.
Development
The rust-peg grammar is written in rust-peg: peg-macros/grammar.rustpeg. To avoid the circular dependency, a precompiled grammar is checked in as peg-macros/grammar.rs. To regenerate this, run the ./bootstrap.sh script.
There is a large test suite which uses trybuild to test both functionality (tests/run-pass) and error messages for incorrect grammars (tests/compile-fail). Because rustc error messages change, the compile-fail tests are only run on the minimum supported Rust version to avoid spurious failures.
Use cargo test to run the entire suite, or cargo test -- trybuild trybuild=lifetimes.rs to test just the indicated file. Add --features trace to trace these tests.