javacc

JavaCC - a parser generator for building parsers from grammars. It can generate code in Java, C++ and C#.

Top Related Projects

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Bison in JavaScript.

PEG.js: Parser generator for JavaScript

Quick Overview

JavaCC (Java Compiler Compiler) is an open-source parser generator and lexical analyzer generator for use with Java applications. It reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. JavaCC is particularly useful for building compilers, interpreters, and other language-related tools.

Pros

  • Easy to use and learn, with a syntax similar to EBNF (Extended Backus-Naur Form)
  • Generates human-readable Java code, making it easier to debug and maintain
  • Supports lookahead, making it more powerful than simple LL(1) parsers
  • Includes additional tools like JJTree for parse tree generation

Cons

  • Performance can be slower compared to hand-written parsers or some other parser generators
  • Limited to LL(k) grammars, which can be restrictive for some complex language structures
  • Documentation can be outdated or lacking in some areas
  • Less active development compared to some newer parser generators

Code Examples

  1. Simple arithmetic expression parser:
PARSER_BEGIN(Calculator)
public class Calculator {
  public static void main(String args[]) throws ParseException {
    Calculator parser = new Calculator(System.in);
    parser.Start();
  }
}
PARSER_END(Calculator)

SKIP : { " " | "\t" | "\n" | "\r" }
TOKEN : { < PLUS : "+" > | < MINUS : "-" > | < MULTIPLY : "*" > | < DIVIDE : "/" > }
TOKEN : { < NUMBER : (["0"-"9"])+ > }

void Start() :
{}
{
  Expression() <EOF>
}

void Expression() :
{}
{
  Term() ( ( <PLUS> | <MINUS> ) Term() )*
}

void Term() :
{}
{
  Factor() ( ( <MULTIPLY> | <DIVIDE> ) Factor() )*
}

void Factor() :
{}
{
  <NUMBER>
}
  2. Simple JSON parser:
PARSER_BEGIN(JSONParser)
public class JSONParser {
  public static void main(String args[]) throws ParseException {
    JSONParser parser = new JSONParser(System.in);
    parser.json();
  }
}
PARSER_END(JSONParser)

SKIP : { " " | "\t" | "\n" | "\r" }
TOKEN : {
  < LCURLY : "{" > | < RCURLY : "}" > | < LSQUARE : "[" > | < RSQUARE : "]" > |
  < COMMA : "," > | < COLON : ":" > |
  < TRUE : "true" > | < FALSE : "false" > | < NULL : "null" > |
  < STRING : "\"" (~["\"","\\","\n","\r"] | "\\" (["n","t","b","r","f","\\","'","\""] | ["0"-"7"] (["0"-"7"])? | ["0"-"3"] ["0"-"7"] ["0"-"7"]))* "\"" > |
  < NUMBER : ("-")? (["0"-"9"])+ ("." (["0"-"9"])+)? (["e","E"] (["+","-"])? (["0"-"9"])+)? >
}

void json() : {} { object() | array() }

void object() : {} { <LCURLY> (pair() (<COMMA> pair())*)? <RCURLY> }

void pair() : {} { <STRING> <COLON> value() }

void array() : {} { <LSQUARE> (value() (<COMMA> value())*)? <RSQUARE> }

void value() : {} {
  <STRING> | <NUMBER> | object() | array() | <TRUE> | <FALSE> | <NULL>
}
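
To make the arithmetic grammar above more concrete, here is a minimal hand-written Java sketch of the same Expression / Term / Factor structure. It is not code produced by JavaCC, and unlike the grammar (which only recognizes input) it also computes a value. The class name CalculatorSketch and the method evaluate are hypothetical names chosen for illustration.

```java
// Hand-written recursive-descent evaluator mirroring the Calculator grammar:
//   Expression := Term ( ("+" | "-") Term )*
//   Term       := Factor ( ("*" | "/") Factor )*
//   Factor     := NUMBER
// Hypothetical illustration only; this is not code generated by JavaCC.
final class CalculatorSketch {
    private final String input;
    private int pos = 0;

    private CalculatorSketch(String input) {
        this.input = input;
    }

    /** Parses and evaluates a whole expression; rejects trailing input. */
    static long evaluate(String text) {
        CalculatorSketch p = new CalculatorSketch(text);
        long value = p.expression();
        p.skipWhitespace();
        if (p.pos != p.input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // Expression := Term ( ("+" | "-") Term )*
    private long expression() {
        long value = term();
        while (true) {
            char op = peek();
            if (op != '+' && op != '-') return value;
            pos++;
            long rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
    }

    // Term := Factor ( ("*" | "/") Factor )*
    private long term() {
        long value = factor();
        while (true) {
            char op = peek();
            if (op != '*' && op != '/') return value;
            pos++;
            long rhs = factor();
            value = (op == '*') ? value * rhs : value / rhs;
        }
    }

    // Factor := NUMBER, where NUMBER is one or more ASCII digits
    private long factor() {
        skipWhitespace();
        int start = pos;
        while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Long.parseLong(input.substring(start, pos));
    }

    /** Returns the next non-whitespace character, or '\0' at end of input. */
    private char peek() {
        skipWhitespace();
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    private void skipWhitespace() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2 + 3 * 4"));
    }
}
```

Each grammar production maps to one method, and each ( ... )* repetition maps to a loop driven by a single character of lookahead, which is the general shape of an LL(1) recursive-descent parser.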

Getting Started

  1. Download JavaCC from the official website or use a package manager.
  2. Create a grammar file (e.g., MyParser.jj) with your parser specification.
  3. Run JavaCC on your grammar file:
    javacc MyParser.jj
    
  4. Compile the generated Java files.

Competitor Comparisons

Pros of ANTLR4

  • More powerful and flexible grammar definition capabilities
  • Better support for multiple target languages
  • Extensive documentation and community support

Cons of ANTLR4

  • Steeper learning curve for beginners
  • Potentially slower parsing speed for certain grammars

Code Comparison

ANTLR4 grammar example:

grammar Expression;
expr : term (('+' | '-') term)*;
term : factor (('*' | '/') factor)*;
factor : NUMBER | '(' expr ')';
NUMBER : [0-9]+;
WS : [ \t\r\n]+ -> skip;

JavaCC grammar example:

PARSER_BEGIN(ExpressionParser)
public class ExpressionParser {}
PARSER_END(ExpressionParser)

TOKEN : { < NUMBER: (["0"-"9"])+ > }
SKIP : { " " | "\t" | "\n" | "\r" }

void expr() : {}
{
  term() ( ("+" | "-") term() )*
}

void term() : {}
{
  factor() ( ("*" | "/") factor() )*
}

void factor() : {}
{
  <NUMBER> | "(" expr() ")"
}

Both ANTLR4 and JavaCC are powerful parser generators, but ANTLR4 offers more flexibility and language support. JavaCC may be easier for Java developers to pick up initially. The choice between them often depends on specific project requirements and developer preferences.

Pros of Jison

  • Written in JavaScript, making it more accessible for web developers
  • Supports both LR and LALR parsing algorithms
  • Easier integration with Node.js projects

Cons of Jison

  • Less mature and less widely used compared to JavaCC
  • Limited documentation and community support
  • May have performance limitations for large-scale parsing tasks

Code Comparison

JavaCC example:

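PARSER_BEGIN(SimpleParser)
public class SimpleParser {
  public static void main(String args[]) throws ParseException {
    SimpleParser parser = new SimpleParser(System.in);
    parser.Input();
  }
}
PARSER_END(SimpleParser)

SKIP : { " " | "\t" | "\n" | "\r" }
TOKEN : { < NUMBER : (["0"-"9"])+ > }

// Sums of numbers, matching the Jison grammar below
void Input() : {}
{
  <NUMBER> ( "+" <NUMBER> )* <EOF>
}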

Jison example:

%lex
%%
\s+                   /* skip whitespace */
[0-9]+                return 'NUMBER'
"+"                   return '+'
<<EOF>>               return 'EOF'
/lex

%left '+'

%%
expressions
    : e EOF
    ;

e
    : e '+' e
    | NUMBER
    ;

Both JavaCC and Jison are parser generators, but they cater to different ecosystems. JavaCC is more established and widely used in Java projects, while Jison is better suited for JavaScript and Node.js environments. The choice between them often depends on the target language and specific project requirements.

Pros of PEG.js

  • JavaScript-based, making it easier to integrate with web applications
  • Generates parsers that can run in both Node.js and browsers
  • Simpler syntax and more intuitive grammar definition

Cons of PEG.js

  • Limited to JavaScript ecosystem, less versatile than JavaCC
  • May have performance limitations for very large grammars
  • Less mature and with fewer advanced features compared to JavaCC

Code Comparison

JavaCC example:

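PARSER_BEGIN(SimpleExprParser)
public class SimpleExprParser {
  public static void main(String args[]) throws ParseException {
    SimpleExprParser parser = new SimpleExprParser(System.in);
    parser.Start();
  }
}
PARSER_END(SimpleExprParser)

SKIP : { " " | "\t" | "\n" | "\r" }
TOKEN : { < NUMBER : (["0"-"9"])+ > }

void Start() : {} { Additive() <EOF> }
void Additive() : {} { Multiplicative() ( "+" Multiplicative() )* }
void Multiplicative() : {} { <NUMBER> ( "*" <NUMBER> )* }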

PEG.js example:

start
  = additive

additive
  = left:multiplicative "+" right:additive { return left + right; }
  / multiplicative

multiplicative
  = left:primary "*" right:multiplicative { return left * right; }
  / primary

primary
  = digits:[0-9]+ { return parseInt(digits.join(""), 10); }

Both examples show basic parser structure, but JavaCC requires more boilerplate code, while PEG.js focuses on grammar definition using a more concise syntax.

README

JavaCC

Java Compiler Compiler (JavaCC) is the most popular parser generator for use with Java applications.

A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar.

In addition to the parser generator itself, JavaCC provides other standard capabilities related to parser generation such as tree building (via a tool called JJTree included with JavaCC), actions and debugging.

All you need to run a JavaCC parser, once generated, is a Java Runtime Environment (JRE).

This README is meant as a brief overview of the core features and how to set things up to get yourself started with JavaCC. For a fully detailed documentation, please see https://javacc.github.io/javacc/.

Introduction

Features

  • JavaCC generates top-down (recursive descent) parsers as opposed to bottom-up parsers generated by YACC-like tools. This allows the use of more general grammars, although left-recursion is disallowed. Top-down parsers have a number of other advantages (besides more general grammars) such as being easier to debug, having the ability to parse to any non-terminal in the grammar, and also having the ability to pass values (attributes) both up and down the parse tree during parsing.

  • By default, JavaCC generates an LL(1) parser. However, there may be portions of the grammar that are not LL(1). JavaCC offers syntactic and semantic lookahead to resolve shift-shift ambiguities locally at these points. The parser is then LL(k) only at such points and remains LL(1) everywhere else for better performance. Shift-reduce and reduce-reduce conflicts are not an issue for top-down parsers.

  • JavaCC generates parsers that are 100% pure Java, so there is no runtime dependency on JavaCC and no special porting effort required to run on different machine platforms.

  • JavaCC allows extended BNF specifications - such as (A)*, (A)+ etc - within the lexical and the grammar specifications. Extended BNF relieves the need for left-recursion to some extent. In fact, extended BNF is often easier to read as in A ::= y(x)* versus A ::= Ax|y.

  • The lexical specifications (such as regular expressions, strings) and the grammar specifications (the BNF) are both written together in the same file. It makes grammars easier to read since it is possible to use regular expressions inline in the grammar specification, and also easier to maintain.

  • The lexical analyzer of JavaCC can handle full Unicode input, and lexical specifications may also include any Unicode character. This facilitates descriptions of language elements such as Java identifiers that allow certain Unicode characters (that are not ASCII), but not others.

  • JavaCC offers Lex-like lexical state and lexical action capabilities. Specific aspects in JavaCC that are superior to other tools are the first class status it offers concepts such as TOKEN, MORE, SKIP and state changes. This allows cleaner specifications as well as better error and warning messages from JavaCC.

  • Tokens that are defined as special tokens in the lexical specification are ignored during parsing, but these tokens are available for processing by the tools. A useful application of this is in the processing of comments.

  • Lexical specifications can define tokens not to be case-sensitive either at the global level for the entire lexical specification, or on an individual lexical specification basis.

  • JavaCC comes with JJTree, an extremely powerful tree building pre-processor.

  • JavaCC also includes JJDoc, a tool that converts grammar files to documentation files, optionally in HTML.

  • JavaCC offers many options to customize its behavior and the behavior of the generated parsers. Examples of such options are the kinds of Unicode processing to perform on the input stream, the number of tokens of ambiguity checking to perform etc.

  • JavaCC error reporting is among the best in parser generators. JavaCC generated parsers are able to clearly point out the location of parse errors with complete diagnostic information.

  • Using options DEBUG_PARSER, DEBUG_LOOKAHEAD, and DEBUG_TOKEN_MANAGER, users can get in-depth analysis of the parsing and the token processing steps.

  • The JavaCC release includes a wide range of examples including Java and HTML grammars. The examples, along with their documentation, are a great way to get acquainted with JavaCC.
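
The extended BNF point in the list above (A ::= y(x)* versus A ::= Ax|y) can be illustrated with a small hand-written Java sketch. It is not JavaCC output; the class name EbnfLoopSketch and the method matchesA are hypothetical. The repetition becomes a loop guarded by one character of lookahead, whereas a method written directly from the left-recursive form would call itself before consuming any input.

```java
// Sketch: recognizing A ::= y (x)* with a loop and one character of lookahead.
// The left-recursive form A ::= A x | y describes the same strings, but a
// recursive-descent method for it would recurse before consuming any input.
final class EbnfLoopSketch {
    /** Returns true if text matches y(x)*, i.e. "y", "yx", "yxx", ... */
    static boolean matchesA(String text) {
        int pos = 0;
        // Consume the mandatory leading 'y'.
        if (pos >= text.length() || text.charAt(pos) != 'y') {
            return false;
        }
        pos++;
        // (x)* : keep consuming while the lookahead character is 'x'.
        while (pos < text.length() && text.charAt(pos) == 'x') {
            pos++;
        }
        // Accept only if the whole input was consumed.
        return pos == text.length();
    }

    public static void main(String[] args) {
        System.out.println(matchesA("yxxx"));
        System.out.println(matchesA("xy"));
    }
}
```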

An example

The following JavaCC grammar example recognizes matching braces followed by zero or more line terminators and then an end of file.

Examples of legal strings in this grammar are:

{}, {{{{{}}}}} // ... etc

Examples of illegal strings are:

{}{}, }{}}, { }, {x} // ... etc

Its grammar
PARSER_BEGIN(Example)

/** Simple brace matcher. */
public class Example {

  /** Main entry point. */
  public static void main(String args[]) throws ParseException {
    Example parser = new Example(System.in);
    parser.Input();
  }

}

PARSER_END(Example)

/** Root production. */
void Input() :
{}
{
  MatchedBraces() ("\n"|"\r")* <EOF>
}

/** Brace matching production. */
void MatchedBraces() :
{}
{
  "{" [ MatchedBraces() ] "}"
}
Some executions and outputs
{{}} gives no error
$ java Example
{{}}<return>
{x gives a Lexical error
$ java Example
{x<return>
Lexical error at line 1, column 2.  Encountered: "x"
TokenMgrError: Lexical error at line 1, column 2.  Encountered: "x" (120), after : ""
        at ExampleTokenManager.getNextToken(ExampleTokenManager.java:146)
        at Example.getToken(Example.java:140)
        at Example.MatchedBraces(Example.java:51)
        at Example.Input(Example.java:10)
        at Example.main(Example.java:6)
{}} gives a ParseException
$ java Example
{}}<return>
ParseException: Encountered "}" at line 1, column 3.
Was expecting one of:
    <EOF>
    "\n" ...
    "\r" ...
        at Example.generateParseException(Example.java:184)
        at Example.jj_consume_token(Example.java:126)
        at Example.Input(Example.java:32)
        at Example.main(Example.java:6)
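
For readers who want to see the accept/reject behaviour without generating a parser, the following is a minimal hand-written Java sketch of the two productions Input and MatchedBraces. It is an illustration under assumptions, not the code JavaCC generates: the class name BraceMatcherSketch and the method accepts are hypothetical, and the sketch only returns true or false rather than distinguishing a lexical error from a ParseException.

```java
// Hand-written recognizer for the brace grammar:
//   Input         := MatchedBraces ( "\n" | "\r" )* EOF
//   MatchedBraces := "{" [ MatchedBraces ] "}"
// Hypothetical illustration only; not generated by JavaCC.
final class BraceMatcherSketch {
    private final String input;
    private int pos = 0;

    private BraceMatcherSketch(String input) {
        this.input = input;
    }

    /** Input := MatchedBraces ( "\n" | "\r" )* EOF */
    static boolean accepts(String text) {
        BraceMatcherSketch p = new BraceMatcherSketch(text);
        if (!p.matchedBraces()) {
            return false;
        }
        // Zero or more line terminators.
        while (p.pos < text.length()
                && (text.charAt(p.pos) == '\n' || text.charAt(p.pos) == '\r')) {
            p.pos++;
        }
        // EOF: nothing may remain.
        return p.pos == text.length();
    }

    /** MatchedBraces := "{" [ MatchedBraces ] "}" */
    private boolean matchedBraces() {
        if (!consume('{')) {
            return false;
        }
        // Optional nested pair, chosen by one character of lookahead.
        if (pos < input.length() && input.charAt(pos) == '{') {
            if (!matchedBraces()) {
                return false;
            }
        }
        return consume('}');
    }

    /** Consumes the expected character if it is next; otherwise leaves pos unchanged. */
    private boolean consume(char expected) {
        if (pos < input.length() && input.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts("{{}}\n"));
        System.out.println(accepts("{}}"));
    }
}
```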

Getting Started

You can use JavaCC either from the command line or through an IDE.

Use JavaCC from the command line

Download

Download the latest stable release (at least the source and the binaries) into a so-called download directory.

All JavaCC releases are available via GitHub and Maven including checksums and cryptographic signatures.

For all previous releases, please see stable releases.

Install

Once you have downloaded the files, navigate to the download directory and unzip the source file, which creates a so-called JavaCC installation directory:

$ unzip javacc-7.0.13.zip
or
$ tar xvf javacc-7.0.13.tar.gz

Then move the binary file javacc-7.0.13.jar from the download directory into a new target/ directory under the installation directory, and rename it to javacc.jar.

Then add the scripts/ directory in the JavaCC installation directory to your PATH. The JavaCC, JJTree, and JJDoc invocation scripts/executables reside in this directory.

On UNIX-based systems, the scripts may not be executable immediately. To fix this, run the following command from the javacc-7.0.13/ directory:

chmod +x scripts/javacc

Write your grammar and generate your parser

You can then create and edit a grammar file with your favorite text editor.

Then use the appropriate script for generating your parser from your grammar.

Use JavaCC within an IDE

Minimal requirements for an IDE are:

  • Support for Java
  • Support for Maven with Java

IntelliJ IDEA

The IntelliJ IDE supports Maven out of the box and offers a plugin for JavaCC development.

Eclipse IDE

Maven

Add the following dependency to your pom.xml file.

<dependency>
    <groupId>net.java.dev.javacc</groupId>
    <artifactId>javacc</artifactId>
    <version>7.0.13</version>
</dependency>

Gradle

Add the following to your build.gradle file.

repositories {
    mavenLocal()
    // JavaCC releases are published to Maven Central
    mavenCentral()
}

dependencies {
    // 'compile' was removed in Gradle 7; 'implementation' replaces it
    implementation group: 'net.java.dev.javacc', name: 'javacc', version: '7.0.13'
}

Rebuilding JavaCC

From the source installation directory

The source installation directory contains the JavaCC, JJTree and JJDoc sources, launcher scripts, example grammars and documentation, and also a bootstrap version of JavaCC needed to build JavaCC.

Prerequisites for building JavaCC with this method:

  • Ant (we require version 1.5.3 or above - you can get ant from http://ant.apache.org)
  • Maven
  • Java 8 (Java 9 and 10 are not yet supported)

Use the ant build script:

$ cd javacc
$ ant

This will build the javacc.jar file in the target/ directory.

After cloning the JavaCC GitHub repository

This is the preferred method for contributing to JavaCC.

Prerequisites for building JavaCC with this method:

  • Git
  • Ant (we require version 1.5.3 or above - you can get ant from http://ant.apache.org)
  • Maven
  • Java 8 (Java 9 and 10 are not yet supported)

Just clone the repository and then use the ant build script:

$ git clone https://github.com/javacc/javacc.git
$ cd javacc
$ ant

This will build the javacc.jar file in the target/ directory.

Community

JavaCC is by far the most popular parser generator used with Java applications with an estimated user base of over 1,000 users and more than 100,000 downloads to date.

It is maintained by the developer community which includes the original authors and Chris Ainsley, Tim Pizney and Francis Andre.

Support

Don’t hesitate to ask!

Contact the developers and community on the Google user group or email us at JavaCC Support if you need any help.

Open an issue if you found a bug in JavaCC.

For questions relating to development please join our Slack channel.

Documentation

The documentation of JavaCC is located on the website https://javacc.github.io/javacc/ and in the docs/ directory of the source code on GitHub.

It includes detailed documentation for JavaCC, JJTree, and JJDoc.

Resources

Books
  • Dos Reis, Anthony J., Compiler Construction Using Java, JavaCC, and Yacc, Wiley-Blackwell, 2012. ISBN 0-4709495-9-7 (book, pdf).
  • Copeland, Tom, Generating Parsers with JavaCC, Centennial Books, 2007. ISBN 0-9762214-3-8 (book).
Tutorials
Articles
Parsing theory
  • Alfred V. Aho, Monica S. Lam, Ravi Sethi and Jeffrey D. Ullman, Compilers: Principles, Techniques, and Tools, 2nd Edition, Addison-Wesley, 2006. ISBN 0-3211314-3-6 (book, pdf).
  • Charles N. Fischer and Richard J. Leblanc, Jr., Crafting a Compiler with C, Pearson, 1991. ISBN 0-8053216-6-7 (book).

Powered by JavaCC

JavaCC is used in many commercial applications and open source projects.

The following list highlights a few notable JavaCC projects that run interesting use cases in production, with links to the relevant grammar specifications.

| User | Use Case | Grammar File(s) |
|---|---|---|
| Apache ActiveMQ | Parsing JMS selector statements | SelectorParser.jj, HyphenatedParser.jj |
| Apache Avro | Parsing higher-level languages into Avro Schema | idl.jj |
| Apache Calcite | Parsing SQL statements | Parser.jj |
| Apache Camel | Parsing stored SQL templates | sspt.jj |
| Apache Jena | Parsing queries written in SPARQL, ARQ, SSE, Turtle and JSON | sparql_10, sparql_11, arq.jj, sse.jj, turtle.jj, json.jj |
| Apache Lucene | Parsing search queries | QueryParser.jj |
| Apache Tomcat | Parsing Expression Language (EL) and JSON | ELParser.jjt, JSONParser.jj |
| Apache Zookeeper | Optimising serialisation/deserialisation of Hadoop I/O records | rcc.jj |
| Java Parser | Parsing Java language files | java.jj |

License

JavaCC is an open source project released under the BSD License 2.0. The JavaCC project was originally developed at Sun Microsystems Inc. by Sreeni Viswanadha and Sriram Sankar.
