doctrine/lexer

Base library for a lexer that can be used in Top-Down, Recursive Descent Parsers.

Top Related Projects

  • PHP-Parser: A PHP parser written in PHP
  • ANTLR4: ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Quick Overview

Doctrine Lexer is a PHP library that provides a base class for creating lexical scanners. It's primarily used for tokenizing strings into smaller parts, which is useful in parsing and analyzing text or code. The lexer is designed to be extended for specific use cases.

Pros

  • Simple and lightweight implementation
  • Easily extendable for custom lexing needs
  • Well-integrated with other Doctrine projects
  • Provides a solid foundation for building more complex parsers

Cons

  • Limited built-in token types
  • Requires additional implementation for specific use cases
  • May be overkill for very simple tokenization tasks
  • Documentation could be more comprehensive

Code Examples

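  1. Basic usage of the Lexer class:
use Doctrine\Common\Lexer\AbstractLexer;

class MyLexer extends AbstractLexer
{
    // Patterns whose matches become tokens
    protected function getCatchablePatterns(): array
    {
        return [
            '[a-z]+',
            '\d+',
            '\s+',
        ];
    }

    // Patterns whose matches are discarded (none here)
    protected function getNonCatchablePatterns(): array
    {
        return [];
    }

    // Map a matched value to a token type
    protected function getType(&$value)
    {
        if (is_numeric($value)) {
            return 'NUMBER';
        }
        if (ctype_alpha($value)) {
            return 'WORD';
        }
        if (ctype_space($value)) {
            return 'WHITESPACE';
        }
        return 'UNKNOWN';
    }
}

$lexer = new MyLexer();
$lexer->setInput('hello 123 world');

// AbstractLexer is not iterable: advance with moveNext() and read $lookahead.
// doctrine/lexer 2.x and later return Token objects; 1.x returns arrays ($token['type']).
while ($lexer->moveNext()) {
    $token = $lexer->lookahead;
    echo $token->type . ': ' . $token->value . PHP_EOL;
}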
  2. Using the lexer to tokenize a simple mathematical expression:
use Doctrine\Common\Lexer\AbstractLexer;

class MathLexer extends AbstractLexer
{
    const T_NUMBER = 1;
    const T_OPERATOR = 2;

    protected function getCatchablePatterns(): array
    {
        return [
            '\d+',
            // "/" must be escaped: the lexer wraps all patterns in /.../ delimiters
            '[+\-*\/]',
        ];
    }

    // Whitespace is matched but never emitted as a token
    protected function getNonCatchablePatterns(): array
    {
        return ['\s+'];
    }

    protected function getType(&$value)
    {
        if (is_numeric($value)) {
            return self::T_NUMBER;
        }
        if (in_array($value, ['+', '-', '*', '/'], true)) {
            return self::T_OPERATOR;
        }
        return null;
    }
}

$lexer = new MathLexer();
$lexer->setInput('5 + 3 * 2');

// Token objects are available in doctrine/lexer 2.x and later; 1.x returns arrays.
while ($lexer->moveNext()) {
    $token = $lexer->lookahead;
    echo $token->type === MathLexer::T_NUMBER ? 'NUMBER' : 'OPERATOR';
    echo ': ' . $token->value . PHP_EOL;
}
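
Since the library is described as a base for top-down, recursive descent parsers, it may help to see how such a parser consumes a token stream. The following is a minimal, self-contained sketch and does not use doctrine/lexer: `tokenizeExpression()`, `ExpressionParser`, `nextIs()`, `consume()` and the `TOKEN_*` constants are all hypothetical names introduced for illustration. With the real library, the `lookahead` property and `moveNext()` would play roughly the roles that `nextIs()` and `consume()` play here.

```php
<?php
// Sketch only: a stand-in token stream, not the doctrine/lexer API.

const TOKEN_NUMBER   = 1;
const TOKEN_OPERATOR = 2;

/** Hypothetical helper: turn an expression into [type, value] pairs. */
function tokenizeExpression(string $input): array
{
    preg_match_all('/\d+|[+\-*\/]/', $input, $matches);

    return array_map(
        static function (string $value): array {
            return [ctype_digit($value) ? TOKEN_NUMBER : TOKEN_OPERATOR, $value];
        },
        $matches[0]
    );
}

final class ExpressionParser
{
    /** @var array<int, array{0: int, 1: string}> */
    private array $tokens;
    private int $position = 0;

    public function __construct(array $tokens)
    {
        $this->tokens = $tokens;
    }

    /** expression := term (('+' | '-') term)* */
    public function parseExpression(): int
    {
        $result = $this->parseTerm();
        while ($this->nextIs('+', '-')) {
            $operator = $this->consume();
            $right    = $this->parseTerm();
            $result   = $operator === '+' ? $result + $right : $result - $right;
        }

        return $result;
    }

    /** term := number (('*' | '/') number)*, using integer division */
    private function parseTerm(): int
    {
        $result = $this->parseNumber();
        while ($this->nextIs('*', '/')) {
            $operator = $this->consume();
            $right    = $this->parseNumber();
            $result   = $operator === '*' ? $result * $right : intdiv($result, $right);
        }

        return $result;
    }

    private function parseNumber(): int
    {
        $lookahead = $this->tokens[$this->position] ?? null;
        if ($lookahead === null || $lookahead[0] !== TOKEN_NUMBER) {
            throw new RuntimeException('Expected a number at token ' . $this->position);
        }

        return (int) $this->consume();
    }

    /** Check the lookahead token's value without consuming it. */
    private function nextIs(string ...$values): bool
    {
        $lookahead = $this->tokens[$this->position] ?? null;

        return $lookahead !== null && in_array($lookahead[1], $values, true);
    }

    /** Return the lookahead token's value and advance past it. */
    private function consume(): string
    {
        return $this->tokens[$this->position++][1];
    }
}

$parser = new ExpressionParser(tokenizeExpression('5 + 3 * 2'));
echo $parser->parseExpression() . PHP_EOL; // prints 11
```

Each grammar rule becomes one method, and operator precedence falls out of the call order: `parseExpression()` delegates to `parseTerm()`, so multiplication binds tighter than addition.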

Getting Started

To use Doctrine Lexer in your project, first install it via Composer:

composer require doctrine/lexer

Then, create a custom lexer class by extending Doctrine\Common\Lexer\AbstractLexer:

use Doctrine\Common\Lexer\AbstractLexer;

class CustomLexer extends AbstractLexer
{
    // The array return types match doctrine/lexer 3.x and are accepted by older releases
    protected function getCatchablePatterns(): array
    {
        return [
            // Define your patterns here
        ];
    }

    protected function getNonCatchablePatterns(): array
    {
        return [
            // Define patterns to ignore here
        ];
    }

    protected function getType(&$value)
    {
        // Define your token types here
    }
}

Finally, instantiate your custom lexer and use it to tokenize input:

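$lexer = new CustomLexer();
$lexer->setInput('your input string');
// Advance with moveNext(); the current token is exposed as $lexer->lookahead
while ($lexer->moveNext()) {
    $token = $lexer->lookahead;
    // Process tokens
}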
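
To see how catchable and non-catchable patterns interact, the sketch below combines both lists into a single regular expression and splits the input with `preg_split()`. It is a simplified illustration of the general idea and not the library's actual code; `sketchScan()` is a hypothetical function name, and the result is a plain array rather than the library's token type.

```php
<?php
// Sketch only: illustrates regex-based splitting, not doctrine/lexer itself.

/**
 * Split $input into tokens. Text matched by a catchable pattern is kept
 * (one capture group each); text matched by a non-catchable pattern is dropped.
 *
 * @param string[] $catchable
 * @param string[] $nonCatchable
 * @return array<int, array{value: string, position: int}>
 */
function sketchScan(string $input, array $catchable, array $nonCatchable): array
{
    // Patterns share "/" as the delimiter, so a literal "/" inside one must be escaped.
    $regex = '/(' . implode(')|(', $catchable) . ')|' . implode('|', $nonCatchable) . '/iu';

    // Keep captured delimiters, drop empty pieces, and record byte offsets.
    $flags  = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE;
    $pieces = preg_split($regex, $input, -1, $flags);

    $tokens = [];
    foreach ($pieces as [$value, $offset]) {
        $tokens[] = ['value' => $value, 'position' => $offset];
    }

    return $tokens;
}

$tokens = sketchScan('5 + 3 * 2', ['\d+', '[+\-*\/]'], ['\s+']);
foreach ($tokens as $token) {
    echo $token['position'] . ': ' . $token['value'] . PHP_EOL;
}
```

Whitespace matches the non-catchable pattern, which has no capture group, so it separates tokens without appearing in the result. A real lexer would then assign each value a type, which is the job of `getType()` in the classes above.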

Competitor Comparisons

PHP-Parser: A PHP parser written in PHP

Pros of PHP-Parser

  • More comprehensive parsing capabilities, handling full PHP syntax
  • Supports generating and modifying abstract syntax trees (ASTs)
  • Includes a pretty printer for code generation

Cons of PHP-Parser

  • Larger and more complex, potentially overkill for simple tokenization tasks
  • Steeper learning curve due to its extensive feature set
  • May have higher performance overhead for basic lexing operations

Code Comparison

PHP-Parser:

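use PhpParser\ParserFactory;
// PHP-Parser 4.x API; 5.x replaces create() with createForNewestSupportedVersion()
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$ast = $parser->parse('<?php echo "Hello World!";');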

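Doctrine Lexer:

// AbstractLexer cannot be instantiated directly; MyLexer is a concrete subclass
$lexer = new MyLexer();
$lexer->setInput('SELECT * FROM users');
$lexer->moveNext(); // $lexer->lookahead now holds the first token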

Summary

PHP-Parser is the more powerful tool for parsing and manipulating PHP code, offering full syntax support and AST generation. Doctrine Lexer is a smaller library focused on tokenization: it does not parse PHP itself but provides a base class for lexing whatever input format a subclass defines. PHP-Parser offers greater flexibility at the cost of complexity, while Doctrine Lexer is the simpler choice, and may carry less overhead, when only tokenization is needed.

ANTLR4: ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Pros of ANTLR4

  • More powerful and feature-rich, supporting complex grammar definitions
  • Generates parsers for multiple target languages (Java, C#, Python, etc.)
  • Extensive documentation and community support

Cons of ANTLR4

  • Steeper learning curve due to its complexity
  • Larger footprint and potentially slower for simple parsing tasks
  • Requires additional runtime dependencies

Code Comparison

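ANTLR4:

lexer grammar SimpleLexer;
ID : [a-zA-Z]+;
NUM : [0-9]+;
WS : [ \t\r\n]+ -> skip;

Doctrine Lexer:

// AbstractLexer takes its patterns from methods, not properties
class SimpleLexer extends AbstractLexer
{
    protected function getCatchablePatterns(): array
    {
        return ['[a-zA-Z]+', '[0-9]+'];
    }

    protected function getNonCatchablePatterns(): array
    {
        return ['\s+'];
    }

    protected function getType(&$value)
    {
        return is_numeric($value) ? 'NUM' : 'ID';
    }
}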

ANTLR4 provides a more declarative and concise syntax for defining lexer rules, while Doctrine Lexer requires a more programmatic approach. ANTLR4's grammar files are more expressive and can handle complex language structures, whereas Doctrine Lexer is better suited for simpler lexical analysis tasks in PHP projects.

README

Doctrine Lexer

Base library for a lexer that can be used in Top-Down, Recursive Descent Parsers.

This lexer is used in Doctrine Annotations and in Doctrine ORM (DQL).

https://www.doctrine-project.org/projects/lexer.html