DoctorWkt/acwj

A Compiler Writing Journey

10,422
1,010
10,422
26

Top Related Projects

9,633

C in four functions

Write a simple interpreter of C. Inspired by c4 and largely based on it.

2,000

The lcc retargetable ANSI C compiler

Quick Overview

The DoctorWkt/acwj repository is a step-by-step tutorial on building a compiler from scratch. It guides readers through writing a compiler for a subset of C, progressing from scanning and evaluating simple arithmetic expressions toward the author's stated goal of a self-compiling compiler. The project is divided into numbered parts, each focusing on one aspect of compiler construction.

Pros

  • Detailed step-by-step explanation of compiler construction
  • Practical approach with working code examples
  • Covers a wide range of compiler topics, from lexical analysis to code generation
  • Suitable for both beginners and intermediate programmers interested in compilers

Cons

  • Focuses on a subset of C, not a full C compiler
  • May not cover some advanced optimization techniques
  • Requires a significant time investment to work through all parts
  • Some parts may be challenging for absolute beginners in programming

Getting Started

To get started with the acwj project:

  1. Clone the repository:

    git clone https://github.com/DoctorWkt/acwj.git
    
  2. Navigate to the desired part. Each part lives in its own numbered directory, for example:

    cd acwj/01_Scanner
    
  3. Read the Readme.md file in each part for the explanation of that step and for how to build and run the code.

  4. Follow the tutorial in order, starting from 00_Introduction and progressing through the subsequent parts.

  5. Experiment with the code in each part before moving on to the next.

Note: This project is primarily an educational resource and not a code library, so there are no specific code examples or quick start instructions for using it as a library.

Competitor Comparisons

9,633

C in four functions

Pros of c4

  • Extremely compact and concise implementation (roughly 500 lines of code)
  • Self-contained in a single C file, making it easy to understand and modify
  • Demonstrates core compiler concepts with minimal complexity

Cons of c4

  • Limited language support compared to acwj's more comprehensive approach
  • Lacks detailed explanations and documentation found in acwj
  • May be too simplified for learning advanced compiler techniques

Code Comparison

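c4 (simplified fragment, not verbatim source; tk, p, ty, INT and ID_SIZE are declared elsewhere):

int *id_name, id_type;
int *id = id_name = malloc(sizeof(int) * (ID_SIZE + 1));
while ((tk = *p)) {                                            // stop at the terminating NUL
    p++; putchar(tk);                                          // echo the character just read
    if (tk >= 'a' && tk <= 'f') *id++ = tk;                    // letters a..f are stored as-is
    else if (tk >= '0' && tk <= '9') { ty = INT; *id++ = tk; } // digits mark the type as INT
}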

acwj:

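static struct ASTnode *primary(void) {
  struct ASTnode *n;
  int id;

  switch (Token.token) {
    case T_INTLIT:
      // Integer literal: make a leaf node holding its value
      n = mkastleaf(A_INTLIT, Token.intvalue);
      scan(&Token);
      return (n);
    case T_IDENT:
      // Identifier: it must already be a known global symbol
      id = findglob(Text);
      if (id == -1)
        fatals("Unknown variable", Text);
      n = mkastleaf(A_IDENT, id);
      scan(&Token);
      return (n);
    default:
      fatald("Syntax error, token", Token.token);
  }
  return (NULL);  // Not reached, assuming fatald() exits; avoids falling off the end
}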
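To make the acwj snippet easier to read in isolation, here is a minimal, self-contained sketch of what a leaf-building primary() step does. It is not code from the repository: the struct layout, parse_primary, and the rule that a single lowercase letter maps to a symbol slot are all assumptions made for illustration, and the input is read straight from a string rather than from a scanner.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for acwj's AST types. */
enum { A_INTLIT, A_IDENT };

struct ASTnode {
  int op;                       /* A_INTLIT or A_IDENT */
  int intvalue;                 /* literal value, or symbol slot for identifiers */
  struct ASTnode *left, *right; /* children; both NULL for a leaf */
};

/* Allocate a leaf node: an AST node with no children. */
struct ASTnode *mkastleaf(int op, int intvalue) {
  struct ASTnode *n = malloc(sizeof(*n));
  assert(n != NULL);
  n->op = op;
  n->intvalue = intvalue;
  n->left = n->right = NULL;
  return n;
}

/* Parse one primary factor at *src and advance *src past it.
   Digits form an integer literal. A single lowercase letter is treated as
   an identifier whose slot is its offset from 'a' (an assumption that
   stands in for a real symbol table). Returns NULL on anything else. */
struct ASTnode *parse_primary(const char **src) {
  const char *p = *src;
  struct ASTnode *n;

  while (*p == ' ')
    p++;
  if (isdigit((unsigned char)*p)) {
    int value = 0;
    while (isdigit((unsigned char)*p))
      value = value * 10 + (*p++ - '0');
    n = mkastleaf(A_INTLIT, value);
  } else if (*p >= 'a' && *p <= 'z') {
    n = mkastleaf(A_IDENT, *p++ - 'a');
  } else {
    return NULL;
  }
  *src = p;
  return n;
}
```

Called twice on the string "  42 x", this sketch yields an A_INTLIT leaf with value 42 and then an A_IDENT leaf with slot 23. Returning NULL instead of aborting is a choice made here so that the caller decides how to report a syntax error.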

Write a simple interpreter of C. Inspired by c4 and largely based on it.

Pros of write-a-C-interpreter

  • Focuses specifically on building a C interpreter, providing a more targeted learning experience
  • Includes a step-by-step tutorial in the README, making it easier for beginners to follow along
  • Implements an interpreter for a subset of C in a single file, offering a more compact and self-contained project

Cons of write-a-C-interpreter

  • Less comprehensive coverage of compiler concepts compared to acwj
  • Lacks the detailed explanations and documentation found in acwj's per-part write-ups
  • May not provide as much insight into real-world compiler development practices

Code Comparison

write-a-C-interpreter:

void expression(int level) {
    // ... (implementation details)
}

acwj:

struct ASTnode *binexpr(int ptp) {
    struct ASTnode *left, *right;
    int tokentype;
    // ... (implementation details)
}

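The snippets show two approaches to parsing expressions. In write-a-C-interpreter, expression(level) handles every operator in a single function, with level acting as the minimum operator precedence to accept. In acwj, binexpr(ptp) receives the precedence of the previous token and returns an AST node for the subexpression, leaving code generation to a later pass over the tree.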
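The precedence-driven, tree-building style can be sketched in a few dozen lines. The following is an illustrative sketch only, not acwj's binexpr(): it reads characters directly from a string instead of using a scanner, and the names parse_binexpr, op_precedence, eval_ast and the precedence values 10 and 20 are assumptions chosen for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

enum { A_ADD, A_SUBTRACT, A_MULTIPLY, A_DIVIDE, A_INTLIT };

struct ASTnode {
  int op;
  int intvalue;                 /* only meaningful for A_INTLIT */
  struct ASTnode *left, *right;
};

static struct ASTnode *mknode(int op, struct ASTnode *l, struct ASTnode *r, int v) {
  struct ASTnode *n = malloc(sizeof(*n));
  assert(n != NULL);
  n->op = op; n->left = l; n->right = r; n->intvalue = v;
  return n;
}

/* Binding power of a binary operator; 0 means "not an operator". */
static int op_precedence(char c) {
  switch (c) {
    case '+': case '-': return 10;
    case '*': case '/': return 20;
    default:            return 0;
  }
}

/* Map an operator character to its AST node type. */
static int op_to_ast(char c) {
  switch (c) {
    case '+': return A_ADD;
    case '-': return A_SUBTRACT;
    case '*': return A_MULTIPLY;
    default:  return A_DIVIDE;
  }
}

static void skip_spaces(const char **src) {
  while (**src == ' ')
    (*src)++;
}

/* Parse a run of digits into an A_INTLIT leaf. */
static struct ASTnode *parse_intlit(const char **src) {
  int value = 0;
  skip_spaces(src);
  assert(isdigit((unsigned char)**src));
  while (isdigit((unsigned char)**src))
    value = value * 10 + (*(*src)++ - '0');
  return mknode(A_INTLIT, NULL, NULL, value);
}

/* Parse a binary expression. ptp is the precedence of the operator to our
   left; we only consume operators that bind more tightly than ptp, which
   gives both correct precedence and left associativity. */
struct ASTnode *parse_binexpr(const char **src, int ptp) {
  struct ASTnode *left = parse_intlit(src);
  skip_spaces(src);
  while (op_precedence(**src) > ptp) {
    char op = *(*src)++;
    struct ASTnode *right = parse_binexpr(src, op_precedence(op));
    left = mknode(op_to_ast(op), left, right, 0);
    skip_spaces(src);
  }
  return left;
}

/* Walk the tree and compute its value (no check for division by zero). */
int eval_ast(const struct ASTnode *n) {
  switch (n->op) {
    case A_ADD:      return eval_ast(n->left) + eval_ast(n->right);
    case A_SUBTRACT: return eval_ast(n->left) - eval_ast(n->right);
    case A_MULTIPLY: return eval_ast(n->left) * eval_ast(n->right);
    case A_DIVIDE:   return eval_ast(n->left) / eval_ast(n->right);
    default:         return n->intvalue;
  }
}
```

With this sketch, parsing "2 + 3 * 4" with an initial ptp of 0 produces an A_ADD root whose right child is the A_MULTIPLY node, and eval_ast returns 14; "10 - 4 - 3" evaluates to 3 because equal-precedence operators are not consumed by the recursive call. Keeping the tree separate from evaluation is what lets a compiler swap eval_ast for a code generator later.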

2,000

The lcc retargetable ANSI C compiler

Pros of lcc

  • More mature and feature-complete C compiler
  • Extensively documented with a companion book
  • Wider platform support and portability

Cons of lcc

  • Less beginner-friendly due to complexity
  • Not actively maintained (last commit in 2015)
  • Larger codebase, potentially harder to understand fully

Code Comparison

lcc (from lcc.c):

int main(int argc, char *argv[]) {
    int i, j, nf;
    static int initted;

    progname = argv[0];
    if (initted++ == 0) {
        // ... initialization code ...
    }
    // ... more code ...
}

acwj (from main.c):

int main(int argc, char *argv[]) {
  int i;

  // Initialise our variables
  init_vars();

  // Scan and parse the input file
  scan(&Token);
  genpreamble();
  statements();
  genpostamble();
  fclose(Outfile);
  exit(0);
}
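The acwj main() above runs its stages in a fixed order: emit a preamble, translate the statements, emit a postamble. The sketch below mimics only that ordering. Everything in it is hypothetical: the "print <int>;" input language, the push/print/halt output lines, the in-memory Output buffer (used here in place of an output file), and the function names are inventions for illustration, not acwj's code generator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory output buffer, standing in for an output file. */
struct Output {
  char text[256];
  size_t len;
};

/* Append a string to the output, aborting if the buffer would overflow. */
static void emit(struct Output *out, const char *s) {
  size_t n = strlen(s);
  assert(out->len + n < sizeof out->text);
  memcpy(out->text + out->len, s, n + 1);   /* n + 1 copies the NUL too */
  out->len += n;
}

static void gen_preamble(struct Output *out)  { emit(out, "; begin program\n"); }
static void gen_postamble(struct Output *out) { emit(out, "halt\n"); }

/* Translate each "print <int>;" statement into two made-up instructions.
   Returns the number of statements compiled, or -1 on a syntax error. */
static int gen_statements(const char *src, struct Output *out) {
  int count = 0;
  for (;;) {
    int value, used = 0;
    char line[32];
    /* %n stores how many characters were consumed; it stays 0 if the
       pattern fails before reaching it (for example, a missing ';'). */
    if (sscanf(src, " print %d ;%n", &value, &used) != 1 || used == 0)
      break;
    snprintf(line, sizeof line, "push %d\nprint\n", value);
    emit(out, line);
    src += used;
    count++;
  }
  src += strspn(src, " \t\n");              /* allow trailing whitespace */
  return *src == '\0' ? count : -1;
}

/* Driver: preamble, then statements, then postamble. */
int compile(const char *src, struct Output *out) {
  int count;
  gen_preamble(out);
  count = gen_statements(src, out);
  gen_postamble(out);
  return count;
}
```

Starting from a zero-initialized Output, compile("print 1; print 22;", &out) returns 2 and leaves six lines in out.text: the comment line, two push/print pairs, and halt. An input with a missing semicolon, such as "print 1", returns -1.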

Summary

lcc is a more comprehensive C compiler with broader support, while acwj is a step-by-step tutorial for building a simple compiler. lcc offers a complete solution but may be overwhelming for beginners, whereas acwj provides a gradual learning experience with a focus on educational value.

README

A Compiler Writing Journey

In this GitHub repository, I'm documenting my journey to write a self-compiling compiler for a subset of the C language. I'm also writing out the details so that, if you want to follow along, there will be an explanation of what I did and why, with some references back to the theory of compilers.

But not too much theory; I want this to be a practical journey.

Here are the steps I've taken so far:

There isn't a schedule or timeline for the future parts, so just keep checking back here to see if I've written any more.

Copyrights

I have borrowed some of the code, and lots of ideas, from the SubC compiler written by Nils M Holm. His code is in the public domain. I think that my code is substantially different enough that I can apply a different license to my code.

Unless otherwise noted,

  • all source code and scripts are (c) Warren Toomey under the GPL3 license.
  • all non-source code documents (e.g. English documents, image files) are (c) Warren Toomey under the Creative Commons BY-NC-SA 4.0 license.