Compiler Design Project

Syntax Analysis
Detection System

An interactive exploration of parsing techniques, context-free grammars, and syntax tree visualization — the heart of every compiler.

Try the Parser Learn More

0 Parser Types

0 Grammar Rules

0 Test Cases

parser.js

function parseExpr() {
  let node = parseTerm();
  while (token === '+' || token === '-') {
    let op = token;
    advance();
    node = { type: 'BinaryExpr',
             op, left: node,
             right: parseTerm() };
  }
  return node;
}

01 — Overview

Introduction to Syntax Analysis

Understanding the backbone of compiler design

🔍

What is Syntax Analysis?

Syntax analysis (parsing) is the second phase of a compiler. It takes a stream of tokens from the lexical analyzer and determines whether the token sequence can be generated by the grammar of the source language.

The parser constructs a parse tree (or syntax tree) that represents the hierarchical syntactic structure of the source program, verifying that the program follows the language's grammatical rules.

⚙️

Role in Compiler Design

The syntax analyzer sits between the lexical analyzer (scanner) and the semantic analyzer in the compiler pipeline.

Source Code

→

Lexer Tokens

→

Parser Parse Tree

→

Semantic AST

→

Code Gen Output

⚡

Lexical Analysis vs. Syntax Analysis

Aspect	Lexical Analysis	Syntax Analysis
Phase	First phase	Second phase
Input	Character stream	Token stream
Output	Tokens	Parse tree / AST
Grammar	Regular expressions	Context-Free Grammar (CFG)
Errors	Invalid characters, malformed tokens	Structural errors, missing operators
Tool	Finite Automaton	Pushdown Automaton
Example	`123` → `NUM(123)`	`NUM + NUM` → valid expression

02 — Formal Language

Context-Free Grammar Definition

The formal rules our parser uses to validate expressions

CFG for Arithmetic Expressions

A context-free grammar G = (V, Σ, P, S) where:

V (Variables): {Expr, Term, Factor, Primary}

Σ (Terminals): {NUMBER, ID, +, -, *, /, ^, (, ), unary-}

S (Start symbol): Expr

Production Rules (P):

1. Expr → Expr + Term

2. Expr → Expr − Term

3. Expr → Term

4. Term → Term * Factor

5. Term → Term / Factor

6. Term → Factor

7. Factor → Primary ^ Factor

8. Factor → Primary

9. Primary → NUMBER

10. Primary → ID

11. Primary → ( Expr )

12. Primary → − Primary

Operator Precedence

The grammar encodes precedence through its layered structure:

Highest Unary minus −x

High Exponentiation ^ (right-assoc)

Medium Multiply *, Divide /

Low Add +, Subtract −

Grammar Properties

Unambiguous: Every valid string has exactly one parse tree
Left-recursive eliminated: Parser uses iterative loops to handle left-recursion
Right-recursive for ^: Exponentiation is right-associative as intended
Supports grouping: Parentheses override default precedence

03 — Interactive Demo

Parser Implementation

Enter an expression to see parsing in action — live tokenization, step-by-step parsing, and tree visualization

Enter Expression

Quick examples:

🔄 Recursive Descent Parser

A top-down parser that uses a set of mutually recursive procedures — one for each non-terminal in the grammar. It processes input from left to right and constructs a leftmost derivation.

✅ Advantages

Easy to implement and understand
Mirrors grammar structure directly
Excellent error reporting
Can handle complex semantic actions

⚠️ Limitations

Cannot handle left-recursive grammars directly
May require backtracking
Limited to LL grammars

📊 LL(1) Parser

A table-driven predictive parser that reads input Left-to-right, produces a Leftmost derivation, and uses 1 token of lookahead. It uses FIRST and FOLLOW sets to build a parsing table.

✅ Advantages

No recursion overhead
Predictable O(n) performance
Grammar-agnostic implementation

⚠️ Limitations

Requires grammar transformation
Cannot handle ambiguous grammars
Larger code for table construction

📚 LR(0) Parser

A bottom-up parser that reads input Left-to-right and constructs a Rightmost derivation in reverse. Uses shift-reduce operations with a state stack.

✅ Advantages

Handles a wider class of grammars
Can parse left-recursive grammars
Very powerful and general

⚠️ Limitations

Complex to implement manually
Harder to provide good error messages
Large parsing tables

04 — Validation

Test Cases & Results

Comprehensive testing with valid and invalid inputs

0 passed | 0 failed

05 — Real World

Applications of Syntax Analysis

Where parsing powers the tools we use every day

🛠️

Compilers & Interpreters

Every programming language compiler (GCC, Clang, javac) uses syntax analysis as a core phase. The parser converts source code into ASTs that enable optimization and code generation.

GCC LLVM V8 Engine

📝

Code Editors & IDEs

VS Code, IntelliJ, and other IDEs use incremental parsers for syntax highlighting, auto-completion, code folding, and real-time error detection as you type.

VS Code Tree-sitter LSP

🔎

Linters & Formatters

Tools like ESLint, Prettier, and Black parse source code to enforce coding standards, detect potential bugs, and automatically format code to consistent styles.

ESLint Prettier Black

🗄️

Database Query Engines

SQL engines parse queries to build query execution plans. The parser validates syntax, resolves table references, and enables query optimization.

PostgreSQL MySQL SQLite

🌐

Web Browsers

Browsers parse HTML, CSS, and JavaScript using specialized parsers. The HTML parser builds the DOM tree, CSS parser creates CSSOM, and JS parsers compile code to bytecode.

Blink Gecko WebKit

🤖

Natural Language Processing

NLP systems use parsers to analyze sentence structure, extract meaning, and understand relationships between words — powering chatbots, translators, and voice assistants.

spaCy NLTK Stanford NLP

06 — Summary

Conclusion

Key takeaways and future directions

📋 Project Summary

This project demonstrated the design and implementation of a comprehensive syntax analysis detection system. We explored three major parsing techniques — Recursive Descent, LL(1), and LR(0) — all working with a well-defined context-free grammar for arithmetic expressions.

Key accomplishments include:

Defined a 12-rule CFG supporting arithmetic operations with proper operator precedence and associativity
Implemented three distinct parsing algorithms with step-by-step tracing
Built comprehensive syntax error detection with meaningful, positional error messages
Created interactive parse tree visualization using HTML5 Canvas
Validated the system with 15+ test cases covering valid and invalid inputs

🎓 Key Learnings

Grammar design matters: A well-structured grammar directly enables clean parser implementation and correct precedence handling
Top-down vs Bottom-up: Each approach has tradeoffs — recursive descent is intuitive but limited; LR parsing is powerful but complex
Error recovery is crucial: Real-world parsers need robust error recovery to provide useful feedback, not just fail on first error
Visualization aids understanding: Parse trees make abstract syntactic structure tangible and debuggable

🚀 Possible Improvements

Extend grammar to support assignment statements, conditionals, and loops (mini programming language)
Implement error recovery strategies (panic mode, phrase-level recovery)
Add semantic analysis phase (type checking, scope resolution)
Support LALR(1) and SLR(1) parsing algorithms
Generate intermediate code (three-address code) from the AST
Integrate with Lex/Yacc or ANTLR for comparison

Syntax Analysis Detection System

Introduction to Syntax Analysis

What is Syntax Analysis?

Role in Compiler Design

Lexical Analysis vs. Syntax Analysis

Context-Free Grammar Definition

CFG for Arithmetic Expressions

Production Rules (P):

Operator Precedence

Grammar Properties

Parser Implementation

Error Details

🔄 Recursive Descent Parser

✅ Advantages

⚠️ Limitations

📊 LL(1) Parser

✅ Advantages

⚠️ Limitations

📚 LR(0) Parser

✅ Advantages

⚠️ Limitations

Test Cases & Results

Applications of Syntax Analysis

Compilers & Interpreters

Code Editors & IDEs

Linters & Formatters

Database Query Engines

Web Browsers

Natural Language Processing

Conclusion

📋 Project Summary

🎓 Key Learnings

🚀 Possible Improvements

Syntax Analysis
Detection System