Home Introduction Grammar Parser Testing Applications Conclusion
Compiler Design Project

Syntax Analysis
Detection System

An interactive exploration of parsing techniques, context-free grammars, and syntax tree visualization — the heart of every compiler.

0 Parser Types
0 Grammar Rules
0 Test Cases
parser.js
function parseExpr() {
  let node = parseTerm();
  while (token === '+' || token === '-') {
    let op = token;
    advance();
    node = { type: 'BinaryExpr',
             op, left: node,
             right: parseTerm() };
  }
  return node;
}

Introduction to Syntax Analysis

Understanding the backbone of compiler design

🔍

What is Syntax Analysis?

Syntax analysis (parsing) is the second phase of a compiler. It takes a stream of tokens from the lexical analyzer and determines whether the token sequence can be generated by the grammar of the source language.

The parser constructs a parse tree (or syntax tree) that represents the hierarchical syntactic structure of the source program, verifying that the program follows the language's grammatical rules.

⚙️

Role in Compiler Design

The syntax analyzer sits between the lexical analyzer (scanner) and the semantic analyzer in the compiler pipeline.

Source Code
Lexer Tokens
Parser Parse Tree
Semantic AST
Code Gen Output

Lexical Analysis vs. Syntax Analysis

Aspect Lexical Analysis Syntax Analysis
Phase First phase Second phase
Input Character stream Token stream
Output Tokens Parse tree / AST
Grammar Regular expressions Context-Free Grammar (CFG)
Errors Invalid characters, malformed tokens Structural errors, missing operators
Tool Finite Automaton Pushdown Automaton
Example 123NUM(123) NUM + NUM → valid expression

Context-Free Grammar Definition

The formal rules our parser uses to validate expressions

CFG for Arithmetic Expressions

A context-free grammar G = (V, Σ, P, S) where:

V (Variables): {Expr, Term, Factor, Primary}
Σ (Terminals): {NUMBER, ID, +, -, *, /, ^, (, ), unary-}
S (Start symbol): Expr

Production Rules (P):

1. Expr Expr + Term
2. Expr Expr Term
3. Expr Term
4. Term Term * Factor
5. Term Term / Factor
6. Term Factor
7. Factor Primary ^ Factor
8. Factor Primary
9. Primary NUMBER
10. Primary ID
11. Primary ( Expr )
12. Primary Primary

Operator Precedence

The grammar encodes precedence through its layered structure:

Highest Unary minus −x
High Exponentiation ^ (right-assoc)
Medium Multiply *, Divide /
Low Add +, Subtract

Grammar Properties

  • Unambiguous: Every valid string has exactly one parse tree
  • Left-recursive eliminated: Parser uses iterative loops to handle left-recursion
  • Right-recursive for ^: Exponentiation is right-associative as intended
  • Supports grouping: Parentheses override default precedence

Parser Implementation

Enter an expression to see parsing in action — live tokenization, step-by-step parsing, and tree visualization

Quick examples:

🔄 Recursive Descent Parser

A top-down parser that uses a set of mutually recursive procedures — one for each non-terminal in the grammar. It processes input from left to right and constructs a leftmost derivation.

✅ Advantages
  • Easy to implement and understand
  • Mirrors grammar structure directly
  • Excellent error reporting
  • Can handle complex semantic actions
⚠️ Limitations
  • Cannot handle left-recursive grammars directly
  • May require backtracking
  • Limited to LL grammars

📊 LL(1) Parser

A table-driven predictive parser that reads input Left-to-right, produces a Leftmost derivation, and uses 1 token of lookahead. It uses FIRST and FOLLOW sets to build a parsing table.

✅ Advantages
  • No recursion overhead
  • Predictable O(n) performance
  • Grammar-agnostic implementation
⚠️ Limitations
  • Requires grammar transformation
  • Cannot handle ambiguous grammars
  • Larger code for table construction

📚 LR(0) Parser

A bottom-up parser that reads input Left-to-right and constructs a Rightmost derivation in reverse. Uses shift-reduce operations with a state stack.

✅ Advantages
  • Handles a wider class of grammars
  • Can parse left-recursive grammars
  • Very powerful and general
⚠️ Limitations
  • Complex to implement manually
  • Harder to provide good error messages
  • Large parsing tables

Test Cases & Results

Comprehensive testing with valid and invalid inputs

0 passed | 0 failed

Applications of Syntax Analysis

Where parsing powers the tools we use every day

🛠️

Compilers & Interpreters

Every programming language compiler (GCC, Clang, javac) uses syntax analysis as a core phase. The parser converts source code into ASTs that enable optimization and code generation.

GCC LLVM V8 Engine
📝

Code Editors & IDEs

VS Code, IntelliJ, and other IDEs use incremental parsers for syntax highlighting, auto-completion, code folding, and real-time error detection as you type.

VS Code Tree-sitter LSP
🔎

Linters & Formatters

Tools like ESLint, Prettier, and Black parse source code to enforce coding standards, detect potential bugs, and automatically format code to consistent styles.

ESLint Prettier Black
🗄️

Database Query Engines

SQL engines parse queries to build query execution plans. The parser validates syntax, resolves table references, and enables query optimization.

PostgreSQL MySQL SQLite
🌐

Web Browsers

Browsers parse HTML, CSS, and JavaScript using specialized parsers. The HTML parser builds the DOM tree, CSS parser creates CSSOM, and JS parsers compile code to bytecode.

Blink Gecko WebKit
🤖

Natural Language Processing

NLP systems use parsers to analyze sentence structure, extract meaning, and understand relationships between words — powering chatbots, translators, and voice assistants.

spaCy NLTK Stanford NLP

Conclusion

Key takeaways and future directions

📋 Project Summary

This project demonstrated the design and implementation of a comprehensive syntax analysis detection system. We explored three major parsing techniques — Recursive Descent, LL(1), and LR(0) — all working with a well-defined context-free grammar for arithmetic expressions.

Key accomplishments include:

  • Defined a 12-rule CFG supporting arithmetic operations with proper operator precedence and associativity
  • Implemented three distinct parsing algorithms with step-by-step tracing
  • Built comprehensive syntax error detection with meaningful, positional error messages
  • Created interactive parse tree visualization using HTML5 Canvas
  • Validated the system with 15+ test cases covering valid and invalid inputs

🎓 Key Learnings

  • Grammar design matters: A well-structured grammar directly enables clean parser implementation and correct precedence handling
  • Top-down vs Bottom-up: Each approach has tradeoffs — recursive descent is intuitive but limited; LR parsing is powerful but complex
  • Error recovery is crucial: Real-world parsers need robust error recovery to provide useful feedback, not just fail on first error
  • Visualization aids understanding: Parse trees make abstract syntactic structure tangible and debuggable

🚀 Possible Improvements

  • Extend grammar to support assignment statements, conditionals, and loops (mini programming language)
  • Implement error recovery strategies (panic mode, phrase-level recovery)
  • Add semantic analysis phase (type checking, scope resolution)
  • Support LALR(1) and SLR(1) parsing algorithms
  • Generate intermediate code (three-address code) from the AST
  • Integrate with Lex/Yacc or ANTLR for comparison