History of this project

The parol Parser Generator started as a personal journey to master LL(k) parsing with the concise means of deterministic finite automata.

Basic influence on its design had two parser generators which could not be more contrary in their approaches

But both of them have their own quirks and idiosyncrasies.

Bison tends to generate mysterious shift/reduce or reduce/reduce conflicts which can be sometimes hard to understand and ANTRL generates recursive descending parsers which are prone to stack overflows. It is easy to write (or generate) a program that crashes a parser generated by ANTLR.

On the other hand Bison generates deterministic parsers which are terse actually by using finite automata and ANTLR solves the problem of choosing the next production for a certain non-terminal by utilizing deterministic finite automata too.

So why not have the best of both worlds?

With this goal in mind I started my first attempts using F# as programming language (Lelek). But finally I stopped working on this project because it didn't feel 'right' anymore.

Anyhow, Lelek was a necessary step for me to become confident about what is feasible and what is not.

A lot of attempts followed and I made a shift to Rust which felt more vibrant and compelling to me.

And so parol was born - actually as a rewrite of Lelek. But I was willing to jettison some parts of Lelek and replace them with new approaches.

What I took over:

  • The basic approach of using regexes to generate scanners
  • Using DFAs to solve the Rule Decision Problem, although I changed the way to obtain the k-sets for productions
  • The basic ideas behind the structure of the grammar description language - and their resemblance to Bison's input format
  • The separation of language description and language implementation
  • The strategy to check a grammar first for some preconditions before trying to generate data for a parser to guarantee the termination of certain algorithms
  • The algorithm for visualizing parse trees

What I changed:

  • The part of recursion detection
  • The part of generating k-sets for productions (roughly all algorithms FIRST(k), FOLLOW(k))
  • The overall wording is hopefully more precise - e.g. I prefer 'Production' over 'Rule' now
  • The parser runtime was separated as a small crate

What I added:

  • Infer and generate all types of the grammar's AST, so your grammar description is sufficient for parol to build a completely functioning acceptor with no extra effort - this is real rapid prototyping for your language!
  • Built-in tools for
    • generating new crates
    • checking a grammar for certain properties (left-recursion, reachability, productivity)
    • left-factoring of a given grammar
    • calculating FIRST(k) and FOLLOW(k) sets
    • generating random sentences of a given grammar description
  • Scanner states, aka Start conditions
  • Build script integration to invoke parol automatically during the build of your own crate
  • An extension for Visual Studio Code and a Language Server
  • And all those features Lelek never received