Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Semantic Actions

The parol parser generator creates traits with functions that represent semantic actions. The generated parser then calls these functions at parse time at the appropriate points with correct arguments.

The generated trait for user actions (i.e. semantic actions) will be named after the following scheme:

#![allow(unused)]
fn main() {
pub trait <NameOfYourGrammar>GrammarTrait<'t> {
    // ...
}
}

For C#, the same concept is generated as an interface and a base class. The interface is named I<NameOfYourGrammar>Actions, and the base class is named <NameOfYourGrammar>Actions with overridable methods for each non-terminal.

The lifetime parameter <'t> can be left out if the types used don't hold references to the scanned text. This is automatically deduced.

Your grammar processing item implements this trait and can override the functions it needs.

It does not need to implement all trait functions, because all of them have default implementations.

All semantic actions are generated for non-terminals of your input grammar, and are typed accordingly.

The parol parser generator creates a trait with functions that represent semantic actions. Here, the semantic actions are typed and they are generated for the non-terminals of your input grammar instead of for productions of the expanded grammar.

You therefore do not have to work directly with ParseTreeType, although you still encounter items of type Token. Also, the expanded version of your grammar is much less relevant in day-to-day use.

parol's great merit is that it can automatically generate an adapter layer that provides conversion to typed grammar items. This layer follows simple, universal rules by generating the production-bound semantic actions accordingly.

This and the automatic AST type inference are the most outstanding properties of parol.

We will use the example calc for detailed explanations.

The file calc_grammar_trait.rs contains the generated traits and types we are interested in.

First we will have a look at the CalcGrammarTrait at the top of this file. For each non-terminal of the input grammar calc.par it contains exactly one semantic action.

#![allow(unused)]
fn main() {
/// Semantic actions trait generated for the user grammar
/// All functions have default implementations.
pub trait CalcGrammarTrait<'t> {
    /// Semantic action for non-terminal 'Calc'
    fn calc(&mut self, _arg: &Calc<'t>) -> Result<()> {
        Ok(())
    }
    // ...
}
}

The approach taken in this example is quite interesting. We only implement the semantic action for the start symbol of our grammar: Calc.

The implementation can be found in calc_grammar.rs.

Near the end you can find the one and only semantic action we implement here and thereby creating the functionality of a calculator language.

#![allow(unused)]
fn main() {
impl<'t> CalcGrammarTrait<'t> for CalcGrammar<'t> {
    /// Semantic action for non-terminal 'Calc'
    fn calc(&mut self, arg: &Calc<'t>) -> Result<()> {
        self.process_calc(arg)?;
        Ok(())
    }
}
}

But what is the advantage of implementing only the start symbol's semantic action? Since the start symbol is the root node of each and every concrete parse tree, we know that the generated type for it should comprise the complete input as the result of the parsing.

The key to this is the structure of the generated type Calc. It resembles the structure of all productions belonging to the non-terminal Calc. There is actually only one production for Calc:

Calc: { Instruction ";"^ };
#![allow(unused)]
fn main() {
///
/// Type derived for non-terminal Calc
///
pub struct Calc<'t> {
    pub calc_list: Vec<CalcList<'t>>,
}
}

The type Calc is basically a vector, which can be deduced from the repetition construct at the right-hand side of the production ({ Instruction ";"^ }).

The elements of the vector are of type CalcList that is defined this way:

#![allow(unused)]
fn main() {
///
/// Type derived for non-terminal calcList
///
pub struct CalcList<'t> {
    pub instruction: Instruction<'t>,
}
}

And in turn the type Instruction looks like this:

#![allow(unused)]
fn main() {
///
/// Type derived for non-terminal instruction
///
pub enum Instruction<'t> {
    Assignment(InstructionAssignment<'t>),
    LogicalOr(InstructionLogicalOr<'t>),
}
}

The latter one is an enum with two variants because the non-terminal Instruction has two productions:

// ---------------------------------------------------------
// INSTRUCTION
Instruction: Assignment;
Instruction: LogicalOr;

This concept is applied for all non-terminals of your grammar. Actually your grammar became typified.

This means eventually that any variable of type Calc can represent a validly parsed input sentence that belongs to the grammar defined by calc.par.

You then only have to evaluate the content of this value as done in this calculator example. I recommend studying this example more deeply; the approach will then become obvious.

As mentioned earlier the implementation can be found here: calc_grammar.rs.