Export Model Contract

This chapter defines the contract for external tools that consume parol export output.

Video Introduction

For a quick introduction to what parser model export can do for you, watch the video "Parser Model Export" on YouTube.

Also consider visiting the Parol Parser Generator YouTube channel. If you don't want to miss new videos, please subscribe.

Goal

The export model is meant to provide language-agnostic grammar facts that are sufficient to build parser/scanner generators in other ecosystems.

The contract intentionally separates two concerns:

Export model: grammar-derived, algorithm-specific facts
External consumer config: generation policy and runtime behavior choices

Boundary: What belongs in export

The export payload contains data that is intrinsic to the grammar and parser construction:

parser algorithm kind (Llk or Lalr1)
ordered non-terminals and start symbol index
production list and RHS symbol structure
LL(k) lookahead automata or LALR(1) parse table
scanner terminals, states, comments, state-local skip token sets, transitions, and lookahead expressions
production datatype metadata derived from grammar type analysis

This data is considered portable across target languages.

Boundary: What does not belong in export

Configuration that is policy-driven or target-runtime specific should be provided by the external tool, not embedded as language-independent export metadata.

Typical examples:

enabling/disabling recovery in generated runtime wiring
parse tree omission or parse tree construction strategy
naming/style conventions in generated target-language code
file/project layout and scaffolding preferences
target-specific runtime integration flags

Keeping these settings outside the core export keeps the schema stable and avoids coupling to one runtime style.

Versioning and compatibility

The export schema version is carried in ParserExportModel.version.

Machine-readable JSON Schema for version = 2:

crates/parol/schemas/parser-export-model.v2.schema.json

Legacy schema (still published for compatibility with older consumers):

crates/parol/schemas/parser-export-model.v1.schema.json

Quick validation example

Using ajv-cli (Node.js), you can validate an exported JSON file against the schema:

npx ajv-cli validate \
	-s crates/parol/schemas/parser-export-model.v2.schema.json \
	-d crates/parol/tests/data/arg_tests/export_llk.expected.json

To validate a LALR export, replace the -d path with your export_lalr1 JSON (or any JSON generated by parol export).

Compatibility expectations for consumers:

reject unsupported future versions explicitly
accept known versions exactly
treat unknown top-level fields as non-breaking when possible
treat missing required fields for a known version as an error

Recommended consumer strategy:

Read version first.
Dispatch to a version-specific decoder.
Validate required sections for the selected algorithm.
Apply external consumer config after model decoding.
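
The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of parol: the names load_model, decode_v1, decode_v2 and UnsupportedVersionError are hypothetical, and the decoders are placeholders that only copy the raw data.

```python
import json


class UnsupportedVersionError(ValueError):
    """Raised when the export carries a version this consumer cannot decode."""


def decode_v1(raw: dict) -> dict:
    # Placeholder decoder (hypothetical): a real one would map the raw JSON
    # onto typed structures for schema version 1.
    return {"version": 1, "algorithm": raw.get("algorithm"), "raw": raw}


def decode_v2(raw: dict) -> dict:
    # Placeholder decoder (hypothetical) for schema version 2.
    return {"version": 2, "algorithm": raw.get("algorithm"), "raw": raw}


# Known versions are matched exactly; anything else is rejected.
DECODERS = {1: decode_v1, 2: decode_v2}


def load_model(text: str, config: dict) -> dict:
    raw = json.loads(text)

    # Step 1: read the version before touching anything else.
    version = raw.get("version")
    if isinstance(version, bool) or version not in DECODERS:
        raise UnsupportedVersionError(f"unsupported export version: {version!r}")

    # Step 2: dispatch to the version-specific decoder.
    model = DECODERS[version](raw)

    # Step 3: validate the sections required for the selected algorithm
    # (reduced here to a check of the algorithm kind).
    if model["algorithm"] not in ("Llk", "Lalr1"):
        raise ValueError(f"unknown algorithm: {model['algorithm']!r}")

    # Step 4: attach the external consumer config only after decoding.
    model["config"] = config
    return model
```

Unknown top-level fields are simply carried along in raw and never cause a failure, which matches the expectation that such fields are non-breaking.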

Minimum required sections per algorithm

Required for all versions currently supported by parol export:

version
algorithm
non_terminal_names
start_symbol_index
productions
scanner
production_datatypes

Additional required section by algorithm:

Llk: lookahead_automata is required; lalr_parse_table must be null
Lalr1: lalr_parse_table is required; lookahead_automata should be an empty array
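
A consumer might check these requirements with a small function like the one below. It is a sketch, not an official validator: validate_sections is a hypothetical name, and treating an absent lalr_parse_table the same as null is an assumption of this example.

```python
# Fields required for every supported version, as listed above.
REQUIRED_FIELDS = (
    "version",
    "algorithm",
    "non_terminal_names",
    "start_symbol_index",
    "productions",
    "scanner",
    "production_datatypes",
)


def validate_sections(model: dict) -> list:
    """Return a list of problems; an empty list means the model passed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in model
    ]

    algorithm = model.get("algorithm")
    if algorithm == "Llk":
        if "lookahead_automata" not in model:
            problems.append("Llk requires lookahead_automata")
        # Assumption: an absent lalr_parse_table is treated like null.
        if model.get("lalr_parse_table") is not None:
            problems.append("Llk requires lalr_parse_table to be null")
    elif algorithm == "Lalr1":
        if model.get("lalr_parse_table") is None:
            problems.append("Lalr1 requires lalr_parse_table")
        if model.get("lookahead_automata"):
            problems.append("Lalr1 expects lookahead_automata to be empty")
    elif algorithm is not None:
        problems.append(f"unknown algorithm: {algorithm!r}")

    return problems
```

Returning a list of problems instead of raising on the first one lets a tool report every contract violation in a single run.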

Minimal JSON examples

The following examples are intentionally reduced: they illustrate the shape of the contract, not complete real-world content.

LL(k) export example

{
	"version": 1,
	"algorithm": "Llk",
	"non_terminal_names": ["Start"],
	"start_symbol_index": 0,
	"productions": [
		{
			"production_index": 0,
			"lhs_index": 0,
			"rhs": [],
			"text": "Start: ;"
		}
	],
	"lookahead_automata": [
		{
			"non_terminal_index": 0,
			"non_terminal_name": "Start",
			"prod0": 0,
			"k": 0,
			"transitions": []
		}
	],
	"lalr_parse_table": null,
	"scanner": {
		"terminals": [],
		"scanner_states": [
			{
				"scanner_state": 0,
				"scanner_name": "INITIAL",
				"line_comments": [],
				"block_comments": [],
				"auto_newline": true,
				"auto_ws": true,
				"allow_unmatched": false,
				"skip_tokens": [],
				"transitions": []
			}
		]
	},
	"production_datatypes": [
		{
			"production_index": 0,
			"non_terminal_name": "Start",
			"production_attribute": "None",
			"type_name": "StartEmpty",
			"type_kind": "Struct",
			"rust_type": "StartEmpty",
			"members": []
		}
	]
}

LALR(1) export example

{
	"version": 1,
	"algorithm": "Lalr1",
	"non_terminal_names": ["Start"],
	"start_symbol_index": 0,
	"productions": [
		{
			"production_index": 0,
			"lhs_index": 0,
			"rhs": [],
			"text": "Start: ;"
		}
	],
	"lookahead_automata": [],
	"lalr_parse_table": {
		"actions": ["Accept"],
		"states": [
			{
				"actions": [],
				"gotos": []
			}
		]
	},
	"scanner": {
		"terminals": [],
		"scanner_states": [
			{
				"scanner_state": 0,
				"scanner_name": "INITIAL",
				"line_comments": [],
				"block_comments": [],
				"auto_newline": true,
				"auto_ws": true,
				"allow_unmatched": false,
				"skip_tokens": [],
				"transitions": []
			}
		]
	},
	"production_datatypes": [
		{
			"production_index": 0,
			"non_terminal_name": "Start",
			"production_attribute": "None",
			"type_name": "StartEmpty",
			"type_kind": "Struct",
			"rust_type": "StartEmpty",
			"members": []
		}
	]
}
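
To show how a consumer might navigate a decoded model, the following Python sketch groups production texts by their left-hand-side non-terminal and resolves the start symbol. It uses only fields that appear in the examples above and assumes that lhs_index and start_symbol_index are indices into non_terminal_names; the function names are hypothetical.

```python
from collections import defaultdict


def productions_by_non_terminal(model: dict) -> dict:
    """Group production texts by the name of their left-hand-side non-terminal."""
    names = model["non_terminal_names"]
    grouped = defaultdict(list)
    for production in model["productions"]:
        # Assumption: lhs_index is an index into non_terminal_names.
        grouped[names[production["lhs_index"]]].append(production["text"])
    return dict(grouped)


def start_symbol(model: dict) -> str:
    """Resolve the start symbol name from its index."""
    return model["non_terminal_names"][model["start_symbol_index"]]
```

Applied to either example above, productions_by_non_terminal yields a single entry for Start, and start_symbol returns "Start".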

External config contract (consumer side)

Consumers should define their own explicit config input (CLI flags, file, or API params) for policy decisions. A minimal cross-language config typically includes:

recovery mode and limits
parse tree strategy
naming/casing policy
output layout and file naming
optional runtime integration switches

This config is intentionally separate from parser export JSON.

Example consumer config (outside export JSON):

{
	"recovery": {
		"enabled": true,
		"max_sync_tokens": 64
	},
	"parse_tree": {
		"mode": "omit"
	},
	"naming": {
		"type_case": "PascalCase",
		"member_case": "snake_case"
	},
	"output": {
		"directory": "./generated",
		"single_file": false
	}
}
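
A consumer could load such a config into a typed structure with explicit defaults, as in the Python sketch below. The class ConsumerConfig, the function config_from_dict and all default values (including the "build" parse tree mode) are hypothetical choices for this illustration; only the key names are taken from the example config above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumerConfig:
    # All defaults are hypothetical; each consumer picks its own.
    recovery_enabled: bool = False
    max_sync_tokens: int = 0
    parse_tree_mode: str = "build"
    type_case: str = "PascalCase"
    member_case: str = "snake_case"
    output_directory: str = "."
    single_file: bool = False


def config_from_dict(raw: dict) -> ConsumerConfig:
    """Build a config from parsed JSON, falling back to defaults per key."""
    defaults = ConsumerConfig()
    recovery = raw.get("recovery", {})
    parse_tree = raw.get("parse_tree", {})
    naming = raw.get("naming", {})
    output = raw.get("output", {})
    return ConsumerConfig(
        recovery_enabled=recovery.get("enabled", defaults.recovery_enabled),
        max_sync_tokens=recovery.get("max_sync_tokens", defaults.max_sync_tokens),
        parse_tree_mode=parse_tree.get("mode", defaults.parse_tree_mode),
        type_case=naming.get("type_case", defaults.type_case),
        member_case=naming.get("member_case", defaults.member_case),
        output_directory=output.get("directory", defaults.output_directory),
        single_file=output.get("single_file", defaults.single_file),
    )
```

Because the config is decoded independently of the export JSON, the same exported model can be reused with different policies, for example one run with recovery enabled and one without.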

Reproducibility note

If you need exact reproducibility of generated output, persist both:

parser export JSON
consumer config used during generation

Optionally record tool and schema provenance in your own metadata layer.
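
One possible shape for such a metadata layer is a small provenance record that stores content hashes of both inputs. The Python sketch below is an assumption about how a consumer might do this, not something parol prescribes; provenance_record and its field names are hypothetical.

```python
import hashlib
import json


def provenance_record(export_json: bytes, config_json: bytes, tool_version: str) -> dict:
    """Summarize the two generation inputs so a later run can be compared."""
    return {
        # Content hashes detect any change in either input file.
        "export_sha256": hashlib.sha256(export_json).hexdigest(),
        "config_sha256": hashlib.sha256(config_json).hexdigest(),
        # The schema version is read from the export itself.
        "schema_version": json.loads(export_json)["version"],
        # Version string of the consuming tool, supplied by the caller.
        "tool_version": tool_version,
    }
```

Storing this record next to the generated output makes it easy to tell whether a difference in output comes from a changed grammar export, a changed config, or a changed tool.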
