| [ << ] | [ < ] | [ Up ] | [ > ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Procedures for generating parsers by formal grammar.
These procedures can generate a LALR(1) parser.
These procedures also can generate a tokenizer using regular expression.
In the parser which is generated by these procedures,
any procedures which return two values,
the first value is a symbol expressing a terminal symbol
and the second value is an semantic object,
can be a lexical analyser.
Regular expression which is available in these procedures is shown as follows.
Java standard regular expression can’t use in these procedures.
| rs | concatenate r and s |
| r|s | r or s |
| r* | repeat r more than 0 times |
| r+ | repeat r more than 1 time |
| r? | optional |
| (r) | r itself |
| . | any character except newline |
| [a-zA] | character set |
| [^a-zA] | complement character set |
creates a format grammar object which can use by procedure parse-grammar.
Syntax of grammar is shown as follows.
((left-value-of-grammar-rule (right-value-of-grammar-rule …) procedure-when-the-rule-is-reduced …)
Terminal symbols and nonterminal symbols must be Scheme symbols.
The start symbol must be the symbol upper case S.
"Procedure-when-the-rule-is-reduced" will be called
with the results of right-value-of-grammar-rule.
For example, when there is a rules shown as follows
(these are a rules defining usual arithmetic operation),
S → S + T S → S - T S → T T → T * F T → T / F T → F F → lparen S rparen F → num
definition of calculating obeying the grammar is shown as follows.
(make-grammar
`((S (S + T) ,(lambda (t1 x t2) (+ t1 t2)))
(S (S - T) ,(lambda (t1 x t2) (- t1 t2)))
(S (T) ,(lambda (t1) t1))
(T (T * F) ,(lambda (f1 x f2) (* f1 f2)))
(T (T / F) ,(lambda (f1 x f2) (* f1 f2)))
(T (F) ,(lambda (f1) f1))
(F (lparen S rparen)
,(lambda (x s1 y) s1))
(F (num)
,(lambda (v1) (string->number v1)))))
Symbols lparen, rparen and num will be given by lexical analyser.
If the rule has a conflicts,
this procedure doesn’t report any errors.
To get the conflicts you should use procedure get-conflicts.
parses objects with the given grammar and lexical analyser.
"Lexical analyser" is a procedure which returns two values,
the first value is a symbol expressing a terminal symbol
and the second value is an semantic object.
gets conflicts about the grammar rule. Syntax of the result is shown as follows.
(…
(shift-reduce terminal symbol to shift
terminal symbol to reduce) ;shift-reduce conflict
(reduce-reduce terminal symbol to reduce
terminal symbol to reduce) ;reduce-reduce conflict
…)
generates a tokenizer by regular expression. Syntax of lexer-patterns is shown as follows.
((string representing a regulae expression . termial symbol) …)
Example 1: a tokenizer which tokenize arithmetic operators and numbers.
(make-regexp-tokenize-pattern
'(("[0-9]+(\\.[0-9]+)?" . num
("[
]+" . whitespace) ; space+tab+newline
("\\(" . lparen
("\\)" . rparen
("\\+" . +
("-" . -
("\\*" . *
("\\/" . /))
creates a tokenizer. string is a string to be tokenized.
creates a tokenizer. iport is an input port to be tokenized.
tokenizes the input and returns a termial symbol and matched string.
| [ << ] | [ < ] | [ Up ] | [ > ] | [ >> ] |
This document was generated on August 9, 2012 using texi2html 5.0.