***** WARNING *****
This is an ALPHA pod! The contents herein may not reflect reality. :)
NAME
docs/optable.pod - PGE operator precedence table and parser
VERSION
$Revision: 20630 $
DESCRIPTION
PGE::OPTable is the bottom-up, shift/reduce-style parser component of the Parrot Grammar Engine (PGE) suite. PGE is a Parrot implementation of Perl6 rules.
FUTURE CONSIDERATIONS
- - Shift/reduce application to more general grammar productions than just operators.
- - Static state machine transition table generation. (Optimization)
- - tighter and looser should work even when their argument operator hasn't been defined yet.
DEFINITIONS
"operator"
An operator is a function, most often mathematical, that usually takes one or two arguments (operands, also called terms) and returns a calculated result. Operators are often characterized as pure (no side effects). Obvious exceptions to this rule are the increment (++) and decrement (--) operators and assignment operators such as +=, *=, etc.
Operators which have one operand are called unary operators. Unary operator symbols may appear in the prefix position (in front of the term) or in the postfix position (following the term). Binary operators have two operands. Binary operator symbols usually appear in the infix position between the two operands. Ternary operators also exist, such as the C-style ternary conditional operator: expression ? true case : false case. See Synopsis 5 Rules for more information.
"expression"
An expression is a combination of operators, operands (terms such as variables and values), and grouping symbols that describes a computation. Expressions return a result.
"term"
A term is the atomic unit on which an operator operates. Operand is the more formal mathematical term for term. :) In OPTable-parsed expressions, a term is a variable or primitive value.
"precedence"
Precedence is the order in which operators are evaluated. Higher precedence operators are evaluated before lower precedence operators.
"precedence level"
Computer languages usually have a table of operator precedences. Operators at the top of the table have higher precedence than those below. Operators at the same vertical level in the table have equal precedence. An operator's level in the table is called its precedence level. OPTable uses integers to signify precedence: the greater the OPTable precedence integer, the higher the precedence level of the operator. An operator with precedence level 22 has higher precedence than an operator with precedence level 5 and will be evaluated first.
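To make the role of integer precedence levels concrete, here is a minimal sketch in Python of a two-stack shift/reduce loop for binary infix operators. It is not PGE's implementation: the table contents, the function name parse, and the choice to treat every operator as left-associative are assumptions made for illustration only. The levels 22 and 5 are taken from the example above.

```python
# Hypothetical precedence table: a larger integer means a higher precedence level.
PRECEDENCE = {"+": 5, "-": 5, "*": 22, "/": 22}


def parse(tokens):
    """Turn a flat token list such as ['a', '+', 'b', '*', 'c'] into a
    nested tuple (operator, left, right) using two stacks."""
    terms, ops = [], []

    def reduce_top():
        # Reduce: pop one operator and its two operands, push the subtree.
        op = ops.pop()
        right = terms.pop()
        left = terms.pop()
        terms.append((op, left, right))

    for tok in tokens:
        if tok in PRECEDENCE:
            # Reduce while the stacked operator binds at least as tightly
            # (assumption: every operator here is left-associative).
            while ops and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                reduce_top()
            ops.append(tok)    # shift the operator
        else:
            terms.append(tok)  # shift a term
    while ops:
        reduce_top()
    return terms[0]
```

With this table, parse(['a', '+', 'b', '*', 'c']) groups the level-22 multiplication before the level-5 addition, so the multiplication ends up deeper in the tree and would be evaluated first.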
"shift/reduce"
TODO
SYNTAX
grammar NAMESPACE;
proto OPERATOR_NAME ADVERBIAL_CLAUSES* { ... }
The grammar statement at the top names the namespace in which the optable will be generated. Written in Perl6 style, the grammar statement is translated by PGE into valid Parrot namespace syntax.
The proto statement declares an operator to be added to the precedence table and parser. All operator attributes are defined using the Perl6 adverbial style of is ADVERB(). Adverbial clauses are separated by white space. Adverbs can have 0-arity, in which case they can be written without parentheses. Adverbial arguments are written comma-separated in parentheses.
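The shape of a proto statement (an operator name followed by whitespace-separated "is ADVERB(...)" clauses) can be illustrated with a small Python sketch that splits a statement into its parts. This is not how PGE parses its input; the regular expressions and the function name parse_proto are hypothetical, and the sketch assumes arguments are quoted strings or bare words that contain no embedded quotes.

```python
import re

# One adverb argument: a quoted string or a bare word (illustrative assumption).
ARG = r"""'[^']*'|"[^"]*"|[^\s,()'"]+"""

# "is ADVERB" optionally followed by a parenthesized, comma-separated list.
CLAUSE_RE = re.compile(
    rf"\bis\s+(\w+)(?:\(\s*((?:{ARG})(?:\s*,\s*(?:{ARG}))*)?\s*\))?"
)

# "proto NAME clauses { ... }"
PROTO_RE = re.compile(r"^\s*proto\s+('[^']*'|\S+)(.*?)\{\s*\.\.\.\s*\}\s*$", re.S)


def parse_proto(statement):
    """Return (operator_name, [(adverb, [arguments]), ...])."""
    match = PROTO_RE.match(statement)
    if match is None:
        raise ValueError("not a proto statement: %r" % statement)
    name = match.group(1).strip("'")
    clauses = []
    for clause in CLAUSE_RE.finditer(match.group(2)):
        raw_args = re.findall(ARG, clause.group(2) or "")
        # Drop the surrounding quotes from quoted arguments.
        args = [a[1:-1] if a[0] in "'\"" else a for a in raw_args]
        clauses.append((clause.group(1), args))
    return name, clauses
```

Applied to a statement such as proto 'infix:~&' is equiv('infix:*') { ... }, the sketch yields the operator name 'infix:~&' and a single clause, equiv, with one argument. A 0-arity adverb written without parentheses comes back with an empty argument list.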
"Adverbial Precedence Clauses"
- - is precedence(PRECEDENCE_LEVEL_STRING)
is precedence takes a single string argument containing the precedence level for the operator it modifies. The argument string is formatted as an integer precedence level followed by an equals sign, e.g. '22='.
- - is tighter(PREVIOUSLY_DEFINED_OPERATOR_NAME)
is tighter takes a previously defined operator name as its single argument. The argument operator's precedence level plus 1 is used as the precedence level for the operator that is tighter modifies.
- - is looser(PREVIOUSLY_DEFINED_OPERATOR_NAME)
is looser takes a previously defined operator name as its single argument. The argument operator's precedence level minus 1 is used as the precedence level for the operator that is looser modifies.
- - is equiv(PREVIOUSLY_DEFINED_OPERATOR_NAME)
is equiv takes a previously defined operator name as its single argument. The argument operator's precedence level is used as the precedence level for the operator that is equiv modifies.
- - is assoc(DESCRIPTION)
DESCRIPTION can be one of 'list', 'left', 'right', 'non', or 'chain'. is assoc declares the associativity of the operator. The absence of an is assoc adverb indicates that the operation is associative, or that the order of evaluation of two or more instances of the operator in an expression is unimportant. 'left' signifies left association; evaluation should occur from the left. Conversely, 'right' signifies right association; evaluation should occur from the right. 'non' declares that this operator doesn't strongly associate to the left or right. 'list' specifies that the operator is associated as a list context. 'chain' declares chained association such as a = b = c = 10 or a < 10 < b.
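The precedence clauses above can be read as simple arithmetic on integer levels, and 'left' versus 'right' association as two ways of grouping a run of the same operator. The Python sketch below follows only the description in this document; it is not taken from PGE's source. The function names resolve_levels and group, and the (name, adverb, argument) tuple layout, are hypothetical.

```python
def resolve_levels(protos):
    """Compute integer precedence levels for (name, adverb, argument)
    declarations, processed in declaration order."""
    levels = {}
    for name, adverb, arg in protos:
        if adverb == "precedence":
            # '22=' is an integer level followed by an equals sign.
            levels[name] = int(arg.rstrip("="))
        elif adverb == "tighter":
            levels[name] = levels[arg] + 1
        elif adverb == "looser":
            levels[name] = levels[arg] - 1
        elif adverb == "equiv":
            levels[name] = levels[arg]
        else:
            raise ValueError("not a precedence adverb: %r" % adverb)
    return levels


def group(op, operands, assoc):
    """Group `a op b op c ...` as nested (op, left, right) tuples."""
    items = list(operands)
    if assoc == "left":
        tree = items[0]
        for item in items[1:]:
            tree = (op, tree, item)      # ((a op b) op c)
    elif assoc == "right":
        tree = items[-1]
        for item in reversed(items[:-1]):
            tree = (op, item, tree)      # (a op (b op c))
    else:
        raise ValueError("this sketch covers only 'left' and 'right'")
    return tree
```

Because the levels dictionary is filled in declaration order, referring to an operator that has not been declared yet raises a KeyError, which mirrors the requirement that the argument of tighter, looser, or equiv be previously defined.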
"Adverbial Clauses"
- - is parsed()
PGE::OPTable normally generates the parsing code for an operator based on the operator name, which usually consists of the operator's orientation followed by a colon and then the operator symbol. In Perl6, 'infix:*' is the name of the infix multiplication operator; 'infix:x' likewise names the Perl6 repeat operator. The is parsed adverb declares that this particular operator is parsed using the Perl6 match-conforming method given as the adverb's argument, instead of code auto-generated from the operator's name.
- - is pastrule()
The is pastrule adverb defines the pastrule attribute of an operator. During later processing by the Tree Grammar Engine (TGE) compiler tool, the pastrule attribute can be used to specify custom TGE processing. TODO: needs concrete example
- - is post()
The is post adverb specifies the Parrot opcode that implements the semantics of this particular operator. is post('add') is used to annotate the 'infix:+' operator in languages where the infix + symbol denotes addition.
- - is expect()
The is expect adverb is used to specify the hexadecimal identifier of the next token OPTable should expect???
- - is returns()
The is returns adverb specifies the type of the result for this operator. TGE can use this attribute to construct a correctly typed temporary to hold the intermediate results of operations for later use or combination.
- - is pir()
The is pir adverb specifies an emit string that can be used during code generation. Parrot assembly is emitted as Parrot Intermediate Representation (PIR), hence the adverb name pir. The argument string is a format string, in the style of the C printf format string. %r represents the result of the operation. %0, %1, etc. represent the operands of the operator.
proto 'infix:~&' is equiv('infix:*') is pir(" %r = bands %0, %1") { ... }
- - is nullterm
The is nullterm adverb specifies that the operator is a 0-arity function that doesn't take any operands.
- - is stop() TODO: needs help, stolen shamelessly from compilers/pge/PGE/OPTable.pir
The is stop adverb declares a string to be matched directly or a sub(rule) to be called to check for a match.
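As an illustration of the is pir format string described above, the following Python sketch substitutes %r with a result name and %0, %1, etc. with operand names. It shows only the placeholder convention stated in this document; it is not the code generator used by PGE or TGE, and the function name expand_pir and the register names in the usage note are hypothetical.

```python
import re


def expand_pir(template, result, operands):
    """Fill in an 'is pir' style template: %r is the result,
    %0, %1, ... are the operands (by position)."""
    def substitute(match):
        key = match.group(1)
        if key == "r":
            return result
        return operands[int(key)]

    # Match %r or % followed by one or more digits.
    return re.sub(r"%(r|\d+)", substitute, template)
```

For the 'infix:~&' example above, calling expand_pir with the template " %r = bands %0, %1", a result name of $S0, and operand names $S1 and $S2 produces a line that assigns the bands of $S1 and $S2 to $S0.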
IMPLEMENTATION
TODO: how it works inside :)
LANGUAGE NOTES
None.
EXAMPLES
languages/perl6/src/grammar_optok.pg
languages/cardinal/src/cardinal_optok.pg
ATTACHMENTS
None.
FOOTNOTES
None.
REFERENCES
http://en.wikipedia.org/wiki/Order_of_operations
http://en.wikipedia.org/wiki/Associativity
http://en.wikipedia.org/wiki/Bottom-up_parsing
http://www.ozonehouse.com/mark/blog/code/PeriodicTable.html
http://dev.perl.org/perl6/doc/design/syn/S03.html
compilers/tge
compilers/past
languages/perl6/src/PAST.pir
languages/perl6/src/POST.pir