***** WARNING *****

This is an ALPHA pod! The contents herein may not reflect reality. :)

NAME

docs/optable.pod - PGE operator precedence table and parser

VERSION

$Revision: 19683 $

DESCRIPTION

PGE::OPTable is the bottom-up, shift/reduce-style parser component of the Parrot Grammar Engine (PGE) suite. PGE is a Parrot implementation of Perl6 rules.

FUTURE CONSIDERATIONS

- Applying shift/reduce parsing to more general grammar productions than just operators.
- Static state machine transition table generation. (Optimization)
- is tighter and is looser should work even when their argument operator hasn't been defined yet.

DEFINITIONS

"operator"

An operator is a function, most often a mathematical one, that usually takes one or two arguments (operands, also called terms) and returns a calculated result. Operators are often pure, that is, free of side effects. Obvious exceptions to this rule are the increment (++) and decrement (--) operators and assignment operators such as +=, *=, etc.

Operators which have one operand are called unary operators. Unary operator symbols may appear in the prefix position (in front of the term) or in the postfix position (following the term). Binary operators have two operands. Binary operator symbols usually appear in the infix position, between the two operands. Ternary operators also exist, such as the C-style ternary conditional operator: expression ? true case : false case. See Synopsis 5 Rules for more information.

"expression"

An expression is a combination of operators, operands (terms such as variables and values) and grouping symbols that describes a computation. Expressions return a result.

"term"

A term is the atomic unit which an operator operates on. Operand is the more formal mathematical term for term. :) In expressions parsed by OPTable, a term is a variable or primitive value.

"precedence"

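Precedence determines the order in which the operators in an expression are evaluated. Higher precedence operators are evaluated before lower precedence operators, so in a + b * c the multiplication is evaluated first when * has higher precedence than +.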

"precedence level"

Computer languages usually have a table of operator precedences. Operators at the top of the table have higher precedence than those below. Operators at the same vertical level in the table have equal precedence. An operator's level in the table is called its precedence level. OPTable uses integers to signify precedence. The greater the OPTable precedence integer, the higher the precedence level of the operator. An operator with precedence level 22 has higher precedence than an operator with precedence level 5 and will be evaluated first.
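As a rough illustration of how integer precedence levels can drive a shift/reduce parser, here is a minimal Python sketch. It is not the OPTable implementation. The LEVELS table, its values, and the parse function are hypothetical names chosen for this example, and reducing operators of equal level from left to right is an assumption of the sketch.

```python
# Hypothetical precedence table: operator symbol -> integer precedence level.
# The levels are made up for illustration; a greater integer binds tighter.
LEVELS = {"+": 5, "-": 5, "*": 22, "/": 22}


def parse(tokens):
    """Shift/reduce parse of a flat token list into a nested tuple tree.

    Any token not found in LEVELS is treated as a term. Operators of equal
    level are reduced left to right (an assumption of this sketch).
    """
    terms = []  # stack of terms and already reduced subtrees
    ops = []    # stack of operators waiting for their right operand

    def reduce():
        # Pop one operator and its two operands, push the combined subtree.
        op = ops.pop()
        right = terms.pop()
        left = terms.pop()
        terms.append((op, left, right))

    for tok in tokens:
        if tok in LEVELS:
            # Reduce while the stacked operator binds at least as tightly
            # as the incoming one, then shift the incoming operator.
            while ops and LEVELS[ops[-1]] >= LEVELS[tok]:
                reduce()
            ops.append(tok)
        else:
            terms.append(tok)  # shift a term
    while ops:
        reduce()
    return terms[0]


print(parse(["a", "+", "b", "*", "c"]))
```

With these levels, a + b * c groups as ('+', 'a', ('*', 'b', 'c')): the level 22 operator is reduced before the level 5 operator.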

"shift/reduce"

TODO

SYNTAX

grammar NAMESPACE;

proto OPERATOR_NAME ADVERBIAL_CLAUSES* { ... }

The grammar statement at the top of the file names the namespace in which the optable will be generated. It is written in Perl6 style and translated by PGE into valid Parrot namespace syntax.

The proto statement declares an operator to be added to the precedence table and parser. All operator attributes are defined using the Perl6 adverbial style of is ADVERB(). Adverbial clauses are separated by white space. An adverb that takes no arguments (0-arity) can be written without parentheses. Adverbial arguments are written comma separated in parentheses.

"Adverbial Precedence Clauses"

- is precedence(PRECEDENCE_LEVEL_STRING)

is precedence takes a single string argument which contains the precedence level for the operator being declared. The argument string is formatted as an integer precedence level followed by an equals sign, e.g. '22='.

- is tighter(PREVIOUSLY_DEFINED_OPERATOR_NAME)

is tighter takes a previously defined operator name as its single argument. The operator being declared is given that operator's precedence level plus 1.

- is looser(PREVIOUSLY_DEFINED_OPERATOR_NAME)

is looser takes a previously defined operator name as its single argument. The operator being declared is given that operator's precedence level minus 1.

- is equiv(PREVIOUSLY_DEFINED_OPERATOR_NAME)

is equiv takes a previously defined operator name as its single argument. The operator being declared is given the same precedence level as that operator.

- is assoc(DESCRIPTION)

DESCRIPTION can be one of 'list', 'left', 'right', 'non', or 'chain'.

is assoc declares the associativity of the operator. The absence of an is assoc adverb indicates that the operation is associative, that is, that the order of evaluation of two or more instances of the operator in an expression is unimportant. 'left' signifies left association; evaluation should occur from the left. Conversely, 'right' signifies right association; evaluation should occur from the right. 'non' declares that this operator doesn't strongly associate to the left or right. 'list' specifies that the operator associates as a list. 'chain' declares chained association such as a = b = c = 10 or a < 10 < b.
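The way is precedence, is tighter, is looser and is equiv arrive at an integer level, as described above, can be sketched in Python. This follows the wording of this document only and is not taken from the PGE source. The levels dictionary, parse_level and declare are hypothetical helpers, and the operator names and the starting level '5=' are illustrative.

```python
# Sketch of resolving the precedence adverbs to integer levels.
levels = {}  # operator name -> integer precedence level


def parse_level(spec):
    """Turn an is precedence string such as '22=' into the integer 22."""
    if not spec.endswith("="):
        raise ValueError("expected an integer followed by '=': %r" % spec)
    return int(spec[:-1])


def declare(name, precedence=None, tighter=None, looser=None, equiv=None):
    """Record the level of `name` from exactly one precedence adverb."""
    given = [a for a in (precedence, tighter, looser, equiv) if a is not None]
    if len(given) != 1:
        raise ValueError("give exactly one precedence adverb")
    if precedence is not None:
        level = parse_level(precedence)
    elif tighter is not None:
        level = levels[tighter] + 1  # KeyError if not previously defined
    elif looser is not None:
        level = levels[looser] - 1   # KeyError if not previously defined
    else:
        level = levels[equiv]        # KeyError if not previously defined
    levels[name] = level
    return level


declare("infix:+", precedence="5=")
declare("infix:*", tighter="infix:+")
declare("infix:~&", equiv="infix:*")
declare("infix:,", looser="infix:+")
print(levels)
```

In this sketch 'infix:*' and 'infix:~&' both end up at level 6 and 'infix:,' at level 4. Referring to an operator that has not been declared yet fails with a KeyError, which mirrors the requirement that the argument be a previously defined operator.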

"Adverbial Clauses"

- is parsed()

PGE::OPTable normally generates the parsing code for an operator based on the operator name, which usually consists of the operator's orientation followed by a colon and then the operator symbol. In Perl6, 'infix:*' is the operator name of the infix multiplication operator. 'infix:x' likewise represents the Perl6 repeat operator. The is parsed adverb declares that this particular operator is parsed using the Perl6 match conforming method given as the adverb's argument, instead of auto-generated code based on the operator's name.

- is pastrule()

The is pastrule adverb defines the pastrule attribute of an operator. During later processing by the Tree Grammar Engine (TGE) compiler tool, the pastrule attribute can be used to specify custom TGE processing.

TODO: needs concrete example

- is post()

The is post adverb specifies the Parrot opcode that implements the semantics of this particular operator. For example, is post('add') is used to annotate the 'infix:+' operator in languages where the infix + symbol denotes addition.

- is expect()

The is expect adverb is used to specify the hexadecimal identifier of the next token OPTable should expect???

- is returns()

The is returns adverb specifies the type of the result for this operator. TGE can use this attribute to construct a correctly typed temporary to hold the intermediate results of operations for later use or combination.

- is pir()

The is pir adverb specifies an emit string that can be used during code generation. Parrot assembly is emitted as Parrot Intermediate Representation (PIR), hence the adverb name pir. The argument string is a format string in the style of a C printf format string. %r represents the result of the operation. %0, %1, etc. represent the operands of the operator. For example:

proto 'infix:~&' is equiv('infix:*') is pir(" %r = bands %0, %1") { ... }

- is nullterm

The is nullterm adverb specifies that the operator is a 0-arity function that doesn't take any operands.

- is stop() TODO: needs help, stolen shamelessly from compilers/pge/PGE/OPTable.pir

The is stop adverb declares a string to be matched directly or a sub(rule) to be called to check for a match.
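Returning to the is pir format string described earlier in this list, the substitution of %r and %0, %1, etc. can be sketched in Python. This is an illustration of the placeholder scheme only, not the code generator PGE or TGE actually uses. expand_pir is a hypothetical helper and the register names $S0, $S1 and $S2 are made up for the example.

```python
import re


def expand_pir(template, result, operands):
    """Fill %r with the result name and %0, %1, ... with operand names."""
    def substitute(match):
        key = match.group(1)
        if key == "r":
            return result
        return operands[int(key)]  # IndexError if the operand is missing
    return re.sub(r"%(r|\d+)", substitute, template)


# Hypothetical register names standing in for the result and two operands.
line = expand_pir(" %r = bands %0, %1", "$S2", ["$S0", "$S1"])
print(line)
```

For the template taken from the 'infix:~&' example, this produces the line " $S2 = bands $S0, $S1".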

IMPLEMENTATION

TODO: how it works inside :)

LANGUAGE NOTES

None.

EXAMPLES

languages/perl6/src/grammar_optok.pg
languages/cardinal/src/cardinal_optok.pg

ATTACHMENTS

None.

FOOTNOTES

None.

REFERENCES

http://en.wikipedia.org/wiki/Order_of_operations
http://en.wikipedia.org/wiki/Associativity
http://en.wikipedia.org/wiki/Bottom-up_parsing
http://www.ozonehouse.com/mark/blog/code/PeriodicTable.html
http://dev.perl.org/perl6/doc/design/syn/S03.html

compilers/tge
compilers/past
languages/perl6/src/PAST.pir
languages/perl6/src/POST.pir