NAME

IMCC - parsing

VERSION

0.1 initial
0.2 lexicals

OVERVIEW

This document describes the basic parsing functionality of imcc.

DESCRIPTION

Imcc parses and generates code in terms of compilation units. These are self contained blocks of code very similar to subroutines.

Code for a compilation unit is generated when the end of the unit is reached, and not earlier.

General imcc syntax

program: statements ...

where a statement is either a simple statement, such as if ..., or a compilation unit that itself contains statements. This allows, for example, nested subs.

Compilation units

Subroutines .sub ... .end

.sub _name
	statements
	...
.end

defines a subroutine with the entry point _name. Subroutine entry points, like all global labels, have to start with an underscore. The body may contain any valid PIR or PASM statements.

Assembly blocks .emit ... .eom

.emit
_sub1:
	pasm_statements
	...
	ret
...
.eom

defines a compilation unit containing PASM statements only. It is typically used for language initialization code and builtins.

Code outside compilation units

stmt1
.sub _main
   stmt2
   ret
.end
stmt3

This generates the following PASM equivalent:

_main:
	stmt2
	ret

	stmt1
	stmt3

which leaves stmt1 and stmt3 as unreachable code after the ret. To make code outside compilation units usable, its first statement should have a global label.

_outside:
    stmt1
.sub _main
    stmt2
    call _outside
    ret
.end
    stmt3
    ret

This generates the following PASM equivalent:

_main:
	stmt2
	bsr _outside
	ret
_outside:
	stmt1
	stmt3
	ret

Nested subs

As code is produced as soon as a compilation unit is closed, the code for nested subroutines appears before the outer subroutine:

.sub _outer
    stmt1
    .sub _inner
	stmt2
	ret
    .end
    call _inner
    ret
.end

generates code like this:

_inner:
    stmt2
    ret
_outer:
    stmt1
    bsr _inner
    ret
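
The emission order described above can be modelled with a small sketch. This is not imcc's implementation. It is a toy model in Python, assuming that each open unit buffers its lines, that a unit is written out when its .end is reached, and that code outside any unit is buffered in an outermost unit that is written last. The function name emit and the handling of call as bsr are illustrative only.

```python
def emit(source_lines):
    """Toy model: return PASM-like lines in the order units are closed."""
    output = []
    # Stack of open units. The bottom entry is a hypothetical outermost
    # unit that collects code written outside any .sub ... .end.
    stack = [[]]
    for raw in source_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(".sub "):
            # Open a new unit; its entry point becomes a label.
            stack.append([line.split()[1] + ":"])
        elif line == ".end":
            # Closing a unit writes its buffered code immediately.
            output.extend(stack.pop())
        elif line.endswith(":"):
            stack[-1].append(line)
        elif line.startswith("call "):
            stack[-1].append("    bsr " + line.split()[1])
        else:
            stack[-1].append("    " + line)
    # Code outside compilation units is written after everything else.
    output.extend(stack.pop())
    return output


nested = [".sub _outer", "stmt1", ".sub _inner", "stmt2", "ret", ".end",
          "call _inner", "ret", ".end"]
print("\n".join(emit(nested)))
```

Because _inner is closed first, its code is written before that of _outer, which matches the listing above. Feeding the _outside example from the previous section through the same function places _main first and the outside statements last.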

Symbols, constants and labels

Each compilation unit maintains its own symbol table, hash, containing local labels and variable symbols. This table is not visible to code in other units.

Lexicals and named constants declared in an outer scope are visible and are used unless they are overridden by a .local or .const directive with the same name. See t/syn/scope.t for examples.

Global labels and constants are kept in the global symbol table ghash, which is the symbol table of the outermost compilation unit.

This allows for global constant folding beyond subroutine scope.
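
The lookup rules above can be illustrated with a minimal sketch. It is a Python model using collections.ChainMap, not imcc's actual data structures. Only the name ghash comes from this document; the tables outer, inner and sibling and the symbols stored in them are hypothetical.

```python
from collections import ChainMap

# Global table of the outermost unit (name taken from the text above).
ghash = {"PI": 3.14159}

# An outer unit has its own table and falls back to ghash.
outer = ChainMap({"x": "outer x", "n": "outer n"}, ghash)

# A nested unit sees the outer scope, but a local declaration of x
# overrides the outer one.
inner = outer.new_child({"x": "inner x"})

# An unrelated unit has its own table and sees only globals.
sibling = ChainMap({"y": "sibling y"}, ghash)

print(inner["x"])        # local override wins
print(inner["n"])        # found in the enclosing unit
print(inner["PI"])       # found in ghash
print("x" in sibling)    # symbols of other units are not visible
```

Lookups walk from the innermost table outwards and end at ghash, which is why global constants are available in every unit.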

Local labels with the same name are allowed in different compilation units, but running the generated PASM through assemble.pl does not work. Running this code inside imcc is fine. This will probably change so that local labels are mangled to be unique.
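
One possible mangling scheme is sketched below. It is purely hypothetical and does not describe what imcc does or will do. It assumes that local labels are the labels not starting with an underscore, and it appends a per-unit index to each of them. The function name mangle_unit is invented for this example, and the token matching is deliberately naive.

```python
def mangle_unit(index, lines):
    """Append a unit index to every local label in one unit's code."""
    # Local labels: label lines whose name has no leading underscore.
    local = {l[:-1] for l in lines if l.endswith(":") and not l.startswith("_")}

    def rename(token):
        return f"{token}_{index}" if token in local else token

    mangled = []
    for line in lines:
        if line.endswith(":"):
            mangled.append(rename(line[:-1]) + ":")
        else:
            # Naive: only whitespace-separated tokens are matched.
            mangled.append(" ".join(rename(t) for t in line.split()))
    return mangled


unit_a = ["_a:", "loop:", "dec I0", "if I0, loop", "ret"]
unit_b = ["_b:", "loop:", "dec I1", "if I1, loop", "ret"]
print(mangle_unit(0, unit_a) + mangle_unit(1, unit_b))
```

With this scheme the two loop labels become loop_0 and loop_1, so the combined output no longer contains duplicate label names, while global labels such as _a and _b are left untouched.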

FILES

imcc.y, instructions.c, t/syn/sub.t, t/imcpasm/sub.t, t/syn/scope.t

AUTHOR

Leopold Toetsch <lt@toetsch.at>