NAME
docs/pdds/pdd19_pir.pod - Parrot Intermediate Representation
ABSTRACT
This document outlines the architecture and core syntax of the Parrot Intermediate Representation (PIR).
This document describes PIR, a stable, middle-level language that both compilers and humans can target.
VERSION
$Revision: 26734 $
DESCRIPTION
PIR is a stable, middle-level language intended both as a target for the generated output from high-level language compilers, and for human use in developing core features and extensions for Parrot.
Basic Syntax
A valid PIR program consists of a sequence of statements, directives, comments and empty lines.
Statements
A statement starts with an optional label, contains an instruction, and is terminated by a newline (<NL>). Each statement must be on its own line.
[label:] [instruction] <NL>
An instruction may be either a low-level opcode or a higher-level PIR operation, such as a subroutine call, a method call, a directive, or PIR syntactic sugar.
Directives
A directive provides information for the PIR compiler that is outside the normal flow of executable statements. Directives are all prefixed with a ".", as in .local or .sub.
Comments
Comments start with # and last until the following newline. PIR also allows comments in Pod format. Comments, Pod content, and empty lines are ignored.
Identifiers
Identifiers start with a letter or underscore, and may additionally contain letters, digits, and underscores. Identifiers don't have any limit on length at the moment, but some sane-but-generous length limit may be imposed in the future (256 chars, 1024 chars?). The following examples are all valid identifiers.
a
_a
A42
Opcode names are not reserved words in PIR, and may be used as variable names. For example, you can define a local variable named print. [See RT #24251]
{{ NOTE: The use of :: in identifiers is deprecated. [See RT #48735] }}
Labels
A label declaration consists of a label name followed by a colon. A label name conforms to the standard requirements for identifiers. A label declaration may occur at the start of a statement, or stand alone on a line, but always within a compilation unit.
A reference to a label consists of only the label name, and is generally used as an argument to an instruction or directive.
A PIR label is accessible only in the compilation unit where it's defined. A label name must be unique within a compilation unit, but it can be reused in other compilation units.
goto label1
...
label1:
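As a minimal sketch of how statements, comments and labels fit together inside one compilation unit, the following program counts from 1 to 3. The sub name, the label names and the variable `i` are chosen purely for illustration, and the `:main` flag is described later in this document.

```pir
# Count from 1 to 3, one number per line.
.sub main :main
    .local int i
    i = 1
  loop:                     # a label declaration standing alone on a line
    if i > 3 goto done      # a reference to a label
    print i
    print "\n"
    inc i
    goto loop
  done:
    print "done\n"
.end
```

Both labels are visible only inside `main`; another compilation unit could reuse the names `loop` and `done` without conflict.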
Registers and Variables
There are three ways of referencing Parrot's registers. The first is direct access to a specific register by name In, Sn, Nn, Pn. The second is through a temporary register variable $In, $Sn, $Nn, $Pn. n consists of digit(s) only. There is no limit on the size of n.
The third syntax for accessing registers is through named local variables declared with .local.
.local pmc foo
The type of a named variable can be int, num, string or pmc, corresponding to the types of registers. No other types are used. [See RT#42769]
The difference between direct register access and register variables or local variables is largely a matter of allocation. If you directly reference P99, Parrot will blindly allocate 100 registers for that compilation unit. If you reference $P99 or a named variable foo, on the other hand, Parrot will intelligently allocate a literal register in the background. So, $P99 may be stored in P0, if it is the only register in the compilation unit.
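The following sketch contrasts temporary register variables with a named local variable. It assumes nothing beyond the syntax described above; the variable name `text` is illustrative.

```pir
.sub main :main
    # Temporary register variables: the allocator maps these onto
    # real integer registers behind the scenes.
    $I0 = 6
    $I1 = 7
    $I2 = $I0 * $I1

    # A named local variable of type string.
    .local string text
    text = "product: "

    print text
    print $I2
    print "\n"
.end
```

Neither form names a literal register such as I99, so the register frame for this sub only needs to be as large as the allocator decides.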
Constants
Constants may be used in place of registers or variables. A constant is not allowed on the left side of an assignment, or in any other context where the variable would be modified.
- 'single-quoted string constant'
-
Are delimited by single-quotes ('). They are taken to be ASCII encoded. No escape sequences are processed.
- "double-quoted string constants"
-
Are delimited by double-quotes ("). A " inside a string must be escaped by \. Only 7-bit ASCII is accepted in string constants; to use characters outside that range, specify an encoding in the way below.
- <<"heredoc", <<'heredoc'
-
Heredocs work like single or double quoted strings. All lines up to the terminating delimiter are slurped into the string. The delimiter has to be on its own line, at the beginning of the line and with no trailing whitespace.
Assignment of a heredoc:
$S0 = <<"EOS"
...
EOS
A heredoc as an argument:
function(<<"END_OF_HERE", arg)
...
END_OF_HERE
.return(<<'EOS')
...
EOS
.yield(<<'EOS')
...
EOS
You may have multiple heredocs within a single statement or directive:
function(<<'INPUT', <<'OUTPUT', 'some test')
...
INPUT
...
OUTPUT
- charset:"string constant"
-
Like above with a character set attached to the string. Valid character sets are currently: ascii (the default), binary, unicode (with UTF-8 as the default encoding), and iso-8859-1.
String escape sequences
Inside double-quoted strings the following escape sequences are processed.
\xhh 1..2 hex digits
\ooo 1..3 oct digits
\cX control char X
\x{h..h} 1..8 hex digits
\uhhhh 4 hex digits
\Uhhhhhhhh 8 hex digits
\a, \b, \t, \n, \v, \f, \r, \e, \\
- encoding:charset:"string constant"
-
Like above with an extra encoding attached to the string. For example:
set S0, utf8:unicode:"«"
The encoding and charset are attached to the string and no further processing is done; specifically, escape sequences are not honored.
- numeric constants
-
0x and 0b denote hex and binary constants respectively.
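The constant forms listed above can be tried side by side. This is only a sketch: the string contents are invented, and the charset-tagged string deliberately avoids escape sequences because the text above does not say whether they are processed in that form.

```pir
.sub main :main
    $S0 = 'single: \n stays as typed'          # no escape processing
    $S1 = "double:\tescapes are processed\n"   # \t and \n are processed
    $S2 = iso-8859-1:"charset-tagged string"   # charset attached to the string
    # The heredoc terminator must start at the beginning of its own line.
    $S3 = <<"EOS"
heredoc line one
heredoc line two
EOS
    $I0 = 0xff      # hexadecimal constant
    $I1 = 0b101     # binary constant

    print $S0
    print "\n"
    print $S1
    print $S2
    print "\n"
    print $S3
    print $I0
    print "\n"
    print $I1
    print "\n"
.end
```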
Directives
- .local <type> <identifier> [:unique_reg]
-
Define a local name identifier for this compilation unit with the given type. You can define multiple identifiers of the same type by separating them with commas:
.local int i, j
The optional :unique_reg modifier will force the register allocator to associate the identifier with a unique register for the duration of the compilation unit.
- .lex <string constant>, <reg>
-
Declare a lexical variable that is an alias for a PMC register. For example, given this preamble:
.lex "$a", $P0
$P1 = new 'Integer'
These two opcodes have an identical effect:
$P0 = $P1
store_lex "$a", $P1
And these two opcodes also have an identical effect:
$P1 = $P0
$P1 = find_lex "$a"
- .const <type> <identifier> = <const>
-
{{ PROPOSAL: add .const <string constant> <identifier> = <const> as an alternative to allow ".const 'Sub' ... " }}
Define a constant named identifier of type type and assign value const to it. The constant is stored in the constant table of the current bytecode file.
- .globalconst <type> <identifier> = <const>
-
As .const above, but the defined constant is globally accessible.
- .namespace <identifier> [deprecated: See RT #48737]
-
Open a new scope block. This "namespace" is not the same as the .namespace [ <identifier> ] syntax, which is used for storing subroutines in a particular namespace in the global symbol table. This directive is useful in cases such as (pseudocode):
local x = 1;
print(x);       # prints 1
do              # open a new namespace/scope block
  local x = 2;  # this x hides the previous x
  print(x);     # prints 2
end             # close the current namespace
print(x);       # prints 1 again
Any common language construct that has nested scopes, such as if, for, while or repeat, can use this directive.
{{ NOTE: this variation of .namespace and .endnamespace is deprecated. It was a hackish attempt at implementing scopes in Parrot, but didn't actually turn out to be useful. }}
- .endnamespace <identifier> [deprecated: See RT #48737]
-
Closes the scope block that was opened with .namespace <identifier>.
- .namespace [ <identifier> ; <identifier> ]
-
Defines the namespace from this point onwards. By default the program is not in any namespace. If you specify more than one, separated by semicolons, it creates nested namespaces, by storing the inner namespace object in the outer namespace's global pad.
{{ PROPOSAL: make the brackets non-optional for specifying the "root" namespace, so the key becomes optional.
.namespace [ <key>? ]
key: <identifier> [';' <identifier>]*
Also, the "identifier" should be a quoted string? }}
- .pragma n_operators
-
Convert arithmetic infix operators to n_infix operations. The unary opcodes abs, not, bnot, bnots, and neg are also changed to use an n_ prefix.
.pragma n_operators 1
.sub foo
...
$P0 = $P1 + $P2 # n_add $P0, $P1, $P2
$P2 = abs $P0 # n_abs $P2, $P0
- .loadlib "lib_name"
-
Load the given library at compile time, that is, as soon as that line is parsed. See also the loadlib opcode, which does the same at run time.
A library loaded this way is also available at runtime, as if it had been loaded again in :load, so there is no need to call loadlib at runtime.
- .HLL <hll_name>, <hll_lib>
-
Define the HLL for the current file. Takes two string constants. If the string hll_lib isn't empty, this compile-time pragma also loads the shared lib for the HLL, so that integer type constants work for creating new PMCs.
{{ PROPOSAL: make the ",<hll_lib>" part optional, so you don't have to specify an empty string for the library. (Alternatively, make this two different directives: .HLL_name, .HLL_lib) }}
- .HLL_map <core_type>, <user_type>
-
{{ PROPOSAL: make the ',' an "->", "=>", "=", for instance, so it's easier to remember what argument comes first, the core type or the user type. }}
Whenever Parrot has to create PMCs inside C code on behalf of the running user program it consults the current type mapping for the executing HLL and creates a PMC of type user_type instead of core_type, if such a mapping is defined. core_type and user_type may be any valid string constant.
For example, with this code snippet ...
.loadlib 'dynlexpad'
.HLL "Foo", ""
.HLL_map 'LexPad', 'DynLexPad'
.sub main :main
...
... all subroutines for language Foo would use a dynamic lexpad pmc.
{{ PROPOSAL: stop using integer constants for types RT#45453 }}
- .sub
-
.sub <identifier> [:<flag> ...]
.sub <quoted string> [:<flag> ...]
Define a compilation unit. All code in a PIR source file must be defined in a compilation unit. See the section Subroutine flags for available flags. The optional flags are given as a list, separated by spaces.
The name of the sub may be either a bare identifier or a quoted string constant. Bare identifiers must be valid PIR identifiers (see Identifiers above), but string sub names can contain any characters, including characters from different character sets (see Constants above).
Always paired with .end.
- .end
-
End a compilation unit. Always paired with .sub.
- .line <integer>, <string>
-
Set the line number and filename to the value specified. This is useful in case the PIR code is generated from some source file, and any error messages should print the source file, not the line number and filename of the generated file.
{{ DEPRECATION NOTE: was #line <integer> <string>. See [RT#45857], [RT#43269], and [RT#47141]. }}
Subroutine flags
- :main
-
Define "main" entry point to start execution. If multiple subroutines are marked as :main, the last marked subroutine is entered.
- :load
-
Run this subroutine during the load_bytecode opcode. If multiple subs have the :load pragma, the subs are run in source code order.
- :init
-
Run the subroutine when the program is run directly (that is, not loaded as a module). This is different from :load, which runs a subroutine when a library is being loaded. To get both behaviours, use :init :load.
- :anon
-
Do not install this subroutine in the namespace. Allows the subroutine name to be reused.
- :multi(Type1, Type2...)
-
Engage in multiple dispatch with the listed types. See "pdds/pdd27_multi_dispatch.pod" in docs for more information on the multiple dispatch system.
- :immediate
-
This subroutine is executed immediately after being compiled. (Analogous to BEGIN in perl5.)
- :postcomp
-
Same as :immediate, except that the subroutine is not executed when the compilation of the file that contains the subroutine is triggered by a load_bytecode instruction in another file.
An example. File main.pir contains:
.sub main
load_bytecode "foo.pir"
.end
The file foo.pir contains:
.sub foo :immediate
print "42"
.end
.sub bar :postcomp
print "43"
.end
When executing file foo.pir, it will execute both foo and bar. However, when executing the file main.pir, only foo will be executed.
- :method
-
The marked .sub is a method. In the method body, the object PMC can be referred to with self.
- :vtable
-
The marked .sub overrides a v-table method. By default, a sub with the same name as a v-table method does not override the v-table method. To specify that there should be no namespace entry (that is, it just overrides the v-table method but is not callable as a normal method), use :vtable :anon. To give the v-table method a different name, use :vtable("..."). For example, to have the method ToString also be the v-table method get_string, use :vtable("get_string").
- :outer(subname)
-
The marked .sub is lexically nested within the sub known by subname.
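A few of these flags can be seen working together in the sketch below. The class name `Counter`, the method name `hello` and the sub name `startup` are hypothetical, and the point at which `startup` runs relative to `main` is as described for :init and :load above rather than something this example guarantees.

```pir
.sub startup :init :load
    print "initialising\n"
.end

.sub main :main
    .local pmc cls, obj
    cls = newclass 'Counter'
    obj = new 'Counter'
    obj.'hello'()               # method call; 'self' is obj inside the method
.end

.namespace [ 'Counter' ]

.sub hello :method
    print "hello from a method\n"
.end
```

Placing `hello` in the `Counter` namespace is what associates it with the class; the `:method` flag makes `self` available in its body.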
Directives used for Parrot calling conventions.
{{ A bit of a radical idea, but now would be the time to decide on this: Remove the whole "long-style" invocation syntax altogether. Only allow the short version. As PIR is typically being generated, and hopefully by PCT-based compilers, there seems to be no real use for too much syntactic sugar. Just a thought. }}
- .begin_call and .end_call
-
Directives to start and end a subroutine invocation, respectively.
- .begin_return and .end_return
-
Directives to start and end a statement to return values.
- .begin_yield and .end_yield
-
Directives to start and end a statement to yield values.
- .call
-
Takes either 2 arguments: the sub and the return continuation, or the sub only. For the latter case an invokecc gets emitted. Providing an explicit return continuation is more efficient if it's created outside of a loop and the call is done inside the loop.
- .invocant
-
Directive to specify the object for a method call. Use it in combination with .meth_call.
- .meth_call
-
Directive to do a method call. It calls the specified method on the object that was specified with the .invocant directive.
- .nci_call
-
Directive to make a call through the Native Calling Interface (NCI). The specified subroutine must be loaded using the dlfunc op that takes the library, function name and function signature as arguments. See "pdds/pdd16_native_call" in docs for details.
- .return <var> [:<flag>]*
-
Between .begin_return and .end_return, specify one or more of the return value(s) of the current subroutine. Available flags: :flat, :named.
- .arg <var> [:<flag>]*
-
Between .begin_call and .call, specify an argument to be passed. Available flags: :flat, :named.
- .result <var> [:<flag>]*
-
Between .call and .end_call, specify where one or more return value(s) should be stored. Available flags: :slurpy, :named, :optional, and :opt_flag.
Directives for subroutine parameters
- .param <type> <identifier> [:<flag>]*
-
At the top of a subroutine, declare a local variable, in the manner of .local, into which parameter(s) of the current subroutine should be stored. Available flags: :slurpy, :named, :optional, :opt_flag and :unique_reg.
- .param <type> "<identifier>" => <identifier> [:<flag>]*
-
Define a named parameter. This is syntactic sugar for:
.param <type> <identifier> :named("<identifier>")
Parameter Passing and Getting Flags
See PDD03 for a description of the meaning of the flag bits SLURPY, OPTIONAL, OPT_FLAG, and FLAT, which correspond to the calling convention flags :slurpy, :optional, :opt_flag, and :flat.
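To illustrate the parameter flags, here is a sketch of a sub with one required parameter, one optional parameter with its :opt_flag companion, and a :slurpy parameter that collects any remaining positional arguments. The sub and variable names are invented for the example; PDD03 remains the authority on the exact semantics.

```pir
.sub greet
    .param string name
    .param string greeting     :optional
    .param int    has_greeting :opt_flag   # true if 'greeting' was passed
    .param pmc    extras       :slurpy     # any further positional arguments

    if has_greeting goto ready
    greeting = "Hello"
  ready:
    print greeting
    print ", "
    print name
    print "\n"
.end

.sub main :main
    greet("world")               # falls back to the default greeting
    greet("world", "Goodbye")    # supplies the optional argument
.end
```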
Catching Exceptions
Using the push_eh op you can install an exception handler. If an exception is thrown, Parrot will execute the installed exception handler. In order to retrieve the thrown exception, use the .get_results directive. This directive always takes 2 arguments: an exception object and a message string.
{{ Wouldn't it be more useful to make this flexible, or at least only the exception object? The message can be retrieved from the exception object. }}
push_eh handler
...
handler:
.local pmc exception
.local string message
.get_results (exception, message)
...
This is syntactic sugar for the get_results op, but any flags set on the targets will be handled automatically by the PIR compiler. The .get_results directive must be the first instruction of the exception handler; only declarations (.lex, .local) may come first.
Syntactic Sugar
Any PASM opcode is a valid PIR instruction. In addition, PIR defines some syntactic shortcuts. These are provided for ease of use by humans producing and maintaining PIR code.
- goto <identifier>
-
branch to identifier (label or subroutine name).
Examples:
goto END
- if <var> goto <identifier>
-
If var evaluates as true, jump to the named identifier. Translates to if var, identifier.
- unless <var> goto <identifier>
-
Unless var evaluates as true, jump to the named identifier. Translates to unless var, identifier.
- if null <var> goto <identifier>
-
If var evaluates as null, jump to the named identifier. Translates to if_null var, identifier.
- unless null <var> goto <identifier>
-
Unless var evaluates as null, jump to the named identifier. Translates to unless_null var, identifier.
- if <var1> <relop> <var2> goto <identifier>
-
The relop can be: <, <=, ==, !=, >= or >, which translate to the PASM opcodes lt, le, eq, ne, ge or gt. If var1 relop var2 evaluates as true, jump to the named identifier.
- unless <var1> <relop> <var2> goto <identifier>
-
The relop can be: <, <=, ==, !=, >= or >, which translate to the PASM opcodes lt, le, eq, ne, ge or gt. Unless var1 relop var2 evaluates as true, jump to the named identifier.
- <var1> = <var2>
-
Assign a value. Translates to set var1, var2.
- <var1> = <unary> <var2>
-
The unaries !, - and ~ generate not, neg and bnot ops.
- <var1> = <var2> <binary> <var3>
-
The binaries +, -, *, /, % and ** generate add, sub, mul, div, mod and pow arithmetic ops. Binary . is concat and only valid for string arguments.
<< and >> are arithmetic shifts shl and shr. >>> is the logical shift lsr.
&&, || and ~~ are logic and, or and xor.
&, | and ~ are binary band, bor and bxor.
{{PROPOSAL: Change description to support logic operators (comparisons) as implemented (and working) in imcc.y.}}
- <var1> <op>= <var2>
-
This is equivalent to <var1> = <var1> <op> <var2>, where op is called an assignment operator and can be any of the following binary operators described earlier: +, -, *, /, %, ., &, |, ~, <<, >> or >>>.
- <var> = <var> [ <var> ]
-
This generates either a keyed set operation or substr var, var, var, 1 for string arguments and an integer key.
- <var> = <var> [ <key> ]
-
{{ NOTE: keyed assignment is still valid in PIR, but the .. notation in keys is deprecated [See RT #48561], so this syntactic sugar for slices is also deprecated. See the (currently experimental) slice opcode instead. }}
where key is:
<var1> .. <var2>
returns a slice defined starting at var1 and ending at var2.
.. <var2>
returns a slice starting at the first element, and ending at var2.
<var1> ..
returns a slice starting at var1 to the end of the array.
see src/pmc/slice.pmc and t/pmc/slice.t.
- <var> [ <var> ] = <var>
-
A keyed set operation.
{{ DEPRECATION NOTE: this syntactic sugar will no longer be used for the assign substr op with a length of 1. }}
- <var> = <opcode> <arguments>
-
All opcodes can use this PIR syntactic sugar. The first argument for the opcode is placed before the =, and all remaining arguments go after the opcode name. For example:
new $P0, 'Type'
becomes:
$P0 = new 'Type'
- global "string" = <var>
-
{{ DEPRECATED: op store_global was deprecated }}
- <var> = global "string"
-
{{ DEPRECATED: op find_global was deprecated }}
- ([<var1> [:<flag1> ...], ...]) = <var2>([<arg1> [:<flag2> ...], ...])
-
This is short for:
.begin_call
.arg <arg1> <flag2>
...
.call <var2>
.result <var1> <flag1>
...
.end_call
- <var> = <var>([arg [:<flag> ...], ...])
- <var>([arg [:<flag> ...], ...])
- <var>."_method"([arg [:<flag> ...], ...])
- <var>._method([arg [:<flag> ...], ...])
-
Function or method call. These notations are shorthand for a longer PCC function call. var can denote a global subroutine, a local identifier or a reg.
{{We should review the (currently inconsistent) specification of the method name. Currently it can be a bare word, a quoted string or a string register. See #45859.}}
- .return ([<var> [:<flag> ...], ...])
-
Return from the current compilation unit with zero or more values.
The surrounding parentheses are mandatory. Besides making the sequence break more conspicuous, this is necessary to distinguish this syntax from other uses of the .return directive that will probably be deprecated.
- .return <var>(args)
- .return <var>."somemethod"(args)
- .return <var>.somemethod(args)
-
Tail call: call a function or method and return from the sub with the function or method call return values.
Internally, the call stack doesn't increase because of a tail call, so you can write recursive functions and not have stack overflows.
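Several of the shortcuts above are combined in the following sketch; the comment on each line names the opcode the sugar is described as generating. The variable names and the choice of ResizablePMCArray are illustrative assumptions.

```pir
.sub main :main
    .local pmc list
    list = new 'ResizablePMCArray'   # sugar for: new list, 'ResizablePMCArray'
    list[0] = "zero"                 # keyed set
    list[1] = "one"

    .local int n
    n = 2 * 10                       # mul
    n += 1                           # add, via an assignment operator
    n = n << 1                       # shl

    $S0 = list[1]                    # keyed get
    $S0 .= "!"                       # concat, via an assignment operator

    unless n == 42 goto done         # skips the prints if n is not 42
    print $S0
    print "\n"
  done:
    print "done\n"
.end
```

Here `n` ends up as (2 * 10 + 1) shifted left by one, which is 42, so the conditional jump is not taken.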
Assignment and Morphing
The = syntactic sugar in PIR, when used in the simple case of:
<var1> = <var2>
directly corresponds to the set opcode. So, two low-level arguments (int, num, or string registers, variables, or constants) are a direct C assignment, or a C-level conversion (int cast, float cast, a string copy, or a call to one of the conversion functions like string_to_num).
A PMC source with a low-level destination calls the get_integer, get_number, or get_string vtable function on the PMC. A low-level source with a PMC destination calls the set_integer_native, set_number_native, or set_string_native vtable function on the PMC (assign to value semantics). Two PMC arguments are a direct C assignment (assign to container semantics).
For assign to value semantics for two PMC arguments use assign, which calls the assign_pmc vtable function.
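The difference between the two semantics can be sketched as follows, assuming the Integer PMC behaves as described above. The variable names are illustrative.

```pir
.sub main :main
    .local pmc a, b, c
    a = new 'Integer'
    a = 5            # low-level source, PMC destination: set_integer_native
    b = a            # two PMCs with '=': b refers to the same PMC as a
    c = new 'Integer'
    assign c, a      # assign_pmc: c takes a's value but stays a separate PMC

    a = 7            # changes the PMC shared by a and b; c is unaffected

    $I0 = b          # PMC source, low-level destination: get_integer
    $I1 = c
    print $I0
    print "\n"
    print $I1
    print "\n"
.end
```

If the semantics hold as stated, the first number printed reflects the later change to `a` (container semantics) while the second keeps the value copied by `assign` (value semantics).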
{{ NOTE: response to the question:
<pmichaud> I don't think that 'morph' as a method call is a good idea
<pmichaud> we need something that says "assign to value" versus "assign to container"
<pmichaud> we can't eliminate the existing 'morph' opcode until we have a replacement
}}
Macros
This section describes the macro layer of the PIR language. The macro layer of the PIR compiler handles the following directives:
.include "<filename>"
The .include directive takes a string argument that contains the name of the PIR file that is included. The contents of the included file are inserted as if they were written at the point where the .include directive occurs.
The include file is searched for in the current directory and in runtime/parrot/include, in that order. The first file of that name to be found is included.
{{ Check the search order of the include directive and whether it's complete }}
.macro <identifier> [<parameters>]
The .macro directive starts the definition of a macro named by the specified identifier. The optional parameter list is a comma-separated list of identifiers, enclosed in parentheses. See .endm for ending the macro definition.
.endm
Closes a macro definition.
.macro_const <identifier> (<literal>|<reg>)
.macro_const PI 3.14
The .macro_const directive is a special type of macro; it allows the user to use a symbolic name for a constant value. Like .macro, the substitution occurs at compile time. It takes two arguments (not comma separated), the first is an identifier, the second a constant value or a register.
The macro layer is completely implemented in the lexical analysis phase. The parser does not know anything about what happens in the lexical analysis phase.
When the .include directive is encountered, the specified file is opened and the following tokens that are requested by the parser are read from that file.
A macro expansion is a dot-prefixed identifier. For instance, if a macro was defined as shown below:
.macro foo(bar)
...
.endm
this macro can be expanded by writing .foo(42). The body of the macro will be inserted at the point where the macro expansion is written.
A .macro_const expansion is more or less the same as a .macro expansion, except that a constant expansion cannot take any arguments, and the substitution of a .macro_const contains no newlines, so it can be used within a line of code.
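A small sketch of both kinds of expansion follows. The macro name `print_twice`, its parameter `what` and the constant `ANSWER` are made up for the example; inside the macro body the parameter is referenced with a leading dot.

```pir
.macro_const ANSWER 42

.macro print_twice(what)
    print .what
    print .what
.endm

.sub main :main
    .print_twice("hi\n")    # expands to two print statements
    $I0 = .ANSWER           # constant expansion used within a line of code
    print $I0
    print "\n"
.end
```

Because the substitution happens during lexical analysis, the parser only ever sees the two `print "hi\n"` statements and the literal 42.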
Macro parameter list
The parameter list for a macro is specified in parentheses after the name of the macro. Macro parameters are not typed.
.macro foo(bar, baz, buz)
...
.endm
The number of arguments in the call to a macro must match the number of parameters in the macro's parameter list. Macros do not perform multidispatch, so you can't have two macros with the same name but different parameters. Calling a macro with the wrong number of arguments gives the user an error.
If a macro defines no parameter list, parentheses are optional on both the definition and the call. This means that a macro defined as:
.macro foo
...
.endm
can be expanded by writing either .foo or .foo(). And a macro definition written as:
.macro foo()
...
.endm
can also be expanded by writing either .foo or .foo().
{{ NOTE: this is a change from the current implementation, which requires the definition and call of a zero-parameter macro to match in the use of parentheses. }}
Heredoc arguments
Heredoc arguments are not allowed when expanding a macro. This means that the following is not allowed:
.macro foo(bar)
...
.endm
.foo(<<'EOS')
This is a heredoc string.
EOS
{{ NOTE: This is likely because the parsing of heredocs happens later than the preprocessing of macros. Might be nice if we could parse heredocs at the macro level, but not a high priority. compilers/pirc/new can do this, but there's a bug in the heredoc handling on Win32 XP using MSVS. }}
Using braces, { }, allows you to span multiple lines for an argument. See runtime/parrot/include/hllmacros.pir for examples and possible usage. A simple example is this:
.macro foo(a,b)
.a
.b
.endm
.sub main
.foo({
print "1"
print "2"
}, {
print "3"
print "4"
})
.end
This will expand the macro foo, after which the input to the PIR parser is:
.sub main
print "1"
print "2"
print "3"
print "4"
.end
which will result in the output:
1234
{{ NOTE: braced arguments do not work correctly yet in compilers/pirc/new }}
Unique local labels
Within the macro body, the user can declare a unique label identifier using the value of a macro parameter, like so:
.macro foo(a)
...
.label $a:
...
.endm
Unique local variables
Within the macro body, the user can declare a local variable with a unique name.
.macro foo()
...
.macro_local int b
...
.b = 42
print .b # prints the value of the unique variable (42)
...
.endm
The .macro_local directive declares a local variable with a unique name in the macro. When the macro .foo() is called, the resulting code that is given to the parser will read as follows:
.sub main
.local int local__foo__b__2
...
local__foo__b__2 = 42
print local__foo__b__2
.end
The user can also declare a local variable with a unique name set to the symbolic value of one of the macro parameters.
.macro foo(b)
...
.macro_local int $b
...
.$b = 42
print .$b # prints the value of the unique variable (42)
print .b # prints the value of parameter "b", which is
# also the name of the variable.
...
.endm
So, the special $ character indicates whether the symbol is interpreted as just the value of the parameter, or that the variable by that name is meant. Obviously, the value of b should be a string.
The automatic name munging on .macro_local variables allows for using multiple macros, like so:
.macro foo(a)
.macro_local int $a
.endm
.macro bar(b)
.macro_local int $b
.endm
.sub main
.foo("x")
.bar("x")
.end
This will result in code for the parser as follows:
.sub main
.local int local__foo__x__2
.local int local__bar__x__4
.end
Each expansion is associated with a unique number. For labels declared with .macro_label and locals declared with .macro_local, this means that multiple expansions of a macro will not result in conflicting label or local names.
Ordinary local variables
Defining a non-unique variable can still be done, using the normal syntax:
.macro foo(b)
.local int b
.macro_local int $b
.endm
When invoking the macro foo as follows:
.foo("x")
there will be two variables: b and x. When the macro is invoked twice:
.sub main
.foo("x")
.foo("y")
.end
the resulting code that is given to the parser will read as follows:
.sub main
.local int b
.local int local__foo__x
.local int b
.local int local__foo__y
.end
Obviously, this will result in an error, as the variable b is defined twice. If you intend the macro to create unique variable names, use .macro_local instead of .local to take advantage of the name munging.
EXAMPLES
Subroutine Definition
.sub _sub_label [<subflag>]*
.param int a
.param int b
.param int c
...
.begin_return
.return xy
.end_return
...
.end
Subroutine Call
.const .Sub $P0 = "_sub_label"
$P1 = new 'Continuation'
set_addr $P1, ret_addr
...
.local int x
.local num y
.local string z
.begin_call
.arg x
.arg y
.arg z
.call $P0, $P1 # r = _sub_label(x, y, z)
ret_addr:
.local int r # optional - new result var
.result r
.end_call
NCI Call
loadlib $P0, "libname"
dlfunc $P1, $P0, "funcname", "signature"
...
.begin_call
.arg x
.arg y
.arg z
.nci_call $P1 # r = funcname(x, y, z)
.local int r # optional - new result var
.result r
.end_call
Subroutine Call Syntactic Sugar
... # variable decls
r = _sub_label(x, y, z)
(r1[, r2 ...]) = _sub_label(x, y, z)
_sub_label(x, y, z)
This also works for NCI calls, as the subroutine PMC will be an NCI sub, and on invocation will do the Right Thing. A subroutine object can also be used instead of the label:
find_global $P0, "_sub_label"
$P0(args)
Methods
.namespace [ "Foo" ]
.sub _sub_label :method [,Subpragma, ...]
.param int a
.param int b
.param int c
...
self."_other_meth"()
...
.begin_return
.return xy
.end_return
...
.end
The variable "self" automatically refers to the invoking object, if the subroutine declaration contains ":method".
Calling Methods
The syntax is very similar to subroutine calls. The call is done with .meth_call, which must immediately be preceded by the .invocant:
.local pmc class
.local pmc obj
newclass class, "Foo"
new obj, class
.begin_call
.arg x
.arg y
.arg z
.invocant obj
.meth_call "_method" [, $P1 ] # r = obj."_method"(x, y, z)
.local int r # optional - new result var
.result r
.end_call
The return continuation is optional. The method can be a string constant or a string variable.
Returning and Yielding
.return ( a, b ) # return the values of a and b
.return () # return no value
.return func_call() # tail call function
.return o."meth"() # tail method call
Similarly, one can yield using the .yield directive:
.yield ( a, b ) # yield with the values of a and b
.yield () # yield with no value
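As an illustration of the tail-call form of .return, the sketch below sums the integers from 1 to n using an accumulator. The sub name `sum_to` and its parameters are hypothetical; the recursive call is written as a tail call so that, per the description of tail calls above, the call stack should not grow with n.

```pir
.sub sum_to
    .param int n
    .param int acc
    if n > 0 goto recurse
    .return (acc)               # ordinary return with one value
  recurse:
    acc += n
    dec n
    .return sum_to(n, acc)      # tail call: return whatever sum_to returns
.end

.sub main :main
    $I0 = sum_to(100, 0)        # 1 + 2 + ... + 100
    print $I0
    print "\n"
.end
```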
Stack calling conventions
Arguments are saved in reverse order onto the user stack:
.arg y # save args in reversed order
.arg x
call _foo #(r, s) = _foo(x,y)
.local int r
.local int s
.result r # restore results in order
.result s #
and return values are restored in argument order from there.
.sub _foo # sub foo(int a, int b)
saveall
.param int a # receive arguments from left to right
.param int b
...
.return mi # return (pl, mi), push results
.return pl # in reverse order
restoreall
ret
.end
Pushing arguments in reversed order on the user stack makes the left most argument the top of stack entry. This allows for a variable number of function arguments (and return values), where the left most argument before a variable number of following arguments is the argument count.
ATTACHMENTS
N/A
FOOTNOTES
N/A
REFERENCES
See docs/imcc/macros.pod