TITLE
C Structure Class
STATUS
Proposal.
AUTHOR
Leopold Toetsch
ABSTRACT
The ParrotClass PMC is the default implementation (and the metaclass) of Parrot's HLL classes. It provides attribute access and (TODO) introspection of attribute names. It also handles method dispatch and inheritance.
C structures used throughout Parrot (PMCs) and user-visible C structures provided by the {Un,}ManagedStruct PMC don't have this flexibility.
The proposed CStruct PMC tries to bridge this gap.
DESCRIPTION
The CStruct PMC is the class PMC of classes that are based not on PMC-only attributes but on the general case of a C structure. That is, CStruct is actually the parent class of ParrotClass, which is a PMC-only special case. It is also the theoretical ancestor class of all PMCs (including itself :).
The relationship of CStruct to other PMCs is like this:
            PASM/PIR code     C code
  Class     ParrotClass       CStruct
  Object    ParrotObject      *ManagedStruct
                              (other PMCs)
That is, it is the missing piece of already existing PMCs. The current *ManagedStruct PMCs provide the class and object functionality in one and the same PMC (as, incidentally, all other existing PMCs do). But this completely prevents proper inheritance and reuse of such PMCs.
The CStruct class provides the abstract backing needed to get rid of the current limitations.
SYNTAX BITS
Constructing a CStruct
A typical C structure:
struct foo {
int a;
char b;
};
could be created in PIR with:
cs = subclass 'CStruct', 'foo' # or maybe cs = new_c_class 'foo'
addattribute cs, 'a'
addattribute cs, 'b'
The semantics of a C structure are the same as those of a Parrot class. But we need the types of the attributes too:
Handwavingly TBD 1)
with ad-hoc existing syntax:
.include "datatypes.pasm"
cs['a'] = .DATATYPE_INT
cs['b'] = .DATATYPE_CHAR
Handwavingly TBD 2)
with new variants of the addattribute opcode:
addattribute cs, 'a', .DATATYPE_INT
addattribute cs, 'b', .DATATYPE_CHAR
Probably desired and with not much effort TBD 3):
addattribute(s) cs, <<'DEF'
int a;
char b;
DEF
The possible plural in the opcode name would match the semantics, but it is not necessary. The syntax just uses Parrot's here documents to define all the attributes and their types.
addattribute(s) cs, <<'DEF'
int "a";
char "b";
DEF
The generalization of quoted attribute names would of course be possible too, but isn't likely needed.
Syntax variant
cs = subclass 'CStruct', <<'DEF'
struct foo {
int a;
char b;
};
DEF
I.e. create all in one big step.
Object creation and attribute usage
This is straightforward and conforms to current ParrotObjects:
o = new 'foo' # a ManagedStruct instance
setattribute o, 'a', 4711
setattribute o, 'b', 22
...
The only needed extension would be {get,set}attribute variants with natural types.
Or even (with nice-to-have IMCC syntactic sugar):
o.a = 4711 # setattribute
o.b = 22
$I0 = o.a # getattribute
Nested Structures
foo_cs = subclass 'CStruct', 'foo'
addattribute(s) foo_cs, <<'DEF'
int a;
char b;
DEF
bar_cs = subclass 'CStruct', 'bar'
addattribute(s) bar_cs, <<'DEF'
double x;
foo cfoo; # contained foo structure
foo *fptr; # a pointer to a foo struct
DEF
o = new 'bar'
setattribute o, 'x', 3.14 # C-ish equivalent:
setattribute o, ['cfoo'; 'a'], 4711 # o.cfoo.a = 4711
setattribute o, ['fptr'; 'b'], 255 # o.fptr->b = 255
Attribute access is similar to current *ManagedStruct's hash syntax but with a syntax matching ParrotObjects.
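For comparison, here is a minimal C sketch of what the keyed accesses above correspond to on the C side. The helper name bar_fill and its target parameter are made up for illustration and are not part of the proposal. The one thing the PIR example leaves implicit is that fptr has to refer to a valid foo before it can be dereferenced.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
    int  a;
    char b;
};

struct bar {
    double      x;
    struct foo  cfoo;   /* contained foo structure */
    struct foo *fptr;   /* a pointer to a foo struct */
};

/* Hypothetical helper: performs the same three stores as the PIR example.
 * target is the foo that fptr will refer to; it must outlive o. */
static void bar_fill(struct bar *o, struct foo *target)
{
    assert(o != NULL && target != NULL);
    o->fptr    = target;  /* fptr needs a valid referent before use */
    o->x       = 3.14;    /* setattribute o, 'x', ...           */
    o->cfoo.a  = 4711;    /* setattribute o, ['cfoo'; 'a'], ... */
    o->fptr->b = 22;      /* setattribute o, ['fptr'; 'b'], ...;
                             22 is used so the value fits a plain char */
}
```

A CStruct implementation would presumably resolve each key in the path against the attribute table of the class it is currently in, following the pointer for fptr and staying inside the same memory block for cfoo.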
Array Structure Elements
foo_cs = subclass 'CStruct', 'foo'
addattribute(s) foo_cs, <<'DEF'
int a;
char b[100];
DEF
Access to array elements automatically does bounds checking.
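As a rough sketch of what such a check amounts to in C: the accessor compares the index against the declared element count before touching memory. The function name foo_set_b and the 0/-1 return convention are assumptions for illustration; the proposal does not say how an out-of-range access is reported (in Parrot it would more likely be an exception).

```c
#include <assert.h>
#include <stddef.h>

struct foo {
    int  a;
    char b[100];
};

/* Hypothetical bounds-checked setter.
 * Returns 0 on success, -1 if idx is outside b[]. */
static int foo_set_b(struct foo *f, size_t idx, char value)
{
    assert(f != NULL);
    if (idx >= sizeof f->b / sizeof f->b[0])
        return -1;              /* out of range: leave memory untouched */
    f->b[idx] = value;
    return 0;
}
```

Because the element count comes from the attribute definition (char b[100]), a generic CStruct accessor could do the same check for every array attribute without per-structure code.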
Possible future extensions
cs = subclass 'CStruct', 'todo'
addattribute(s) cs, <<'DEF'
union { # union keyword
int a;
double b;
} u;
char b[100] :ro; # attributes like r/o
DEF
Managed vs. Unmanaged Structs
The term "managed" in current structure usage defines the owner of the structure memory. ManagedStruct means that Parrot is the owner of the memory and that GC will eventually free the structure memory. This is typically used when C structures are created in Parrot and passed into external C code.
UnManagedStruct means that there is some external owner of the structure memory. Such structures are typically the return values of external code.
E.g.:
$P0 = some_c_func() # UnManagedStruct result
assign $P0, foo_cs # assign a structure class to it
o = new 'foo' # ManagedStruct instance
setattribute o, 'a', 100
setattribute o, ['b'; 99], 255 # set last elem
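The ownership distinction can be sketched in C as a wrapper that records whether it owns the memory it points to. This is not how the existing *ManagedStruct PMCs are coded; the type struct_ref and the three functions are hypothetical names, and an explicit destroy call stands in for what GC would do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper around structure memory. */
typedef struct {
    void *data;
    int   owned;  /* nonzero: wrapper frees data (ManagedStruct-like) */
} struct_ref;

/* Managed: the wrapper allocates and later frees the memory. */
static struct_ref managed_new(size_t size)
{
    struct_ref r;
    r.data  = calloc(1, size);
    r.owned = 1;
    return r;
}

/* Unmanaged: the memory belongs to someone else, e.g. a C library. */
static struct_ref unmanaged_wrap(void *ptr)
{
    struct_ref r;
    r.data  = ptr;
    r.owned = 0;
    return r;
}

/* Stand-in for GC finalization: free only what we own. */
static void struct_ref_destroy(struct_ref *r)
{
    assert(r != NULL);
    if (r->owned)
        free(r->data);
    r->data = NULL;
}
```

Attribute access is the same in both cases; only destruction differs, which is why one structure class such as foo_cs can describe both kinds of instance.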
RATIONALE
Parrot, as the planned interpreter glue language, should have access to all possible C libraries and structures. It has to abstract the low-level bindings in an HLL-independent way and should still be able to communicate all information "upstairs" to the HLL users.
But it's not only about HLL usage: Parrot itself already suffers from a lack of abstraction at the PMC level.
Inheritance
I've implemented an OO-ified HTTP server named httpd2.pir. The HTTP::Connection class ought to be a subclass of ParrotIO (we don't have a base socket class, but ParrotIO would do for now). This kind of inheritance isn't possible. In the current implementation, a connection hasa ParrotIO instead of isa ParrotIO. That of course loses all inheritance, which leads to delegation code and workarounds.
The same workarounds are all over the SDL/* classes. There are layout helpers and raw structure accessors and whatnot. Please read the code. It's really not a problem of the implementation (which is totally fine); it's just the lack of usability of Parrot when it comes to native structures (or PMCs).
All these experiments to use a C structure or a PMC as a base class end with a has relationship instead of the natural isa. Any useful OO-ish abstraction is lost, which leads to clumsy code, and - no - implementing interfaces/traits/mixins can't help here, as these are all based on the abstraction described here.
Inheritance and attribute access
This proposal alone doesn't solve all inheritance problems. The memory layout of PMCs and of ParrotObjects deriving from PMCs also needs to be the same. E.g.
cl = subclass 'Integer', 'MyInt'
The int_val attribute of the core Integer type is located in the cache union of the PMC. The integer item in the subclass hangs off the data array of attributes and, worse, it is a PMC too, not a natural int. This not only causes additional indirections (see deleg_pmc.pmc) but also negatively impacts Integer PMCs, as all access to int_val has to be indirected through get_integer() or set_integer_native() to be able to deal with subclassed integers.
Again, the implementation of the above is: MyInt hasa Integer, instead of the desired isa int_val.
With the abstraction of a CStruct describing the Integer PMC and with differently sized PMCs, we can create an object layout where the int_val attribute of Integer and of MyInt is at the same location and of the same type.
Given this (internal) definition of the Integer PMC:
intpmc_cl = subclass 'CStruct', 'Integer'
addattribute(s) intpmc_cl, <<'DEF'
INTVAL int_val; # PMC internals are hidden
DEF
we can transparently subclass it as MyInt, as all the needed information is present in the CStruct intpmc_cl class.
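In plain C terms, the desired isa layout looks roughly like the sketch below: the subclass structure starts with the parent structure, so int_val sits at the same offset in both and code written for the parent can read it directly. The struct names, the extra field, and the INTVAL typedef are stand-ins for illustration, not Parrot's real definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;            /* stand-in for Parrot's INTVAL */

struct integer_pmc {            /* hypothetical layout of Integer */
    INTVAL int_val;
};

struct my_int {                 /* hypothetical layout of MyInt */
    struct integer_pmc base;    /* parent part first: same offsets */
    int extra;                  /* attributes added by the subclass */
};

/* Reads int_val directly, with no vtable indirection. Works for any
 * structure whose first member is a struct integer_pmc. */
static INTVAL integer_get(const struct integer_pmc *p)
{
    assert(p != NULL);
    return p->int_val;
}
```

C guarantees that a pointer to a structure, suitably converted, points to its first member, so a struct my_int can be handed to integer_get() either as &m.base or via a cast. The CStruct description of Integer is what would let Parrot build such a layout for MyInt at runtime.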
Introspection, PMCs and more
cc = subclass 'CStruct', 'Complex'
addattribute(s) cc, <<'DEF'
FLOATVAL re;
FLOATVAL im;
DEF
This is the (hypothetical) description of a Complex PMC class. An equivalent syntax can be translated by the PMC compiler to achieve the same result.
This definition of the attributes of that PMC automagically provides access to all the information stored in the PMC. All such access is currently hand-crafted in complex.pmc. Not only could this accessor code be abandoned (and unified with common syntax), but all classes inheriting from that PMC could use this information.
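To make the idea concrete, here is a small C sketch of attribute access driven by a descriptor table instead of hand-written accessors. It is not taken from complex.pmc; struct complex_pmc, the float_attr table, and complex_attr_ptr are hypothetical names, and FLOATVAL is a stand-in typedef.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef double FLOATVAL;        /* stand-in for Parrot's FLOATVAL */

struct complex_pmc {            /* hypothetical layout of Complex */
    FLOATVAL re;
    FLOATVAL im;
};

/* One entry per attribute: the data a CStruct class would hold. */
typedef struct {
    const char *name;
    size_t      offset;
} float_attr;

static const float_attr complex_attrs[] = {
    { "re", offsetof(struct complex_pmc, re) },
    { "im", offsetof(struct complex_pmc, im) },
};

/* Generic lookup by name; returns NULL for an unknown attribute. */
static FLOATVAL *complex_attr_ptr(struct complex_pmc *c, const char *name)
{
    size_t i;
    assert(c != NULL && name != NULL);
    for (i = 0; i < sizeof complex_attrs / sizeof complex_attrs[0]; i++) {
        if (strcmp(complex_attrs[i].name, name) == 0)
            return (FLOATVAL *)((char *)c + complex_attrs[i].offset);
    }
    return NULL;
}
```

The same table answers introspection questions (which attributes exist, where they live) and serves get/set access, which is the unification the paragraph above argues for.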
Implementation
CStruct is basically yet another PMC and can be implemented and made functional without any interference with existing code. It is also orthogonal to possible PMC layout changes.
The internals of CStruct can largely reuse code from src/objects.c to deal with inheritance or object instantiation. The main difference is that attributes additionally have a type attached to them and, consequently, that the attribute offsets are calculated differently depending on type, alignment, and padding. These calculations are already done in unmanagedstruct.pmc.
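The offset calculation itself is the usual C layout rule: place each attribute at the next offset that satisfies its alignment, then pad the total size to the largest alignment. The sketch below is not the code from unmanagedstruct.pmc; attr_desc, align_up, and layout_attrs are hypothetical names, and sizes and alignments are passed in explicitly rather than derived from datatype constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one typed attribute. */
typedef struct {
    const char *name;
    size_t      size;    /* size of the type in bytes */
    size_t      align;   /* required alignment, a power of two */
    size_t      offset;  /* filled in by layout_attrs() */
} attr_desc;

/* Round n up to the next multiple of align. */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Assign offsets in declaration order and return the total
 * structure size, including tail padding. */
static size_t layout_attrs(attr_desc *attrs, size_t count)
{
    size_t offset = 0, max_align = 1, i;
    for (i = 0; i < count; i++) {
        offset = align_up(offset, attrs[i].align);  /* insert padding */
        attrs[i].offset = offset;
        offset += attrs[i].size;
        if (attrs[i].align > max_align)
            max_align = attrs[i].align;
    }
    return align_up(offset, max_align);
}
```

For struct foo from the beginning of this document, assuming a 4-byte, 4-aligned int, this places a at offset 0 and b at offset 4 and gives a total size of 8, which matches what a typical C compiler produces.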
CStruct classes can be attached to existing PMCs gradually (and by far not all PMCs need that abstract backing). But think e.g. of the Sub PMC. Attaching a CStruct to it would instantly give access to all its attributes and vastly simplify introspection.
Only the final step ("Inheritance and attribute access") needs all parts to play together.
All together now
- Differently sized PMCs
  Provide the flexible PMC layout.
- CStruct classes
  Describe the structure of PMCs (or any C structure).
- R/O vtables
  Prohibit modification of read-only PMCs like the Sub PMC. These are already coded within the STM project.
SEE ALSO
pddXX_pmc.pod (proposal for a flexible PMC layout)