Parrot PMC/Object Design Meeting Notes

During the Chicago Hackathon 2006, Jerry Gay, Matt Diephouse, chromatic, Leo Toetch, Jonathan Worthington and Chip Salzenberg (arriving late) met to discuss problems and proposed solutions for Parrot PMCs and objects. Previous to the meeting, wiki pages were set up to gather a list of problems, they can be found at http://rakudo.org/hackathon-chicago/index.cgi?object and http://rakudo.org/hackathon-chicago/index.cgi?pmc. A log of the discussion that took place can be found at http://www.parrotcode.org/misc/parrotsketch-logs/irclog.parrotsketch-200611/irclog.parrotsketch.20061112.

This summary has been compiled from these sources, and from conversations not recorded.

List of problems and proposed solutions for PMCs:

Not all PMCs properly serialize properties by calling the default implementation

This is one of many troubles with properties. Also, properties were invented at a time when it looked like Perl 6 needed them, however this suspected use case has so far proven incorrect. A general property system means every PMC might get properties at runtime, so all PMCs share the burden all the time of an uncommonly-used "feature". Furthermore, a property as currently implemented is one full-blown hash per object, typically to store a single value. The same effect can easily be achieved by using one AddrRegistry Hash to store values in it. Finally, if some class system needs properties, it can implement it itself.

Recommendation: Deprecate property support in PMCs. We may put them back in later, but if we do, it will be for a good reason.

Allison: What is the proposed alternate strategy for altering the attribute structure of an object at runtime? I'm still not in favor of removing properties. The distinction isn't a high-level one, it's a low-level implementation distinction. Would it help if we call them "static attributes" and "dynamic attributes"?

Attributes use external data slot (pmc_ext)

When attributes are used on a PMC, they are stored in the single external data segment associated with a PMC. This prevents the use of attributes and storage of non-attribute data in the PMC external data segment, limiting their usefulness to all but the most simple PMC types.

Recommendation: Implement differently-sized PMCs. The proposed pddXX_pmc.pod has been reviewed, and the implementation of a prototype solution has been requested. Work on this prototype is targeted to begin in trunk after the 0.4.7 release.

Allison: Agreed.

DYNSELF is the common case, but the common case should be SELF

The DYNSELF macro is closer to the true OO concept of 'self'; if anything is worthy of the name SELF, it is DYNSELF. DYNSELF does dispatching, while SELF statically calls the method in the current class.

Recommendation: Throughout the source, rename SELF to STATIC_SELF, and rename DYNSELF to SELF. Additionally, direct access to VTABLE methods should be reviewed, and SELF should be used where possible to increase clarity and maintainability (this is a good CAGE task.) Also, this should become a coding standard for PMCs.

Allison: OK on the rename of DYNSELF to SELF, but we need a better name than STATIC_SELF. Where is the non-dispatching version likely to be used and why?

Att: SELF is just a synonym for pmc - the object - above is talking about SELF.method() and alikes. See lib/Parrot/Pmc2c.pm.

composition is almost non-existent (roles would be nice)

The current class composition uses interfaces, which seems inadequate. Roles may make better composition units, but may not make supporting Perl 5 OO easier. Leo, chromatic, Matt, and Jonathan traded ideas on the subject. It seems more investigation and discussion is in order.

Recommendation: Leo, chromatic, Matt, and Jonathan should exchange ideas on the subject in a meeting within the next two weeks. Hopefully this exchange of ideas will lead to clarity on a proposed solution to a class composition model.

Allison: Definitely in favor of a composition model.

Someone had mentioned the idea of splitting the PMC vtable functions into groups of related functions that you could choose to implement or not implement. All keyed vtable functions would go in one group, for example.

Leo pointed to a p2 thread on the topic: http://groups.google.at/group/perl.perl6.internals/browse_thread/thread/9ea5d132cb6a328b/4b83e54a13116d7c?lnk=st&q=interface+vtable+Toetsch&rnum=2#4b83e54a13116d7c

Others thought this idea sounded intriguing, and that it was worth pursuing.

Recommendation: Elaboration on this idea is required in order to focus thought on design considerations. No timeframe has been set.

Allison: Yes, needs a more complete proposal.

List of problems and proposed solutions for objects:

Class namespace is global

At present it is not possible for a HLL to define a classname that has already been used by another HLL. (Example: PGE/Perl 6 cannot define a class named 'Closure' because Parrot already has one.) Chip and chromatic discussed this before the meeting, and Chip has a solution 'almost ready.'

Discussion moved into class/namespace relationships, and metaclasses as it relates to method dispatch order, and the potential to incorporate roles instead of metaclasses, which brought the tangent back to previous discussion on class composition, before the discussion was tabled.)

Recommendation: Chip, based on prior thought, as well as this discussion, will detail his plan. We know it includes a registry of class names for each HLL, but more will be revealed soon.

Allison: das ist gut.

PMC vtable entries don't work in subclasses (or subsubclasses)

Currently things are handled by deleg_pmc.pmc and default.pmc. Unfortunately, it appears that calls to vtable entries in default.pmc take place before delegation to a superclass. Patrick proposes that this be switched somehow; e.g, inheritance should be preferred over default vtable entries. See http://rt.perl.org/rt3//Ticket/Display.html?id=39329.

Recommendation: Inherit from default.pmc unless you have a parent. Always check inheritance when dispatching, if necessary. Chip will look specifically at Capture PMCs and attempt to provide a solution without architectural changes, enabling Patrick's work to continue.

Allison: das ist gut. Also looks like a good point for potential architectural changes in the updated objects PDD.

PMC methods aren't inherited by ParrotObject pmcs

A METHOD defined in a PMC won't work unless it's been specifically coded to work in a PIR-based subclass. See http://xrl.us/s7ns.

Recommendation: All direct data access should go through accessors, except for within the accessors themselves. This involves a code review of all core and custom PMCs, which has been recommended above as well. This may require further discussion, as Patrick did not seem confident that this statement was sufficient to resolve the situation.

Patrick's after-meeting response: The unresolved issue I'm seeing is that every PMC has to be aware of the possibility that SELF is a ParrotObject instead of the PMC type. For example, src/pmc/capture.pmc currently has its methods as

METHOD PMC* get_array() {
    PMC* capt = SELF;
    /* XXX:  This workaround is for when we get here as
             part of a subclass of Capture */
    if (PObj_is_object_TEST(SELF))
        capt = get_attrib_num(PMC_data(SELF), 0);
    CAPTURE_array_CREATE(INTERP, capt);
    return CAPTURE_array(capt);
}

It's the "if (PObj_is_object_TEST(SELF))" clause that bugs me -- does every PMC METHOD have to have something like this in order for subclassing to work properly? That feels wrong. And I don't see how the solution of "all direct data access goes through accessors" applies to this particular situation... although I could just be reading it wrong. --Pm

Allison: ParrotObjects and PMCs were intended to be completely substitutable. Clearly they aren't yet, but the solution to the problem is not to add more and more checks for whether a particular PMC is an Object, but to give ParrotObjects and low-level PMCs the ability to be subclassed through a single standard interface (even if each has completely different code behind the interface making it work).

getattribute/setattribute efficiency

Dan used to often remark that sequences like the following were "very slow":

$P0 = getattribute obj, "foo"
$P1 = getattribute obj, "bar"
$P2 = getattribute obj, "baz"

Instead, Dan suggested always using classoffset to obtain attributes:

$I0 = classoffset obj, "FooClass"
$P0 = getattribute obj, $I0 # which attr is this?
inc $I0
$P0 = getattribute obj, $I0 # and that?
inc $I0
$P0 = getattribute obj, $I0

Unfortunately, this doesn't seem to be very practical in many respects. Can we at least get a declaration from the designers about the appropriate style to use for object attributes? I much prefer the former to the latter, and if it's a question of efficiency we should make the former more efficient somehow.

The latter takes 2 opcodes, which is per se suboptimal, The former is much more readable, just one opcode and optimizable and never subject of any caching effects of the offset (e.g. what happens, if the getattribute has some nasty side-effects like adding new attributes?)

Oh well, and classoffset does of course not work for Hash-like or other objects.

Recommendation: Best practice is to use named attribute access. Optimizations, if necessary, will be addressed closer to 1.0. Issues with classoffset and hash-like objects will be addressed in the future as necessary.

Allison: Agreed.

AUTHOR

Jerry Gay mailto:jerry.gay@gmail.com