NAME

FrameMaker::MifTree - A MIF Parser

VERSION

This document describes version 0.06, released 24 March 2004.

SYNOPSIS

use FrameMaker::MifTree;
my $mif = FrameMaker::MifTree->new;
$mif->parse_miffile('filename.mif');
my @strings = $mif->daughters_by_name("String", 1);   # 1 = search the whole tree
print $strings[0]->string;
$strings[3]->string("Just another new string.");
$mif->dump_miffile('newmif.mif');

DESCRIPTION

The FrameMaker::MifTree class is implemented as a Tree::DAG_Node subclass, and thus inherits all the methods of that class. Two methods are overridden. Please read Tree::DAG_Node to see what other methods are available.

MIF (Maker Interchange Format) is an Adobe FrameMaker file format in ASCII, consisting of statements that create an easily parsed, readable text file of all the text, graphics, formatting, and layout constructs that FrameMaker understands. Because MIF is an alternative representation of a FrameMaker document, it allows FrameMaker and other applications to exchange information while preserving graphics, document content, and format.

This document does not tell you what the syntax of a MIF file is, nor does it document the meaning of the MIF statements. For this, please read (and re-read) the MIF_Reference.pdf, provided by Adobe.

MifTree not only knows the MIF syntax, but it also has some understanding of the allowed structures (within their contexts) and attribute types. The file FrameMaker/MifTree/MifTreeTags holds all the valid MIF statements and the attribute type for every statement. This file may need some improvement, as it is created by analyzing a large collection of MIF files written by FrameMaker (and an automatic analysis of the MIF Reference, which showed several typos and inconsistencies in that manual). The current file is for MIF version 7.00.

Dependencies

This class implementation depends on the following modules, all available from CPAN:

  • Tree::DAG_Node

  • IO::Tokenized, IO::Tokenized::File, and the custom-made IO::Tokenized::Scalar

  • IO::Stringy (only IO::Scalar is needed)

Overridden Methods

add_daughters(LIST)

Adds a list of daughter objects to a node. Unlike the DAG_Node method, it checks for a valid MIF construct. Only the mother/daughter relationship is checked.

attributes(VALUE)

Unlike the DAG_Node equivalent, the attributes method of the FrameMaker::MifTree class does not require a reference as an attribute. In addition, the method checks that it is called on a leaf, since the MIF structure does not allow attributes on non-ending nodes. The method reads or sets the raw attribute; no string conversion, path encoding/decoding, or value extraction is done. To obtain or set one of those values, use the specific "Attribute Methods" described below.

Quick Creators

The following methods can be used instead of the DAG_Node standard methods to build your MIF structure. It's just a lazy way of adding daughters, but it improves readability of your code if you create something like:

my $mif = FrameMaker::MifTree->new->add_node(
  AFrames => FrameMaker::MifTree->add_node(
    Frame => FrameMaker::MifTree->add_node(
      ImportObject => FrameMaker::MifTree->add_leaf(
        ImportObFileDI => encode_path('c:\bar\foo.eps'))
    ),
    FrameMaker::MifTree->add_node(
      ImportObject => FrameMaker::MifTree->add_leaf(
        ImportObFileDI => encode_path('../../foo/boo.eps'))
    )
  )
);

add_leaf(MIFSTATEMENT, ATTRIBUTE or GRANDDAUGHTERLIST)

Adds a new daughter to the object. The first argument specifies the name; all following arguments are taken either as the attribute for the leaf, or as a list of granddaughter objects to add to the newly created daughter. (In MifTree world, newly born daughters mature in split seconds.)

add_node(MIFSTATEMENT, ATTRIBUTE or GRANDDAUGHTERLIST)

An exact synonym for the add_leaf method.

add_facet()

Adds a facet to the object. In DAG_Node tree terms, this is implemented as a leaf with the name "_facet" and a filehandle to a temp file as its attribute.

Search in Tree

$OBJ->daughters_by_name(NAMESTRING, RECURSE)

Finds all daughters named NAMESTRING, either by walking the whole tree (RECURSE is true) or by looking only at the mother's own daughters (RECURSE is false or omitted). Omitting RECURSE triggers a warning that the search will not recurse; I've spent too much time debugging code where I forgot to add the RECURSE parameter. Returns the first object in scalar context, or a list of all found objects in list context.

Maybe one day I'll add magic to this function so you get the next item if you call the method on the same object without arguments.

Note that "daughter_by_name" is an exact alias for this method.
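
The RECURSE flag and the scalar/list return convention can be illustrated without the module. The following is only a sketch on plain hash references; the node() helper and daughters_by_name_sketch() are hypothetical names made up for this example and are not part of FrameMaker::MifTree.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tree node: a hash with a name and daughters.
sub node {
    my ($name, @daughters) = @_;
    return { name => $name, daughters => \@daughters };
}

sub daughters_by_name_sketch {
    my ($node, $name, $recurse) = @_;
    my @found;
    for my $daughter (@{ $node->{daughters} }) {
        push @found, $daughter if $daughter->{name} eq $name;
        # The recursive call is in list context, so it returns all matches.
        push @found, daughters_by_name_sketch($daughter, $name, 1) if $recurse;
    }
    # List context: every match. Scalar context: only the first one.
    return wantarray ? @found : $found[0];
}

my $flow = node('TextFlow',
    node('Para', node('ParaLine', node('String'))),
    node('Para', node('ParaLine', node('String'))),
);

my @direct = daughters_by_name_sketch($flow, 'String', 0);
my @all    = daughters_by_name_sketch($flow, 'String', 1);
my $first  = daughters_by_name_sketch($flow, 'String', 1);
printf "%d direct, %d recursive\n", scalar @direct, scalar @all;   # 0 direct, 2 recursive
```

The non-recursive call finds nothing because the String statements sit two levels below the Para daughters, which is exactly the mistake the warning is meant to catch.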

$OBJ->daughters_by_name_and_attr(NAMESTRING, ATTRIBUTE, RECURSE)

Find all daughters that listen to the name NAMESTRING and have the raw attribute ATTRIBUTE, either walking the tree (RECURSE is true), or only on the mother's daughters (RECURSE false or omitted). Returns the first object in scalar context, or a list of all found objects in list context. ATTRIBUTE must be raw data, so use quote, unquote, encode_path and decode_path as appropriate.

If you specify an empty string or undef as the NAMESTRING, this method will just look for ATTRIBUTE.

Note that "daughter_by_name_and_attr" is an exact alias for this method.

$OBJ->find_string(QUOTED_REGEX)

Returns a list of all strings that match QUOTED_REGEX under $OBJ. When called in scalar context, only the first match is returned.

Attribute Methods

$OBJ->string(STRING)

Reads or sets the object's attribute as a MIF string. The method just calls quote and unquote as appropriate.

$OBJ->pathname(PATHSTRING)

Returns the object's attribute as a local pathname, or sets it to the device independent pathname. The method just calls encode_path and decode_path as appropriate. PATHSTRING must also be a local pathname.

$OBJ->abs_pathname(FROMROOT)

Returns the object's attribute as a local pathname. The method just calls decode_path, passing on the FROMROOT argument. Use this method if you want to make sure that you always receive absolute pathnames, independently from what is stored in the attribute.

$OBJ->boolean(BOOLEAN)

Returns or sets the object's TRUE or FALSE value.

$OBJ->measurements(LIST)

Returns or sets a list of measurements. When called in scalar context, only the first measurement is returned. All values are in the default unit of measurement, which can be set using FrameMaker::MifTree->default_unit. If the default unit is the empty string (which is also the initial setting), values are in points. You always get the values without the unit specifier, so you can use them directly in calculations. To get a value from the list, do something like:

my $q;
$q = FrameMaker::MifTree->new->add_leaf(
  PgfCellMargins => "0.0 pt 1.0 pt 2.0 pt 3.0 pt"
);
my $k = ($q->measurements)[1];
print "k is now: $k\n";           # prints "k is now: 1"

In MIF, a maximum of four values can be supplied, but this is never checked by this method.

$OBJ->percentage(FRACTION)

Returns or sets the object's percentage value as a fraction (1 = 100%).

$OBJ->facet_data()

Returns the object's facet data as a list of lines. (Use a syswrite to facet_handle to set the object's data. Not a very elegant implementation, but I consider a facet to be rather esoteric, and we have to be efficient on memory usage as well...)

$OBJ->facet_handle()

Returns the filehandle to the object's facet data. Because the temporary file is opened with sysopen, use syswrite rather than print, so that buffered and unbuffered I/O are not mixed on the same handle.

FrameMaker::MifTree->default_unit(UNIT)

This class method returns or sets the global default units of measurement. See convert for a list of valid assignments.

FrameMaker::MifTree's default units of measurement can (and probably will) differ from the default <Units> that are specified in the MIF file.

The default for default_unit is an empty string, which means that no unit specifier will be output, and all values are in "points".

Tests on Tree Object

$OBJ->is_node()

Tests whether the object is a valid MIF node statement, that is, whether its name occurs in the %mifnodes hash. Returns a list of valid daughters when a match is found. (In my terminology, nodes can have daughters, whereas leaves can't.)

$OBJ->is_leaf()

Tests if the object is a valid MIF leaf statement and thus can have an attribute value. The name is just looked up in the %mifleaves hash.

$OBJ->allows_daughter(DAUGHTEROBJECT)

Checks if a mother object can have a specific daughter object. I just thought this could come in handy when you want to bind one object tree to another.
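
As a rough illustration of how such lookups can work, here is a minimal sketch with two tiny tables. The table contents, the attribute types assigned to each leaf, and all function names ending in _sketch are assumptions made for this example; the real %mifnodes and %mifleaves hashes are built from the MifTreeTags file and are far larger.

```perl
use strict;
use warnings;

# Hypothetical excerpt of a node table: statement name => allowed daughters.
my %mifnodes_sketch = (
    Para     => [qw(PgfTag ParaLine)],
    ParaLine => [qw(String Char)],
);

# Hypothetical excerpt of a leaf table: statement name => attribute type.
my %mifleaves_sketch = (
    PgfTag => 'tagstring',
    String => 'string',
    Char   => 'keyword',
);

# Returns the list of valid daughters, or an empty list for a non-node.
sub is_node_sketch {
    my ($name) = @_;
    return @{ $mifnodes_sketch{$name} || [] };
}

sub is_leaf_sketch {
    my ($name) = @_;
    return exists $mifleaves_sketch{$name} ? 1 : 0;
}

# True if $daughter appears in the list of daughters valid for $mother.
sub allows_daughter_sketch {
    my ($mother, $daughter) = @_;
    return scalar grep { $_ eq $daughter } is_node_sketch($mother);
}

print "ParaLine may hold a String\n" if allows_daughter_sketch('ParaLine', 'String');
print "Para may not hold a String\n" unless allows_daughter_sketch('Para', 'String');
```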

$OBJ->check_attribute

Checks if the attribute conforms to the type. Currently the following types are defined:

0xnnn
ID
L_T_R_B
L_T_W_H
W_H
W_W
X_Y
X_Y_W_H
boolean
data
degrees
dimension
empty *)
integer
keyword
number
pathname
percentage
seconds_microseconds
string
tagstring
*) no attribute allowed; some leaves and all nodes have this

The function returns TRUE if the attribute seems valid, and FALSE if there is an error. Use get_attribute_error to see the error.

$OBJ->get_attribute_error

Returns a meaningful text string if the attribute appears to be invalid.

$OBJ->validate(FROMROOT)

Not yet implemented.

Validates a MIF tree object. If you set FROMROOT to true, the validation starts from $OBJ->root, and special checking is done on the root object. This special behaviour is needed because the method cannot know if a FrameMaker::MifTree object is to represent a complete MIF file, and not just a fragment. So please remember always to set FROMROOT if you want to validate a complete MIF tree, even if $OBJ already points to the root object.

From/to MIF Syntax

LIST = $obj->dump_mif()

Dumps out the current tree as a list of MIF statements in valid MIF file syntax. You can write the resulting list to a file. The method tries to mimic the Adobe MIF parser behaviour as closely as possible. Please note that this method can be memory intensive, since it creates a whole new copy of your MIF tree in memory. If you just want to write the MIF tree to a file, you may want to use dump_miffile instead.

LIST = $obj->dump_miffile(FILENAME)

Writes the current tree to FILENAME in valid MIF file syntax. The method returns a FALSE result if the file cannot be written.
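
To show what turning a tree into a list of MIF statements involves, here is a minimal sketch on plain nested hashes. It is not the module's code: dump_mif_sketch() is a hypothetical name, the one-space indentation is an arbitrary choice, and the attribute is assumed to be stored in raw (already quoted) form.

```perl
use strict;
use warnings;

sub dump_mif_sketch {
    my ($node, $depth) = @_;
    $depth ||= 0;
    my $indent = ' ' x $depth;

    # A leaf: name and raw attribute on one line.
    return "$indent<$node->{name} $node->{attr}>"
        unless @{ $node->{daughters} || [] };

    # A node: opening line, indented daughters, closing line.
    return (
        "$indent<$node->{name}",
        (map { dump_mif_sketch($_, $depth + 1) } @{ $node->{daughters} }),
        "$indent> # end of $node->{name}",
    );
}

my $para = {
    name      => 'Para',
    daughters => [ {
        name      => 'ParaLine',
        daughters => [ { name => 'String', attr => q{`Hello World'} } ],
    } ],
};

print "$_\n" for dump_mif_sketch($para);
```

Returning a list of lines keeps the whole result in memory at once, which is the cost that writing directly to a file avoids.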

$OBJ->parse_mif(STRING)

Parses a string of MIF statements into the object. This is also a very quick way to set up an object tree:

my $new_obj = FrameMaker::MifTree->new();
$new_obj->parse_mif(<<'ENDMIF');
<MIFFile 7.00>              # The only required statement
<Para                       # Begin a paragraph
  <ParaLine                 # Begin a line within the paragraph
    <String `Hello World'>  # The actual text of this document
  >                         # End of ParaLine statement
>                           # End of Para statement
ENDMIF

Implemented by tying the scalar to a filehandle and calling IO::Tokenized on the resulting handle.

The parser currently has the following limitations:

  • All comments are lost.

  • Macro statements are not (yet) implemented.

  • Include statements are not (yet) implemented.

Maybe I'll do something about it. Someday.
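
For readers curious about what such a parse involves, the following is a minimal sketch of a tokenizing parser for a small MIF subset, written with a plain regular expression instead of IO::Tokenized. It is an illustration under simplifying assumptions, not the module's implementation: parse_mif_sketch() and the hash layout are hypothetical, and facets, macros, and include statements are ignored. Like the real parser, it drops comments.

```perl
use strict;
use warnings;

# Parse a small MIF subset into nested hashes: { name, attr, daughters }.
sub parse_mif_sketch {
    my ($text) = @_;
    my $root  = { name => 'ROOT', attr => '', daughters => [] };
    my @stack = ($root);

    while ($text =~ /\G\s*(?: (\#[^\n]*)            # 1: comment, dropped
                            | <(\w+)                # 2: statement opens
                            | (>)                   # 3: statement closes
                            | (`(?:[^'\\]|\\.)*')   # 4: quoted MIF string
                            | ([^\s<>`\#]+)         # 5: bare word or number
                            )/gcx) {
        if (defined $2) {
            my $node = { name => $2, attr => '', daughters => [] };
            push @{ $stack[-1]{daughters} }, $node;
            push @stack, $node;
        }
        elsif (defined $3) {
            pop @stack if @stack > 1;
        }
        elsif (defined $4 or defined $5) {
            my $token = defined $4 ? $4 : $5;
            my $sep   = $stack[-1]{attr} eq '' ? '' : ' ';
            $stack[-1]{attr} .= $sep . $token;
        }
    }
    return $root;
}

my $tree = parse_mif_sketch(<<'ENDMIF');
<MIFFile 7.00> # The only required statement
<Para
  <ParaLine
    <String `Hello World'>
  > # end of ParaLine
> # end of Para
ENDMIF

my $string = $tree->{daughters}[1]{daughters}[0]{daughters}[0];
print "$string->{name}: $string->{attr}\n";   # String: `Hello World'
```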

$OBJ->parse_miffile(FILENAME)

Parses a file from disk into a DAG_Node tree structure. See parse_mif for details.

Old-style Functions

All these functions are exported by default.

quote(STRING)

Quotes a string with MIF style quotes, and escapes forbidden characters. Backslashes, backticks, single quotes, greater-than signs, and tabs are escaped; non-ASCII values are written in their hexadecimal representation. So:

Some `symbols': > \Ø¿!>

is written as

`Some \Qsymbols\q: \> \\\xaf \xc0 !'

As a special case, escaped hexadecimals are preserved in the input string. If you want a literal \x00 string, precede it with an extra backslash.

print quote("\x09 ");     # prints `\x09 ', a forced return in FrameMaker
print quote("\\x09 ");    # prints `\\x09 '; this will show up literally
                          # as \x09 in FrameMaker

unquote(STRING)

The opposite action. Surrounding quotes are removed and all escaped sequences are transliterated into their original character.
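
The ASCII part of this escaping scheme can be sketched in a few lines. The code below is an illustration, not the module's implementation: quote_sketch() and unquote_sketch() are hypothetical names, and the sketch deliberately leaves out the hexadecimal handling of non-ASCII characters and the special case for pre-escaped \x sequences.

```perl
use strict;
use warnings;

# Characters that a MIF string cannot contain literally, and their escapes.
my %escape   = ("\\" => "\\\\", "\t" => "\\t", ">" => "\\>", "'" => "\\q", "`" => "\\Q");
my %unescape = ("\\" => "\\",   "t"  => "\t",  ">" => ">",   "q" => "'",   "Q" => "`");

sub quote_sketch {
    my ($string) = @_;
    $string =~ s/([\\\t>'`])/$escape{$1}/g;
    return "`$string'";
}

sub unquote_sketch {
    my ($string) = @_;
    $string =~ s/^`//;      # remove the opening MIF quote
    $string =~ s/'$//;      # remove the closing MIF quote
    # One pass, so an escaped backslash is never re-read as an escape prefix.
    $string =~ s/\\([\\t>qQ])/$unescape{$1}/g;
    return $string;
}

my $raw    = "Some `symbols': > \\";
my $quoted = quote_sketch($raw);
print "$quoted\n";                       # `Some \Qsymbols\q: \> \\'
print "round trip ok\n" if unquote_sketch($quoted) eq $raw;
```

Unescaping in a single substitution matters: doing one replacement per escape would misread sequences such as an escaped backslash followed by the letter q.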

encode_path(STRING)

Encodes path names to the MIF path syntax. Usage:

$mifPathString = encode_path('D:\Dos\Path\With\Backslashes\Filename');
$mifPathString = encode_path('..\..\Also\Relative\Path\Is\Allowed\Filename');

The path name must not be in a MIF quoted style. It returns the device independent path name with the quotes.
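
As an illustration of the mapping, here is a minimal sketch that handles only the three path codes appearing in this document: a drive or volume, a path component, and "up one level". encode_path_sketch() is a hypothetical name, and the sketch assumes Windows-style input with a drive letter or a relative path; it makes no attempt to cover rooted Unix paths or host names.

```perl
use strict;
use warnings;

sub encode_path_sketch {
    my ($path) = @_;
    my $mif = '';
    for my $part (split m{[\\/]}, $path) {
        next if $part eq '' or $part eq '.';
        if    ($part eq '..')          { $mif .= '<u\>' }        # up one level
        elsif ($part =~ /^[A-Za-z]:$/) { $mif .= "<v\\>$part" }  # drive/volume
        else                           { $mif .= "<c\\>$part" }  # component
    }
    return "`$mif'";    # the result carries MIF string quotes
}

print encode_path_sketch('D:\Dos\Path\Filename'), "\n";
print encode_path_sketch('..\..\Relative\Filename'), "\n";
```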

decode_path(STRING, ROOTPATH)

Usage:

print decode_path ('<v\>C:<c\>Mydir<c\>Subdir<c\>Filename');
# prints C:/Mydir/Subdir/Filename
print decode_path ('<u\><u\><c\>Subdir<c\>Filename');
# prints ../../Subdir/Filename

Currently only Windows path names are supported (meaning that Unix and MacOS style paths remain untested). MIF string quotes are removed. ROOTPATH, if specified, is the path that is prepended if STRING happens to be a relative path.
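
The decoding direction, including the ROOTPATH behaviour, might be sketched as follows. This is not the module's code: decode_path_sketch() is a hypothetical name, only the volume, component, and "up" codes shown in the usage above are recognized, and the sketch simply prepends ROOTPATH without collapsing any ".." segments.

```perl
use strict;
use warnings;

sub decode_path_sketch {
    my ($mif, $rootpath) = @_;
    $mif =~ s/^`//;     # remove MIF string quotes, if present
    $mif =~ s/'$//;

    my @parts;
    my $absolute = 0;
    while ($mif =~ /<([vcu])\\>([^<]*)/g) {
        my ($code, $name) = ($1, $2);
        if    ($code eq 'u') { push @parts, '..' }
        elsif ($code eq 'v') { push @parts, $name; $absolute = 1 }
        else                 { push @parts, $name }
    }

    my $path = join '/', @parts;
    # Prepend the root only when the stored path is relative.
    $path = "$rootpath/$path" if defined $rootpath and not $absolute;
    return $path;
}

print decode_path_sketch('<v\>C:<c\>Mydir<c\>Subdir<c\>Filename'), "\n";
print decode_path_sketch('<u\><u\><c\>Subdir<c\>Filename'), "\n";
print decode_path_sketch('<u\><c\>Filename', 'C:/Docs/Book'), "\n";
```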

convert(VALUE_AND_OLDUNIT, NEWUNIT, SUPPRESSUNIT)

Converts a value in one unit of measurement into another. If you leave out the unit of measurement, it defaults to FrameMaker::MifTree->default_unit (not to the MIF document's default unit of measurement!). The recognized units, each with its size in inches, are:

{
  pt         => 1 / 72,
  point      => 1 / 72,
  '"'        => 1,
  in         => 1,
  mm         => 1 / 25.4,
  millimeter => 1 / 25.4,
  cm         => 1 / 2.54,
  centimeter => 1 / 2.54,
  pc         => 1 / 6,
  pica       => 1 / 6,
  dd         => 0.01483,
  didot      => 0.01483,
  cc         => 12 * 0.01483,
  cicero     => 12 * 0.01483
}

The optional argument SUPPRESSUNIT determines if the unit of measurement needs to be written in the result. Note that you won't get a unit of measurement included in your result when you leave out NEWUNIT and specify FrameMaker::MifTree->default_unit to be the empty string, even if you set SUPPRESSUNIT to be false. In that case the returned value is in points. So

FrameMaker::MifTree->default_unit('');
print convert("12.0 didot");            # prints the value in points: 12.8131
FrameMaker::MifTree->default_unit('mm');
print convert("12.0 didot", "pt", 1);   # also prints 12.8131
FrameMaker::MifTree->default_unit('pt');
print convert("12.0 didot", '', 1);     # also prints 12.8131

All values are rounded to 4 decimals.
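
The arithmetic behind these numbers can be checked with a small standalone sketch that uses the inches-per-unit table above. convert_sketch() is a hypothetical name and not the module's function: it requires an explicit NEWUNIT, treats a missing source unit as points, and never appends a unit to the result.

```perl
use strict;
use warnings;

# Inches per unit, copied from the table above.
my %inches_per = (
    pt => 1 / 72,       point      => 1 / 72,
    '"' => 1,           in         => 1,
    mm => 1 / 25.4,     millimeter => 1 / 25.4,
    cm => 1 / 2.54,     centimeter => 1 / 2.54,
    pc => 1 / 6,        pica       => 1 / 6,
    dd => 0.01483,      didot      => 0.01483,
    cc => 12 * 0.01483, cicero     => 12 * 0.01483,
);

sub convert_sketch {
    my ($value_and_unit, $new_unit) = @_;
    my ($value, $old_unit) = $value_and_unit =~ /^\s*([-+]?[\d.]+)\s*(\S*)\s*$/
        or die "Cannot parse '$value_and_unit'\n";
    $old_unit = 'pt' if $old_unit eq '';    # assumption: no unit means points
    for my $unit ($old_unit, $new_unit) {
        die "Unknown unit '$unit'\n" unless exists $inches_per{$unit};
    }
    # Go through inches, then round to 4 decimals.
    my $result = $value * $inches_per{$old_unit} / $inches_per{$new_unit};
    return sprintf('%.4f', $result) + 0;
}

print convert_sketch('12.0 didot', 'pt'), "\n";   # 12.8131
print convert_sketch('1 in', 'mm'), "\n";         # 25.4
```

The first line reproduces the value from the examples above: 12 didot points times 0.01483 inches times 72 points per inch is 12.81312, which rounds to 12.8131.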

SEE ALSO

  • Adobe's MIF_Reference.pdf, included in FrameMaker's online documentation.

  • http://www.miffy.com, as this module was formerly called Miffy.pm

AUTHOR

Roel van der Steen, roel-perl@st2x.net

COPYRIGHT AND LICENSE

Copyright 2004 by ITP

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
