NAME

Parse::Man::DOM - parse nroff-formatted manpages and return a DOM tree

SYNOPSIS

use Parse::Man::DOM;

my $parser = Parse::Man::DOM->new;

my $document = $parser->from_file( "my_manpage.1" );

print "The manpage name is ", $document->meta( "name" )->value, "\n";

DESCRIPTION

This subclass of Parse::Man returns an object tree representing the parsed content of the input file. The result is an object of the Parse::Man::DOM::Document class, which contains further objects nested within it.

Parse::Man::DOM::Document

Represents the document as a whole.

meta

$meta = $document->meta( $key )

Returns a Parse::Man::DOM::Metadata object for the named item of metadata. The following keys are documented:

  • name

    The page name given to the .TH directive.

  • section

    The section number given to the .TH directive.
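
As a minimal sketch of how these two keys might be combined, the helper below builds the conventional name(section) label for a page. The name page_label is hypothetical and not part of this module; the sketch assumes both keys were set by a .TH directive.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Parse::Man::DOM.
# Builds a "name(section)" label from the .TH metadata.
sub page_label {
    my ( $document ) = @_;

    # meta() returns a Metadata object; value() yields its string.
    my $name    = $document->meta( "name" )->value;
    my $section = $document->meta( "section" )->value;

    return sprintf "%s(%s)", $name, $section;
}
```

Given a document returned by from_file, page_label( $document ) would yield a string such as "ls(1)" for a page whose .TH directive names ls in section 1.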

paras

@paras = $document->paras

Returns a list of objects containing the actual page content. Each is either a Parse::Man::DOM::Heading or an instance of one of the Parse::Man::DOM::Para subclasses.

Parse::Man::DOM::Metadata

Represents a single item of metadata about the page.

name

$name = $metadata->name

The string name of the metadata.

value

$value = $metadata->value

The string value of the metadata.

Parse::Man::DOM::Heading

Represents the contents of a .SH or .SS heading.

level

$level = $heading->level

The heading level number: 1 for .SH, 2 for .SS.

text

$text = $heading->text

The plain text string of the heading title.
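
The level and text methods are enough to build a simple outline of a page. The following is a sketch only: outline is a hypothetical helper, not part of this module, and it assumes headings can be picked out of the paras list by an isa check against the Parse::Man::DOM::Heading class.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Parse::Man::DOM.
# Returns one line per heading, with .SS headings indented under .SH ones.
sub outline {
    my ( $document ) = @_;

    my @lines;
    foreach my $item ( $document->paras ) {
        # paras() mixes headings and paragraphs; keep only the headings.
        next unless $item->isa( "Parse::Man::DOM::Heading" );

        # Level 1 gets no indent, level 2 gets two spaces.
        my $indent = "  " x ( $item->level - 1 );
        push @lines, $indent . $item->text;
    }

    return @lines;
}
```

Printing each element of outline( $document ) on its own line gives a quick table of contents for the page.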

Parse::Man::DOM::Para

Represents a paragraph of formatted text content. Each paragraph object is an instance of one of the subclasses described below; the following methods are common to all of them.

filling

$filling = $para->filling

Returns true if filling (.fi) is in effect, or false if no-filling (.nf) is in effect.

body

$chunklist = $para->body

Returns a Parse::Man::DOM::Chunklist to represent the actual content of the paragraph.

indent

$indent = $para->indent

Returns the indentation size in column count, if defined.
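
Because indent is only returned "if defined", callers probably need to check for an undefined value before using it. The sketch below shows one way to do so; layout_summary is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Parse::Man::DOM.
# Describes the fill mode and indentation of a paragraph as a string.
sub layout_summary {
    my ( $para ) = @_;

    my $mode   = $para->filling ? "filled" : "unfilled";
    my $indent = $para->indent;

    # indent() may be undefined, so test it before interpolating.
    return defined( $indent )
        ? "$mode, indent $indent"
        : "$mode, no indent";
}
```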

Parse::Man::DOM::Para::Plain

Represents a plain (.P or .PP) paragraph.

type

$type = $para->type

Returns "plain".

Parse::Man::DOM::Para::Term

Represents a term paragraph (.TP).

type

$type = $para->type

Returns "term".

term

$chunklist = $para->term

Returns a Parse::Man::DOM::Chunklist for the name of the term being defined.

definition

$chunklist = $para->definition

Returns a Parse::Man::DOM::Chunklist for the definition of the term.
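
As an illustration, term paragraphs can be collected into plain-text pairs, for example to build a glossary or a list of options. This is a sketch under assumptions: plain_text and term_pairs are hypothetical helpers, not part of this module, and the can check guards against objects, such as headings, that may not provide a type method.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Parse::Man::DOM.

# Flatten a Chunklist to a plain string, discarding font and size.
sub plain_text {
    my ( $chunklist ) = @_;
    return join "", map { $_->text } $chunklist->chunks;
}

# Collect [ term, definition ] pairs from every .TP paragraph given.
sub term_pairs {
    my @items = @_;

    return map  { [ plain_text( $_->term ), plain_text( $_->definition ) ] }
           grep { $_->can( "type" ) && $_->type eq "term" }
           @items;
}
```

Calling term_pairs( $document->paras ) would then return one array reference per .TP paragraph, in document order.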

Parse::Man::DOM::Para::Indent

Represents an indented paragraph (.IP).

type

$type = $para->type

Returns "indent".

marker

$marker = $para->marker

Returns the indentation marker text, if defined.

Parse::Man::DOM::Para::Example

Represents an example paragraph (.EX / .EE).

type

$type = $para->type

Returns "example".

Parse::Man::DOM::Chunklist

Contains a list of Parse::Man::DOM::Chunk objects to represent paragraph content.

chunks

@chunks = $chunklist->chunks

Returns a list of Parse::Man::DOM::Chunk objects.

Parse::Man::DOM::Chunk

Represents a chunk of text with a particular format applied.

text

$text = $chunk->text

The plain string value of the text for this chunk.

font

$font = $chunk->font

The font name in effect for this chunk. One of "R", "B", "I" or "SM".

size

$size = $chunk->size

The size of this chunk, relative to the paragraph base of 0.
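
The font of each chunk can drive a simple conversion to another markup. The following sketch wraps bold and italic chunks in Markdown-style delimiters; the mapping table and the name chunklist_to_markup are assumptions made for illustration, not part of this module, and the size of each chunk is ignored.

```perl
use strict;
use warnings;

# Hypothetical mapping from font name to a Markdown-style delimiter.
my %WRAP = ( B => "**", I => "_" );

# Hypothetical helper, not part of Parse::Man::DOM.
# Renders a Chunklist as a string with bold and italic chunks marked up.
sub chunklist_to_markup {
    my ( $chunklist ) = @_;

    return join "", map {
        # "R", "SM" and any unrecognised font are emitted unwrapped.
        my $wrap = $WRAP{ $_->font } // "";
        $wrap . $_->text . $wrap;
    } $chunklist->chunks;
}
```

Passing $para->body to this helper would give a single string for the whole paragraph.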

AUTHOR

Paul Evans <leonerd@leonerd.org.uk>