NAME

SuffixTree - Efficient string manipulation data structure interface for Perl.

SYNOPSIS

use SuffixTree;

my $str = "mississippi";
my $tree=create_tree($str);
print_tree($tree);
my $position = find_substring($tree, "ssis");

printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);

delete_tree($tree); # NOTICE: this method will soon become deprecated 

DEPRECATED SYNOPSIS

use SuffixTree;

my $str = "mississippi";
my $tree=ST_CreateTree($str, length($str));
ST_PrintTree($tree);
my $position = ST_FindSubstring($tree, "ssis", 4);

printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);

ST_DeleteTree($tree);

DESCRIPTION

The intention of this project is to provide an open-source implementation for an efficient data structure for strings manipulation - the Suffix Tree.

The code was written with as much consistency with the theoretical algorithm as possible (see references). It provides a set of interface functions for creating and searching the tree.

The suffix tree is implemented in ANSI-C. The code is written based on the original algorithm by E.Ukkonen. This is the Perl interface for the underlying ANSI-C implementation.

FUNCTIONS

All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will definitly address in the future.

$tree = create_tree($string)

Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string. Returns a reference to the tree.

$position = find_substring($tree, $substring)

Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for. Returns the 1-based position it was found in the source string or 0 if string is not in the tree.

Prints the tree. Parameters: the tree to print.

delete_tree($tree)

Deletes a suffix tree. Parameters: the tree to delete. Returns : void.

DEPRECATED FUNCTIONS

All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will try to address in the future.

$tree = ST_CreateTree($string, length($string))

Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string, length of the string. Returns a reference to the tree.

$position = ST_FindSubstring($tree, $substring, length($substring))

Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for, length of substring. Returns the 1-based position it was found in the source string or 0 if string is not in the tree.

ST_PrintTree($tree)

Prints the tree. Parameters: the tree to print.

ST_DeleteTree($tree)

Deletes a suffix tree. Parameters: the tree to delete. Returns : void.

BUGS

This Perl interface was mostly built automatically (using SWIG). Little to no attention was given to testing. In future relases of this Perl Module (along with its underlying ANSI-C implementation) we hope to fix all problems that might currenly interfere with successful usage of this module. Please send bug reports to the current maintainer(s) of this module.

FUTURE WORK

[1] A better Perl-ish interface

[2] Building tests for this module (for the `make test` part of the installation)

[3] Object Oriented like usage

PORTABILITY

Please read the README file for information.

SEE ALSO

L<https://github.com/daoswald/SuffixTree.git> - This module's Github
repository.

L<http://en.wikipedia.org/wiki/Suffix_tree> - Wikipedia's Suffix Tree
explanation.

AUTHOR

Shlomo Yona <yona@cs.technion.ac.il> is the original author.

David Oswald <davido@cpan.org> is the current maintainer.

THANKS TO

Offer Kaye for useful ideas and support.

LICENSE AND COPYRIGHT

Copyright (c) 2002, 2003, 2012 Shlomo Yona. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.