NAME
SuffixTree - Efficient string manipulation data structure interface for Perl.
SYNOPSIS
use SuffixTree;
my $str = "mississippi";
my $tree=create_tree($str);
print_tree($tree);
my $position = find_substring($tree, "ssis");
printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);
delete_tree($tree); # NOTICE: this method will soon become deprecated
DEPRECATED SYNOPSIS
use SuffixTree;
my $str = "mississippi";
my $tree=ST_CreateTree($str, length($str));
ST_PrintTree($tree);
my $position = ST_FindSubstring($tree, "ssis", 4);
printf("\nPosition of ssis in mississippi is %ld.\n\n", $position);
ST_DeleteTree($tree);
DESCRIPTION
The intention of this project is to provide an open-source implementation for an efficient data structure for strings manipulation - the Suffix Tree.
The code was written with as much consistency with the theoretical algorithm as possible (see references). It provides a set of interface functions for creating and searching the tree.
The suffix tree is implemented in ANSI-C. The code is written based on the original algorithm by E.Ukkonen. This is the Perl interface for the underlying ANSI-C implementation.
FUNCTIONS
All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will definitly address in the future.
- $tree = create_tree($string)
-
Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string. Returns a reference to the tree.
- $position = find_substring($tree, $substring)
-
Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for. Returns the position it was found in the source string or ST_ERROR if string is not in the tree. NOTE: We did not make sure how ST_ERROR 'looks like' in Perl. This should be further explained in future releases.
- print_tree($tree)
-
Prints the tree. Parameters: the tree to print.
- delete_tree($tree)
-
Deletes a suffix tree. Parameters: the tree to delete. Returns : void.
DEPRECATED FUNCTIONS
All functions are exported by default. Please note that all these interface functions were automatically extracted from the ANSI-C header file, so they might not behave as Perlish as you'd expect them to. This is something we will definitly address in the future.
- $tree = ST_CreateTree($string, length($string))
-
Allocates memory for the tree and starts Ukkonen's construction algorithm. Parameters: A string, length of the string. Returns a reference to the tree.
- $position = ST_FindSubstring($tree, $substring, length($substring))
-
Searches for a string in the tree. It traverses the tree down starting its root like in a regular trie. Parameters: the tree to search in, a substring to look for, length of substring. Returns the position it was found in the source string or ST_ERROR if string is not in the tree. NOTE: We did not make sure how ST_ERROR 'looks like' in Perl. This should be further explained in future releases.
- ST_PrintTree($tree)
-
Prints the tree. Parameters: the tree to print.
- ST_DeleteTree($tree)
-
Deletes a suffix tree. Parameters: the tree to delete. Returns : void.
BUGS
This Perl interface was mostly built automatically (using SWIG). Little to no attention was given to testing. In future relases of this Perl Module (along with its underlying ANSI-C implementation) we hope to fix all problems that might currenly interfere with successful usage of this module. Please send bug reports to the author(s) of this module.
FUTURE WORK
[1] A better Perl-ish interface
[2] Building tests for this module (for the `make test` part of the installation)
[3] Object Oriented like usage
PORTABILITY
Please read the README file for information.
SEE ALSO
http://cs.haifa.ac.il/~shlomo/suffix_tree/
AUTHOR
Shlomo Yona <shlomo@cs.haifa.ac.il>
THANKS TO
Offer Kaye (http://okaye.esmartweb.com/) for useful ideas and support.
COPYRIGHT
Copyright (c) 2002, 2003 Shlomo Yona. All rights reserved.
This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.
2 POD Errors
The following errors were encountered while parsing the POD:
- Around line 113:
You forgot a '=back' before '=head1'
- Around line 147:
You forgot a '=back' before '=head1'