NAME
Algorithm::Diff::Any - Perl module to find differences between files
VERSION
Version 1.001 ($Id: Any.pm 10595 2009-12-23 00:29:52Z FREQUENCY@cpan.org $)
DESCRIPTION
This is a simple module to select the best available implementation of the standard diff
algorithm, which works by effectively trying to solve the Longest Common Subsequence (LCS) problem. This algorithm is described in: A Fast Algorithm for Computing Longest Common Subsequences, CACM, vol.20, no.5, pp.350-353, May 1977.
However, it is algorithmically rather complicated to solve the LCS problem; for arbitrary sequences, it is an NP-hard problem. Simply comparing two strings together of lengths n and m is O(n x m). Consequently, this means the algorithm necessarily has some tight loops, which, for a dynamic language like Perl, can be slow.
In order to speed up processing, a fast (C/XS-based) implementation of the algorithm's core loop was implemented. It can confer a noticable performance advantage (benchmarks show a 54x speedup for the compact_diff
routine).
SYNOPSIS
use Algorithm::Diff::Any;
my $diff = Algorithm::Diff::Any->new(\@seq1, \@seq2);
For complete usage details, see the Object-Oriented interface description for the Algorithm::Diff module.
PURPOSE
The intent of this module is to provide single simple interface to the two (presumably) compatible implementations of this module, namely, Algorithm::Diff and Algorithm::Diff::XS.
If, for some reason, you need to determine what version of the module is actually being included by Algorithm::Diff::Any
, then:
print 'Backend type: ', $Algorithm::Diff::Any::DRIVER, "\n";
In order to force use of one or the other, simply load the appropriate module:
use Algorithm::Diff::XS;
my $diff = Algorithm::Diff::XS->new();
# or
use Algorithm::Diff;
my $diff = Algorithm::Diff->new();
COMPATIBILITY
This module was tested under Perl 5.10.1, using Debian Linux. However, because it's Pure Perl and doesn't do anything too obscure, it should be compatible with any version of Perl that supports its prerequisite modules.
If you encounter any problems on a different version or architecture, please contact the maintainer.
EXPORTABLE FUNCTIONS
The following functions are available for import into your namespace:
prepare
LCS
LCSidx
LCS_length
diff
sdiff
compact_diff
traverse_sequences
traverse_balanced
For full documentation, see the relevant functional descriptions in the Pure Perl implementation, Algorithm::Diff.
METHODS
new
Algorithm::Diff::Any->new( \@seq1, \@seq2, \%opts );
Creates a Algorithm::Diff::Any
object, based upon either the optimized C/XS version of the algorithm, Algorithm::Diff::XS, or falls back to the Pure Perl implementation, Algorithm::Diff.
Example code:
my $diff = Algorithm::Diff::Any->new( \@seq1, \@seq2 );
# or with options
my $diff = Algorithm::Diff::Any->new( \@seq1, \@seq2, \%opts );
This method will return an appropriate Algorithm::Diff::Any object or throw an exception on error.
Next
$diff->Next( $count )
See Algorithm::Diff for method documentation.
Prev
$diff->Prev( $count )
See Algorithm::Diff for method documentation.
Reset
$diff->Reset( $pos )
See Algorithm::Diff for method documentation.
Copy
$diff->Copy( $pos, $newBase )
See Algorithm::Diff for method documentation.
Base
$diff->Base( $newBase )
See Algorithm::Diff for method documentation.
Diff
$diff->Diff( )
See Algorithm::Diff for method documentation.
Same
See Algorithm::Diff for method documentation.
Code example:
$diff->Same( )
Items
$diff->Items( $seqNum )
See Algorithm::Diff for method documentation.
Range
$diff->Range( $seqNum, $base )
See Algorithm::Diff for method documentation.
Min
$diff->Min( $seqNum, $base )
See Algorithm::Diff for method documentation.
Max
$diff->Max( $seqNum, $base )
See Algorithm::Diff for method documentation.
Get
$diff->Get( @names )
See Algorithm::Diff for method documentation.
AUTHOR
Jonathan Yu <jawnsy@cpan.org>
CONTRIBUTORS
Your name here ;-)
ACKNOWLEDGEMENTS
Many thanks go to the primary authors and maintainers of the Pure Perl implementation of this algorithm, notably:
Mark-Jason Dominus <mjd-perl-diff@plover.com>
Ned Konz <perl@bike-nomad.com>
Tye McQueen <tyemq@cpan.org>
Thanks to Audrey Tang <cpan@audreyt.org>, author of Algorithm::Diff::XS, for recognizing the value of Joe Schaefer's <apreq-dev@httpd.apache.org> work on Algorithm::LCS
Neither the Pure Perl nor C/XS-based implementations of this module would have been possible without the work of James W. Hunt (Stanford University) and Thomas G. Szymanski (Princeton University), authors of the often-cited paper for computing longest common subsequences.
In their abstract, they claim that a running time of O(n log n) can be expected, with a worst-case time of O(n^2 log n) for two subsequences of length n.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Algorithm::Diff::Any
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
CPAN Request Tracker
CPAN Testing Service (Kwalitee Tests)
CPAN Testers Platform Compatibility Matrix
REPOSITORY
You can access the most recent development version of this module at:
http://svn.ali.as/cpan/trunk/Algorithm-Diff-Any
If you are a CPAN developer and would like to make modifications to the code base, please contact Adam Kennedy <adamk@cpan.org>, the repository administrator. I only ask that you contact me first to discuss the changes you wish to make to the distribution.
FEEDBACK
Please send relevant comments, rotten tomatoes and suggestions directly to the maintainer noted above.
If you have a bug report or feature request, please file them on the CPAN Request Tracker at http://rt.cpan.org. If you are able to submit your bug report in the form of failing unit tests, you are strongly encouraged to do so.
SEE ALSO
Algorithm::Diff, the classic reference implementation for finding the differences between two chunks of text in Perl. It is based on the algorithm described in A Fast Algorithm for Computing Longest Common Subsequences, CACM, vol.20, no.5, pp.350-353, May 1977.
Algorithm::Diff::XS, the C/XS optimized version of Algorithm::Diff, which will be used automatically if available.
CAVEATS
KNOWN BUGS
There are no known bugs as of this release.
LIMITATIONS
It is not currently known whether Algorithm::Diff (Pure Perl version) and Algorithm::Diff::XS (C/XS implementation) produce the same output. The algorithms may not be equivalent (source code-wise) so they may produce different output under some as-yet-undiscovered conditions.
Any potential performance gains will be limited by those features implemented by Algorithm::Diff::XS. As of time of writing, this is limited to the
cdiff
subroutine.
QUALITY ASSURANCE METRICS
TEST COVERAGE
-------------------------- ------ ------ ------ ------ ------ ------
File stmt bran cond sub pod total
-------------------------- ------ ------ ------ ------ ------ ------
lib/Algorithm/Diff/Any.pm 100.0 100.0 100.0 100.0 100.0 100.0
Total 100.0 100.0 100.0 100.0 100.0 100.0
LICENSE
Copyright (C) 2009 by Jonathan Yu <jawnsy@cpan.org>
This package is distributed under the same terms as Perl itself. Please see the LICENSE file included in this distribution for full details of these terms.
DISCLAIMER OF WARRANTY
This software is provided by the copyright holders and contributors "AS IS" and ANY EXPRESS OR IMPLIED WARRANTIES, including, but not limited to, the IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
In no event shall the copyright owner or contributors be liable for any direct, indirect, incidental, special, exemplary or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.