NAME
URL::RegexMatching - A library of utility methods for matching URLs with regex patterns.
SYNOPSIS
#!/usr/bin/perl
use strict;
use warnings;
use URL::RegexMatching qw(url_match_regex http_url_match_regex);
my $text = <<SAMPLE;
This is some sample text with links like
<http://foo.com/blah_blah/> and others like WWW.EXAMPLE.COM
and bit.ly/foo. And what about something like a
mailto:name\@example.com pattern?
SAMPLE
my $url_regex = url_match_regex;
my $http_regex = http_url_match_regex;
print "Using this sample text:\n";
print "$text\n";
print "These strings are probably links:\n";
while ($text =~ m{$url_regex}g) {
    print "\t$1\n";
}
print "\nWeb URLs:\n";
while ($text =~ m{$http_regex}g) {
    print "\t$1\n";
}
$text =~s{$http_regex}{<a href="$1">$1</a>}g;
print "\n\n";
print "Convert only HTTP links to HTML links using http_url_match_regex:\n";
print "$text\n";
DESCRIPTION
This package is based on regular expression patterns initially developed by John Gruber of Daring Fireball fame. This module simply packages his work to make it easier for the Perl community to use.
METHODS
url_match_regex
This method takes no arguments and returns a compiled regular expression pattern. The pattern liberally matches strings that appear to be URLs of various kinds, such as HTTP, HTTPS and mailto, and makes a best attempt to identify relative URLs.
This method can be exported by request.
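As a rough illustration of what working with a compiled pattern looks like, the sketch below stores a pattern in a variable and collects every match in list context. It deliberately avoids loading the module: demo_url_regex is a hypothetical stand-in with a much simpler pattern than the one url_match_regex returns, used here only so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-in for url_match_regex(). This simplified pattern is an
# assumption for illustration only; the module's real pattern is far more thorough.
sub demo_url_regex {
    return qr{((?:https?://|www\.|mailto:)[^\s<>"']+)}i;
}

my $re   = demo_url_regex();
my $text = 'See <http://foo.com/blah_blah/> or WWW.EXAMPLE.COM today';

# A compiled pattern is a Regexp object, so it can be stored and reused.
print ref($re), "\n";

# In list context with /g, the match returns every captured string at once.
my @links = $text =~ m{$re}g;
print "$_\n" for @links;
```

With the real module installed, replacing the call to demo_url_regex with url_match_regex should allow the same list-context idiom, since the SYNOPSIS shows the pattern capturing each link in $1.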
http_url_match_regex
This method takes no arguments and returns a compiled regular expression pattern. This pattern liberally matches only web URLs -- http, https and relative forms such as www.example.com.
This method can be exported by request.
KNOWN ISSUES
Both regular expression patterns are known to fail against URL strings such as:
http://example.com/quotes-are-“part”
When using the http_url_match_regex method, it is likely to match link strings whose domain/file path looks like a web URL but that use a different protocol, such as 'ftp://www.example.com/foo.txt', where the match would capture everything but the 'ftp://' part.
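One possible workaround for the protocol issue is to look at the text immediately before each match and skip matches that directly follow a non-HTTP scheme. The sketch below is a minimal illustration under assumptions: $http_like is a simplified stand-in rather than the module's actual pattern, and web_urls_only is a hypothetical helper that is not part of URL::RegexMatching.

```perl
use strict;
use warnings;

# Simplified stand-in for http_url_match_regex (an assumption for this sketch,
# not the module's real pattern).
my $http_like = qr{((?:https?://|www\.)[^\s<>"']+)}i;

# Hypothetical helper: return web-looking matches, skipping any that are
# immediately preceded by some other scheme such as 'ftp://'.
sub web_urls_only {
    my ($text) = @_;
    my @urls;
    while ($text =~ m{$http_like}g) {
        my $match  = $1;
        my $start  = $-[1];                    # offset where capture group 1 began
        my $before = substr($text, 0, $start); # everything ahead of the match
        # Skip if the text right before the match ends in "scheme://".
        next if $before =~ m{[a-z][a-z0-9+.-]*://\z}i;
        push @urls, $match;
    }
    return @urls;
}

my $sample = 'Get ftp://www.example.com/foo.txt or see http://foo.com/blah and www.example.org';
print "$_\n" for web_urls_only($sample);
```

The same check could be tried with the pattern returned by http_url_match_regex in place of $http_like, on the assumption that it captures the link in its first group as the SYNOPSIS suggests.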
SUPPORT
Bugs should be reported via the GitHub project issues tracking system: http://github.com/tima/perl-url-regexmatching/issues
AUTHOR
Timothy Appnel <tima@cpan.org>
SEE ALSO
http://daringfireball.net/2010/07/improved_regex_for_matching_urls
COPYRIGHT AND LICENCE
This module is based on the work of John Gruber of Daring Fireball. John writes "this pattern is free for anyone to use, no strings attached. Consider it public domain."
The software is released under the Artistic License. The terms of the Artistic License are described at http://www.perl.com/language/misc/Artistic.html.
Except where otherwise noted, URL::RegexMatching is Copyright 2010, Timothy Appnel, tima@cpan.org. All rights reserved.