NAME
PHP::Strings - Implement some of PHP's string functions.
SYNOPSIS
use PHP::Strings;
my $slashed = addcslashes( $not_escaped, $charlist );
my $clean = strip_tags( $html, '<a><b><i><u>' );
my $unslashed = stripcslashes( '\a\b\f\n\r\xae' );
DESCRIPTION
PHP has many functions. This is one of the main problems with PHP.
People do, however, get used to said functions, and when they come to a better-designed language they get lost because they have to implement some of these somewhat vapid functions themselves.
So I wrote PHP::Strings. It implements most of PHP's string functions. For those it doesn't implement, it describes how to do the same thing in native Perl.
Any function that would be silly to implement has not been and has been marked as such in this documentation. They will still be exportable, but if you attempt to use said function you will get an error telling you to read these docs.
RELATED READING
"PHP in Contrast to Perl" http://tnx.nl/php.txt
"Experiences of Using PHP in Large Websites" by Aaron Crane, 2002 http://www.ukuug.org/events/linux2002/papers/html/php/
"PHP Annoyances" by Neil de Carteret, 2002 http://n3dst4.com/articles/phpannoyances/
"I hate PHP" by Keith Devens, 2003 http://keithdevens.com/weblog/archive/2003/Aug/13/HATE-PHP
"PHP: A love and hate relationship" by Ivan Ristic, 2002 http://www.webkreator.com/php/community/php-love-and-hate.html
"PHP Sucks" http://czth.net/pH/PHPSucks
Nathan Torkington's "list of PHP's shortcomings" http://nntp.x.perl.org/group/perl.advocacy/1458
ERROR HANDLING
All arguments are checked using Params::Validate. Bad arguments will cause an error to be thrown. If you wish to catch it, use eval.
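Catching such an error with a block eval might look like the sketch below. To stay self-contained it does not load PHP::Strings; picky_repeat is a hypothetical stand-in for any function that dies on bad arguments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a function that validates its arguments
# and throws (dies) when they are bad.
sub picky_repeat {
    my ( $string, $count ) = @_;
    die "count must be a non-negative integer\n"
        unless defined $count && $count =~ /^\d+$/;
    return $string x $count;
}

# Block eval traps the exception; copy $@ straight away, since later
# code may overwrite it.
my $result = eval { picky_repeat( "ab", "lots" ) };
my $error  = $@;

print "Caught: $error" if $error;
```

If the call succeeds, $error is the empty string and $result holds the return value.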
Attempts to use functions I've decided to not implement (as distinct from functions that aren't implemented because I've not gotten around to either writing or deciding whether to write) will cause an error displaying the documentation for said function.
EXPORTS
By default, nothing is exported.
Each function and constant can be exported by explicit name.
use PHP::Strings qw( str_pad addcslashes );
To get a function and its associated constants as well, prefix them with a colon:
use PHP::Strings qw( :str_pad );
# This grabs str_pad, STR_PAD_LEFT, STR_PAD_BOTH, STR_PAD_RIGHT.
To export everything:
use PHP::Strings qw( :all );
For more information on what you can add there, consult "Specialised Import Lists" in Exporter.
FUNCTIONS
addcslashes
http://www.php.net/addcslashes
Returns a string with backslashes before characters that are listed in $charlist.
addslashes
PHP::Strings::addslashes WILL NOT BE IMPLEMENTED.
Returns a string with backslashes before characters that need to be quoted in SQL queries. You should never need this function. I mean, never.
DBI, the standard method of accessing databases with Perl, does all this for you. It provides a quote method to escape anything, and it provides placeholders and bind values so you don't even have to worry about escaping. In PHP, PEAR DB also provides this facility.
DBI is also aware that some databases don't escape in this manner, such as mssql, which uses doubled characters to escape (like some versions of BASIC). This function isn't.
The less said about PHP's magic_quotes "feature", the better.
bin2hex
PHP::Strings::bin2hex WILL NOT BE IMPLEMENTED.
This is trivially implemented using unpack.
my $hex = unpack "H*", $data;
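A slightly fuller sketch of the same idea, showing the round trip back to the original bytes with pack (the sample data is made up):

```perl
use strict;
use warnings;

my $data = "Perl\n";

# Each byte becomes two lowercase hex digits, high nybble first.
my $hex = unpack "H*", $data;

# pack with the same template reverses the conversion.
my $back = pack "H*", $hex;

print "$hex\n";    # 5065726c0a
```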
chop
PHP::Strings::chop WILL NOT BE IMPLEMENTED.
PHP's chop function is an alias to its "rtrim" function.
Perl has a builtin named chop. Thus we do not support the use of chop as an alias to "rtrim".
chr
PHP::Strings::chr WILL NOT BE IMPLEMENTED.
PHP's and Perl's chr functions operate sufficiently identically.
Note that PHP's chr claims to take an ASCII value as input, whereas Perl's assumes Unicode. See the documentation of each for a precise definition.
Note that chr returns one character, which in some string encodings may not necessarily be one byte.
chunk_split
http://www.php.net/chunk_split
Returns the given string, split into smaller chunks.
my $split = chunk_split( $body [, $chunklen [, $end ] ] );
Where $body is the data to split, $chunklen is the optional length of data between each split (default 76), and $end is what to insert both between each split (default "\r\n") and on the end.
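Also trivially implemented as a regular expression. The {1,$chunklen} quantifier terminates a short final chunk as well, so a body whose length is an exact multiple of $chunklen does not get $end appended twice:
$body =~ s/(.{1,$chunklen})/$1$end/sg;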
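Wrapped up as a function with the documented defaults, the regex approach could be sketched as follows. chunk_split_sketch is a hypothetical name and not necessarily how PHP::Strings implements it; it uses a {1,$chunklen} quantifier so that every chunk, including a short last one, gets exactly one terminator.

```perl
use strict;
use warnings;

# Hypothetical helper: split $body into chunks of $chunklen characters,
# appending $end after every chunk (including the last).
sub chunk_split_sketch {
    my ( $body, $chunklen, $end ) = @_;
    $chunklen = 76     unless defined $chunklen;
    $end      = "\r\n" unless defined $end;
    die "chunk length must be a positive integer\n"
        unless $chunklen =~ /^[1-9]\d*$/;

    # /s lets . match newlines already present in the data.
    $body =~ s/(.{1,$chunklen})/$1$end/sg;
    return $body;
}

print chunk_split_sketch( "abcdefgh", 3, "|" ), "\n";    # abc|def|gh|
```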
convert_cyr_string
http://www.php.net/convert_cyr_string
PHP::Strings::convert_cyr_string WILL NOT BE IMPLEMENTED.
Perl has the Encode module to convert between character encodings.
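As a rough illustration, converting KOI8-R bytes to Windows-1251 with Encode means decoding to Perl characters and encoding again. The sample bytes are an assumption chosen for the example (the word "Привет" in KOI8-R).

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Six bytes of KOI8-R encoded Cyrillic text (sample data).
my $koi8r = "\xF0\xD2\xC9\xD7\xC5\xD4";

# Bytes -> characters, then characters -> bytes in the target encoding.
my $text   = decode( 'koi8-r', $koi8r );
my $cp1251 = encode( 'cp1251', $text );

printf "%d characters, %d bytes out\n", length $text, length $cp1251;
```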
count_chars
http://www.php.net/count_chars
A somewhat daft function that returns counts of characters in a string.
It's daft because it assumes characters have values in the range 0-255. This is patently false in today's world of Unicode. In fact, the PHP documentation for this function happily talks about characters in one part and bytes in another, not realising the distinction.
So, I've implemented this function as if it were called count_bytes. It will count raw bytes, not characters.
Takes two arguments: the byte sequence to analyse and a 'mode' flag that indicates what sort of value to return. The default mode is 0.
Mode  Return value
----  ------------
0     Return hash of byte values and frequencies.
1     As for 0, but hash does not contain bytes with frequency of 0.
2     As for 0, but hash only contains bytes with frequency of 0.
3     Return string composed of used byte-values.
4     Return string composed of unused byte-values.
my %freq = count_chars( $string, 1 );
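For the curious, the byte counting behind modes 1 and 3 can be sketched in plain Perl. count_bytes_sketch is a hypothetical helper, not the module's code, and it assumes its argument is already a byte string.

```perl
use strict;
use warnings;

# Hypothetical helper: frequency of each byte value that occurs
# (roughly what mode 1 describes).
sub count_bytes_sketch {
    my ( $bytes ) = @_;
    my %freq;
    $freq{$_}++ for unpack 'C*', $bytes;
    return %freq;
}

my %seen = count_bytes_sketch( "banana" );

# Roughly mode 3: the used byte values as a string, in ascending order.
my $used = join '', map { chr $_ } sort { $a <=> $b } keys %seen;

print "$used\n";    # abn
```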
crc32
TBD
crypt
PHP::Strings::crypt WILL NOT BE IMPLEMENTED.
PHP's crypt is the same as Perl's. Thus there's no need for PHP::Strings to provide an implementation.
The CRYPT_* constants are not provided.
echo
PHP::Strings::echo WILL NOT BE IMPLEMENTED.
See "print" in perlfunc.
explode
PHP::Strings::explode WILL NOT BE IMPLEMENTED.
Use the \Q regex metachar and split.
my @pieces = split /\Q$separator/, $string, $limit;
See "split" in perlfunc for more details.
Note that split // will split between every character, rather than returning false. Note also that split "..." is the same as split /.../, which means to split wherever three characters are matched. The first argument to split is always a regex.
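A short demonstration of why the \Q matters, using a made-up dotted string; without it, a "." separator is treated as "any character":

```perl
use strict;
use warnings;

my $string    = "a.b.c.d";
my $separator = ".";

# \Q quotes the separator, so "." means a literal dot.
my @all = split /\Q$separator/, $string;

# A positive limit caps the number of fields, as in PHP's explode.
my @limited = split /\Q$separator/, $string, 2;

# Unquoted, every character is a separator: all fields are empty,
# and split drops trailing empty fields, leaving nothing.
my @wrong = split /$separator/, $string;

print scalar( @all ), " ", scalar( @limited ), " ", scalar( @wrong ), "\n";
```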
fprintf
PHP::Strings::fprintf WILL NOT BE IMPLEMENTED.
Perl's printf can be told which filehandle to print to.
printf FILEHANDLE $format, @args;
See "printf" in perlfunc and "print" in perlfunc for details.
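A runnable sketch, using an in-memory filehandle so that nothing touches the disk (the format string and values are invented for the example):

```perl
use strict;
use warnings;

# Open a filehandle that writes into a scalar instead of a file.
my $buffer = '';
open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";

# The braces around the handle are optional here, but make it obvious
# which argument is the filehandle.
printf {$fh} "%-5s|%03d\n", "id", 7;
close $fh;

print $buffer;
```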
get_html_translation_table
http://www.php.net/get_html_translation_table
PHP::Strings::get_html_translation_table WILL NOT BE IMPLEMENTED.
Use the HTML::Entities module to escape and unescape characters.
hebrev
PHP::Strings::hebrev WILL NOT BE IMPLEMENTED.
Use the Encode module to convert between character encodings.
hebrevc
PHP::Strings::hebrevc WILL NOT BE IMPLEMENTED.
Use the Encode module to convert between character encodings.
html_entity_decode
http://www.php.net/html_entity_decode
PHP::Strings::html_entity_decode WILL NOT BE IMPLEMENTED.
Use the HTML::Entities module to decode character entities.
htmlentities
http://www.php.net/htmlentities
PHP::Strings::htmlentities WILL NOT BE IMPLEMENTED.
Use the HTML::Entities module to encode character entities.
htmlspecialchars
http://www.php.net/htmlspecialchars
PHP::Strings::htmlspecialchars WILL NOT BE IMPLEMENTED.
Use the HTML::Entities module to encode character entities.
implode
PHP::Strings::implode WILL NOT BE IMPLEMENTED.
See "join" in perlfunc. Note that join cannot accept its arguments in either order because that's just not how Perl arrays and lists work. Note also that the joining sequence is not optional.
join
PHP::Strings::join WILL NOT BE IMPLEMENTED.
PHP's join is an alias for implode. See "implode".
levenshtein
http://www.php.net/levenshtein
PHP::Strings::levenshtein WILL NOT BE IMPLEMENTED.
I have no idea why PHP has this function.
See Text::Levenshtein, Text::LevenshteinXS, String::Approx, Text::PhraseDistance and probably any number of other modules on CPAN.
ltrim
PHP::Strings::ltrim WILL NOT BE IMPLEMENTED.
As per perlfaq:
$string =~ s/^\s+//;
A basic glance through perlretut or perlreref should give you an idea on how to change what characters get trimmed.
md5
PHP::Strings::md5 WILL NOT BE IMPLEMENTED.
See Digest::MD5 which provides a number of functions for computing MD5 hashes from various sources and to various formats.
Note: the user notes for this function at http://www.php.net/md5 are among the most unintentionally funny and misinformed I've read.
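The functional interface of Digest::MD5 covers the common case of hashing a string already in memory:

```perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );

# md5_hex returns the digest as 32 lowercase hex digits, which is the
# format PHP's md5() returns by default.
my $digest = md5_hex( "hello" );

print "$digest\n";    # 5d41402abc4b2a76b9719d911017c592
```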
md5_file
PHP::Strings::md5_file WILL NOT BE IMPLEMENTED.
The Digest::MD5 module provides sufficient support.
use Digest::MD5;
sub md5_file
{
    my $filename = shift;
    my $ctx = Digest::MD5->new;
    open my $fh, '<', $filename or die $!;
    binmode( $fh );
    $ctx->addfile( $fh )->digest; # or hexdigest, or b64digest
}
Despite providing that possible implementation just above, I've chosen to not include it as an export due to the amount of flexibility of Digest::MD5 and the number of ways you may want to get your file handle. After all, you may want to use Digest::SHA1, or Digest::MD4 or some other digest mechanism.
Again, I wonder why PHP has the function as they so arbitrarily hobble it.
metaphone
PHP::Strings::metaphone WILL NOT BE IMPLEMENTED.
Text::Metaphone and Text::DoubleMetaphone and Text::TransMetaphone all provide metaphonic calculations.
money_format
http://www.php.net/money_format
sprintf for money.
nl2br
PHP::Strings::nl2br WILL NOT BE IMPLEMENTED.
This is trivially implemented as:
s,(\r?\n),<br />$1,g;
nl_langinfo
http://www.php.net/nl_langinfo
PHP::Strings::nl_langinfo WILL NOT BE IMPLEMENTED.
I18N::Langinfo has a langinfo command that corresponds to PHP's nl_langinfo function.
number_format
http://www.php.net/number_format
TBD
ord
PHP::Strings::ord WILL NOT BE IMPLEMENTED.
See "ord" in perlfunc. Note that Perl returns the character's Unicode code point, not an ASCII value.
parse_str
PHP::Strings::parse_str WILL NOT BE IMPLEMENTED.
See instead the CGI and URI modules, which handle that sort of thing.
print
PHP::Strings::print WILL NOT BE IMPLEMENTED.
See "print" in perlfunc.
printf
PHP::Strings::printf WILL NOT BE IMPLEMENTED.
See "printf" in perlfunc.
quoted_printable_decode
http://www.php.net/quoted_printable_decode
PHP::Strings::quoted_printable_decode WILL NOT BE IMPLEMENTED.
MIME::QuotedPrint provides functions for encoding and decoding quoted-printable strings.
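For example (the sample strings are invented; encode_qp is shown only to demonstrate the round trip):

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw( decode_qp encode_qp );

# =E9 is byte 0xE9 and =3D is a literal "=".
my $decoded = decode_qp( "caf=E9 =3D coffee" );

# Encoding and decoding again gives back the original bytes.
my $original   = "1 + 1 = 2\n";
my $round_trip = decode_qp( encode_qp( $original ) );

print length( $decoded ), "\n";
```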
quotemeta
PHP::Strings::quotemeta WILL NOT BE IMPLEMENTED.
rtrim
PHP::Strings::rtrim WILL NOT BE IMPLEMENTED.
Another trivial regular expression:
$string =~ s/\s+$//;
See the notes on "ltrim".
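Both ends can be trimmed in one substitution, and swapping \s for another character class changes what gets trimmed. A small sketch with made-up strings:

```perl
use strict;
use warnings;

my $string = "  \t hello world \n";

# Trim whitespace from both ends (PHP's trim).
( my $trimmed = $string ) =~ s/^\s+|\s+$//g;

# Trim a different set of characters: here, slashes.
my $path = "///usr/local///";
( my $stripped = $path ) =~ s{^/+|/+$}{}g;

print "[$trimmed] [$stripped]\n";    # [hello world] [usr/local]
```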
setlocale
PHP::Strings::setlocale WILL NOT BE IMPLEMENTED.
setlocale is provided by the POSIX module.
sha1
PHP::Strings::sha1 WILL NOT BE IMPLEMENTED.
See "md5", mentally substituting Digest::SHA1 for Digest::MD5, although the user notes are not as funny.
sha1_file
PHP::Strings::sha1_file WILL NOT BE IMPLEMENTED.
See "md5_file".
similar_text
http://www.php.net/similar_text
TBD
soundex
PHP::Strings::soundex WILL NOT BE IMPLEMENTED.
See Text::Soundex, which also happens to be a core module.
sprintf
PHP::Strings::sprintf WILL NOT BE IMPLEMENTED.
sscanf
PHP::Strings::sscanf WILL NOT BE IMPLEMENTED.
This is a godawful function. You should be using regular expressions instead. See perlretut and perlre.
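As an illustration, where PHP code might reach for a format such as "age: %d name: %s", a regex with capture groups does the same job and says exactly what each field may contain. The input line is invented for the example.

```perl
use strict;
use warnings;

my $line = "age: 42 name: fred";

# In list context a successful match returns the captured groups.
my ( $age, $name ) = $line =~ /^age:\s*(\d+)\s+name:\s*(\w+)$/;

# On failure the list is empty and both variables stay undefined.
die "line did not match\n" unless defined $name;

print "$name is $age\n";    # fred is 42
```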
str_ireplace
http://www.php.net/str_ireplace
PHP::Strings::str_ireplace WILL NOT BE IMPLEMENTED.
Use the s/// operator instead. See perlop and perlre for details.
str_pad
TBD
str_repeat
PHP::Strings::str_repeat WILL NOT BE IMPLEMENTED.
Instead, use the x operator. See perlop for details.
my $by_ten = "-=" x 10;
str_replace
http://www.php.net/str_replace
PHP::Strings::str_replace WILL NOT BE IMPLEMENTED.
See the s/// operator. perlop and perlre have details.
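One thing to watch when replacing literal text: quote the search string with \Q...\E so that regex metacharacters in it are not interpreted. A brief sketch with made-up strings; adding /i gives the case-insensitive (str_ireplace) behaviour.

```perl
use strict;
use warnings;

my $text   = "1+1=2, and 1+1 is not 11";
my $search = "1+1";

# Without \Q...\E the "+" would be a quantifier and "11" would match.
( my $replaced = $text ) =~ s/\Q$search\E/two/g;

# Case-insensitive replacement.
( my $folded = "Perl, PERL, perl" ) =~ s/\Qperl\E/Perl/gi;

print "$replaced\n";    # two=2, and two is not 11
```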
str_rot13
PHP::Strings::str_rot13 WILL NOT BE IMPLEMENTED.
This is rather trivially implemented as:
$message =~ tr/A-Za-z/N-ZA-Mn-za-m/;
(As per "Programming Perl", 3rd edition, section 5.2.4.)
str_shuffle
http://www.php.net/str_shuffle
Implemented, against my better judgement. It's trivial, like so many of the others.
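How trivial? One way to do it in plain Perl (not necessarily how this module does it) is with shuffle from the core List::Util module:

```perl
use strict;
use warnings;
use List::Util qw( shuffle );

my $string = "abcdefgh";

# Split into characters, shuffle the list, and glue it back together.
my $shuffled = join '', shuffle( split //, $string );

print "$shuffled\n";
```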
str_split
PHP::Strings::str_split WILL NOT BE IMPLEMENTED.
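A global match in list context returns the chunks directly. (A capturing split would also return the empty fields between chunks; see "split" in perlfunc for details.)
my @bits = $string =~ /(.{1,$len})/gs;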
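A function-shaped sketch that returns fixed-length pieces without stray empty fields uses a global match in list context. str_split_sketch is a hypothetical name, with the chunk length defaulting to 1 as in PHP.

```perl
use strict;
use warnings;

# Hypothetical helper: break $string into pieces of at most $len
# characters; the last piece may be shorter.
sub str_split_sketch {
    my ( $string, $len ) = @_;
    $len = 1 unless defined $len;
    die "length must be a positive integer\n" unless $len =~ /^[1-9]\d*$/;

    # With /g in list context, the match returns every captured chunk.
    return $string =~ /(.{1,$len})/gs;
}

my @bits = str_split_sketch( "abcdefg", 3 );

print "@bits\n";    # abc def g
```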
str_word_count
http://www.php.net/str_word_count
TBD
strcasecmp
PHP::Strings::strcasecmp WILL NOT BE IMPLEMENTED.
Equivalent to:
lc($a) cmp lc($b)
strchr
PHP::Strings::strchr WILL NOT BE IMPLEMENTED.
See "strstr".
strcmp
PHP::Strings::strcmp WILL NOT BE IMPLEMENTED.
Equivalent to:
$a cmp $b
strcoll
PHP::Strings::strcoll WILL NOT BE IMPLEMENTED.
Equivalent to:
use locale;
$a cmp $b
strcspn
PHP::Strings::strcspn WILL NOT BE IMPLEMENTED.
Trivially equivalent to the following, where $-[0] is the offset at which the first unwanted character matched:
my $cspn = $string =~ m/[chars]/ ? $-[0] : length $string;
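As a function, the idea can be sketched like this. strcspn_sketch is a hypothetical name; it returns the length of the leading run of $string that contains none of the characters in $chars, which is the whole length when none of them occurs.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring strcspn's return value.
sub strcspn_sketch {
    my ( $string, $chars ) = @_;
    return length $string if $chars eq '';

    # \Q...\E keeps characters such as "]" or "^" literal inside the
    # class. $-[0] is the offset where the match started.
    return $string =~ /[\Q$chars\E]/ ? $-[0] : length $string;
}

print strcspn_sketch( "hello world", "ow" ), "\n";    # 4
```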
strip_tags
You really want HTML::Scrubber.
This function tries to return a string with all HTML tags stripped from a given string. It errs on the side of caution in case of incomplete or bogus tags.
You can use the optional second parameter to specify tags which should not be stripped.
For more control, use HTML::Scrubber.
stripcslashes
http://www.php.net/stripcslashes
Returns a string with backslashes stripped off. Recognizes C-like \n, \r, ..., octal and hexadecimal representation.
stripos
PHP::Strings::stripos WILL NOT BE IMPLEMENTED.
Trivially implemented as:
my $pos = index( lc $haystack, lc $needle );
my $second = index( lc $haystack, lc $needle, $pos + 1 );
Note that unlike stripos, index returns -1 if $needle is not found. This makes testing much simpler.
If you want the additional behaviour of non-strings being converted to integers and from there to characters of that value, then you're silly. If you want to find a character of particular value, explicitly use the chr function:
my $charpos = index( lc $haystack, lc chr $char );
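The -1 return value makes a "find every occurrence" loop straightforward. A sketch with made-up strings, restarting each search one position past the previous hit:

```perl
use strict;
use warnings;

my $haystack = "Banana bandana";
my $needle   = "AN";

# Fold case once, up front, rather than on every call to index.
my ( $lc_haystack, $lc_needle ) = ( lc $haystack, lc $needle );

my @positions;
my $pos = -1;
while ( ( $pos = index( $lc_haystack, $lc_needle, $pos + 1 ) ) != -1 ) {
    push @positions, $pos;
}

print "@positions\n";    # 1 3 8 11
```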
stripslashes
http://www.php.net/stripslashes
PHP::Strings::stripslashes WILL NOT BE IMPLEMENTED.
If you can think of a good reason for this function, you have more imagination than I do.
stristr
PHP::Strings::stristr WILL NOT BE IMPLEMENTED.
Use substr() and index() instead.
my $strstr = substr( $haystack, index( lc $haystack, lc $needle ) );
Or a regex:
my ( $strstr ) = $haystack =~ /(\Q$needle\E.*$)/si;
strlen
PHP::Strings::strlen WILL NOT BE IMPLEMENTED.
See "length" in perlfunc.
strnatcasecmp
http://www.php.net/strnatcasecmp
PHP::Strings::strnatcasecmp WILL NOT BE IMPLEMENTED.
See Sort::Naturally.
strnatcmp
PHP::Strings::strnatcmp WILL NOT BE IMPLEMENTED.
See Sort::Naturally.
strncasecmp
http://www.php.net/strncasecmp
PHP::Strings::strncasecmp WILL NOT BE IMPLEMENTED.
Unnecessary. Perl is smart enough. Use substr.
strncmp
PHP::Strings::strncmp WILL NOT BE IMPLEMENTED.
Unnecessary. Perl is smart enough. Use substr.
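Spelled out, with made-up strings, for both strncmp and its case-insensitive sibling strncasecmp:

```perl
use strict;
use warnings;

my ( $left, $right, $n ) = ( "Foobar", "foobaz", 3 );

# strncmp: compare only the first $n characters.
my $cmp = substr( $left, 0, $n ) cmp substr( $right, 0, $n );

# strncasecmp: the same, after folding case.
my $icmp = lc( substr( $left, 0, $n ) ) cmp lc( substr( $right, 0, $n ) );

print "$cmp $icmp\n";    # -1 0
```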
strpos
PHP::Strings::strpos WILL NOT BE IMPLEMENTED.
This function is Perl's index function; however, index has a sensible return value.
strrchr
PHP::Strings::strrchr WILL NOT BE IMPLEMENTED.
See "rindex" in perlfunc. Note that all characters in the $needle are used: if you just want to find the first character, then extract it.
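Getting the tail of the string from the last occurrence onwards, as strrchr does, takes one extra substr. A sketch with a made-up path:

```perl
use strict;
use warnings;

my $path = "/usr/local/bin";

# rindex returns the offset of the last occurrence, or -1 if absent.
my $pos  = rindex( $path, "/" );
my $tail = $pos < 0 ? undef : substr( $path, $pos );

print "$tail\n";    # /bin
```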
strrev
PHP::Strings::strrev WILL NOT BE IMPLEMENTED.
See "reverse" in perlfunc. Note the note about scalar context.
my $derf = reverse "fred";
print scalar reverse "fred";
strripos
PHP::Strings::strripos WILL NOT BE IMPLEMENTED.
This is just getting silly.
strrpos
PHP::Strings::strrpos WILL NOT BE IMPLEMENTED.
See "rindex" in perlfunc.
strstr
PHP::Strings::strstr WILL NOT BE IMPLEMENTED.
Use substr() and index() instead.
my $strstr = substr( $haystack, index( $haystack, $needle ) );
Or a regex:
my ( $strstr ) = $haystack =~ /(\Q$needle\E.*$)/s;
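One caveat with the substr/index form: when $needle is absent, index returns -1 and substr then returns the last character of $haystack. A guarded sketch (strstr_sketch is a hypothetical name) that returns undef instead:

```perl
use strict;
use warnings;

# Hypothetical helper: the part of $haystack from the first $needle
# onwards, or undef when $needle does not occur.
sub strstr_sketch {
    my ( $haystack, $needle ) = @_;
    my $pos = index( $haystack, $needle );
    return $pos < 0 ? undef : substr( $haystack, $pos );
}

my $domain = strstr_sketch( 'user@example.com', '@' );

print "$domain\n";    # @example.com
```

The same guard applies to the case-insensitive variant: run index on lc copies of both strings, but take the substr from the original $haystack.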
FUNCTIONS ACTUALLY IMPLEMENTED
Just in case you missed which functions were actually implemented in that huge mass of unimplemented functions, here's the condensed list of implemented functions:
BAD EGGS
All functions that I think are worthless are still exportable, with the exception of any that would clash with a Perl builtin function.
If you try to actually use said function, a big fat error will result.
FOR THOSE WHO HAVE READ THIS FAR
Yes, this module is mostly a joke. I wrote a lot of it after being asked for the hundredth time: What's the equivalent to PHP's X in Perl?
That said, although it's a joke, I'm happy to receive amendments, additions and such. It's incomplete at present, and I would like to see it complete at some point.
In particular, the test suite needs a lot of work. (If you feel like it. Hint Hint.)
If you want to implement some of the functions that I've said will not be implemented, then I'll be happy to include them. After all, what I think is worthless is my opinion.
BUGS, REQUESTS, COMMENTS
Log them via the CPAN RT system via the web or email:
http://rt.cpan.org/NoAuth/ReportBug.html?Queue=PHP-Strings
( shorter URL: http://xrl.us/4at )
bug-php-strings@rt.cpan.org
This makes it much easier for me to track things and thus means your problem is less likely to be neglected.
THANKS
Juerd Waalboer (JUERD) for suggesting a link, and the assorted regex functions.
Matthew Persico (PERSICOM) for the idea of having the functions give their documentation as their error.
LICENCE AND COPYRIGHT
PHP::Strings is copyright © Iain Truskett, 2003. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.000 or, at your option, any later version of Perl 5 you may have available.
The full text of the licences can be found in the Artistic and COPYING files included with this module, or in perlartistic and perlgpl as supplied with Perl 5.8.1 and later.
AUTHOR
Iain Truskett <spoon@cpan.org>
SEE ALSO