Name

SPVM::Regex - Regular Expressions

Description

The Regex class in SPVM provides methods for matching, replacing, and splitting strings with regular expressions.

Google RE2 is used as the regular expression library.

Usage

use Regex;

# Pattern match
{
  my $re = Regex->new("ab*c");
  my $string = "zabcz";
  my $match = $re->match($string);
}

# Pattern match - UTF-8
{
  my $re = Regex->new("あ+");
  my $string = "いあああい";
  my $match = $re->match($string);
}

# Pattern match - Character class and its negation
{
  my $re = Regex->new("[A-Z]+[^A-Z]+");
  my $string = "ABCzab";
  my $match = $re->match($string);
}

# Pattern match with captures
{
  my $re = Regex->new("^(\w+) (\w+) (\w+)$");
  my $string = "abc1 abc2 abc3";
  my $match = $re->match($string);
  
  if ($match) {
    my $cap1 = $match->cap1;
    my $cap2 = $match->cap2;
    my $cap3 = $match->cap3;
  }
}

# Replace
{
  my $re = Regex->new("abc");
  my $string = "ppzabcz";
  
  # "ppzABCz"
  my $result = $re->replace($string, "ABC");
}

# Replace with a callback and capture
{
  my $re = Regex->new("a(bc)");
  my $string = "ppzabcz";
  
  # "ppzABbcCz"
  my $result = $re->replace($string, method : string ($re : Regex, $match : Regex::Match) {
    return "AB" . $match->cap1 . "C";
  });
}

# Replace global
{
  my $re = Regex->new("abc");
  my $string = "ppzabczabcz";
  
  # "ppzABCzABCz"
  my $result = $re->replace_g($string, "ABC");
}

# Replace global with a callback and capture
{
  my $re = Regex->new("a(bc)");
  my $string = "ppzabczabcz";
  
  # "ppzABCbcPQRSzABCbcPQRSz"
  my $result = $re->replace_g($string, method : string ($re : Regex, $match : Regex::Match) {
    return "ABC" . $match->cap1 . "PQRS";
  });
}

# . - single line mode
{
  my $re = Regex->new("(.+)", "s");
  my $string = "abc\ndef";
  
  my $match = $re->match($string);
  
  unless ($match) {
    return 0;
  }
  
  unless ($match->cap1 eq "abc\ndef") {
    return 0;
  }
}

Dependent Resources

Regular Expression Syntax

Google RE2 Syntax

Fields

captures

has captures : ro string[];

The captured strings.

This field is deprecated and will be removed.

match_start

has match_start : ro int;

The start offset of the matched string.

This field is deprecated and will be removed.

match_length

has match_length : ro int;

The length of the matched string.

This field is deprecated and will be removed.

replaced_count

has replaced_count : ro int;

The number of replacements performed.

This field is deprecated and will be removed.

Class Methods

new

static method new : Regex ($pattern : string, $flags : string = undef)

Creates a new Regex object, compiles the regex pattern $pattern with the flags $flags, and returns the created object.

my $re = Regex->new("^ab+c");
my $re = Regex->new("^ab+c", "s");

Instance Methods

match

method match : Regex::Match ($string : string, $offset : int = 0, $length : int = -1);

Equivalent to the following call to the match_forward method.

my $ret = $self->match_forward($string, \$offset, $length);
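
As a minimal sketch of the optional arguments, the following passes a start offset to match. The pattern, the input string, and the chosen offset are illustrative assumptions, not taken from the module itself.

```spvm
use Regex;

{
  my $re = Regex->new("abc");
  my $string = "abc_abc";
  
  # Start matching at offset 4 instead of the default 0,
  # so only the second "abc" lies inside the searched range.
  my $match = $re->match($string, 4);
}
```

Because $offset is passed by value here, the caller's variable is not updated; use match_forward when the updated position is needed.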

match_forward

method match_forward : Regex::Match ($string : string, $offset_ref : int*, $length : int = -1);

Performs pattern matching on the substring of the string $string that starts at the offset $$offset_ref and has the length $length.

The $$offset_ref is updated to the next position.

If the pattern matching is successful, returns a Regex::Match object. Otherwise returns undef.

Exceptions:

The $string must be defined. Otherwise an exception is thrown.

The sum of $$offset_ref and $length must be less than or equal to the length of $string. Otherwise an exception is thrown.

If the regex is not compiled, an exception is thrown.
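
Because $$offset_ref is updated by each call, match_forward can be called repeatedly to visit successive matches. The following is a minimal sketch under that reading; the pattern, the input string, and the $count variable are illustrative assumptions.

```spvm
use Regex;

{
  my $re = Regex->new("[0-9]+");
  my $string = "a1b22c333";
  
  my $offset = 0;
  my $count = 0;
  
  while (1) {
    # Pass a reference so that match_forward can advance $offset.
    my $match = $re->match_forward($string, \$offset);
    
    # undef means that no further match was found.
    unless ($match) {
      last;
    }
    
    # Count each successful match.
    $count++;
  }
}
```

The pattern in this sketch cannot match an empty string, so each successful call consumes at least one character before the next call.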

replace

method replace : string ($string : string, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef)

Equivalent to the following call to the replace_common method.

my $ret = $self->replace_common($string, $replace, \$offset, $length, $options);

replace_g

method replace_g : string ($string : string, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef)

Equivalent to calling the replace_common method with the global option enabled, as follows.

unless ($options) {
  $options = {};
}
$options = Fn->merge_options({global => 1}, $options);
return $self->replace_common($string, $replace, \$offset, $length, $options);

replace_common

method replace_common : string ($string : string, $replace : object of string|Regex::Replacer,
  $offset_ref : int*, $length : int = -1, $options : object[] = undef);

Replaces the part that matches the regular expression within the substring of the string $string that starts at the offset $$offset_ref and has the length $length, using the replacement string or callback $replace and the options $options.

If the $replace is a Regex::Replacer object, the return value of the callback is used for the replacement.

Options:

  • global

    This option must be an Int object. Otherwise an exception is thrown.

    If the value of the Int object is a true value, the global replacement is performed.

  • info

    This option must be an array of Regex::ReplaceInfo objects. Otherwise an exception is thrown.

    If this option is specified, the first element of the array is set to a Regex::ReplaceInfo object that describes the replacement result.

Exceptions:

The $string must be defined. Otherwise an exception is thrown.

The $replace must be a string or a Regex::Replacer object. Otherwise an exception is thrown.

$$offset_ref must be greater than or equal to 0. Otherwise an exception is thrown.

The sum of $$offset_ref and $length must be less than or equal to the length of $string. Otherwise an exception is thrown.

Exceptions of the match_forward method can be thrown.
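
The info option can be sketched as follows, assuming that a one-element Regex::ReplaceInfo array is passed in and read back after the call. The pattern and input string are reused from the Usage section; the variable names are illustrative.

```spvm
use Regex;
use Regex::ReplaceInfo;

{
  my $re = Regex->new("abc");
  my $string = "ppzabczabcz";
  
  # A one-element array that receives the Regex::ReplaceInfo object.
  my $info_ref = new Regex::ReplaceInfo[1];
  
  # replace_g forwards the options to replace_common and enables global.
  my $result = $re->replace_g($string, "ABC", 0, -1, {info => $info_ref});
  
  # The first element now holds the information about the replacement.
  my $info = $info_ref->[0];
}
```

The fields available on the returned object are described in the Regex::ReplaceInfo documentation rather than here.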

split

method split : string[] ($string : string, $limit : int = 0);

The same as the split method in the Fn class, but the regular expression is used as the separator.
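
A minimal sketch of split, assuming a separator made of a comma with optional surrounding spaces. The pattern and input string are illustrative only.

```spvm
use Regex;

{
  # Separator: a comma with optional spaces on either side.
  my $re = Regex->new(" *, *");
  my $string = "a, b ,c";
  
  # Split $string on every match of the separator pattern.
  # The result is a string array, as declared in the method signature.
  my $parts = $re->split($string);
}
```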

buffer_match

method buffer_match : Regex::Match ($string_buffer : StringBuffer, $offset : int = 0, $length : int = -1);

The same as "match", but the first argument is a StringBuffer object, and the following exceptions are thrown.

Exceptions:

The sum of $offset and $length must be less than or equal to the length of $string_buffer. Otherwise an exception is thrown.

buffer_match_forward

method buffer_match_forward : Regex::Match ($string_buffer : StringBuffer, $offset_ref : int*, $length : int = -1);

The same as "match_forward", but the first argument is a StringBuffer object, and the following exceptions are thrown.

Exceptions:

The sum of $$offset_ref and $length must be less than or equal to the length of $string_buffer. Otherwise an exception is thrown.

buffer_replace

method buffer_replace : void ($string_buffer : StringBuffer, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef);

The same as "replace", but the first argument is a StringBuffer object, and the return type is void.

The replacement is performed on the string buffer.
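
A minimal sketch of an in-place replacement on a StringBuffer follows. The new, push, and to_string methods of SPVM's StringBuffer class are assumed here, and the pattern and input are reused from the Usage section.

```spvm
use Regex;
use StringBuffer;

{
  my $re = Regex->new("abc");
  
  # Build the buffer that will be modified.
  my $buffer = StringBuffer->new;
  $buffer->push("ppzabcz");
  
  # The buffer is modified in place; nothing is returned.
  $re->buffer_replace($buffer, "ABC");
  
  # Convert the buffer back to a string, for example for output.
  my $result = $buffer->to_string;
}
```

By analogy with the replace example in the Usage section, the buffer would be expected to hold "ppzABCz" after the call.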

buffer_replace_g

method buffer_replace_g : void ($string_buffer : StringBuffer, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef);

The same as "replace_g", but the first argument is a StringBuffer object, and the return type is void.

The replacement is performed on the string buffer.

buffer_replace_common

method buffer_replace_common : void ($string_buffer : StringBuffer, $replace : object of string|Regex::Replacer, $offset_ref : int*, $length : int = -1, $options : object[] = undef);

The same as "replace_common", but the first argument is a StringBuffer object, and the return type is void.

The replacement is performed on the string buffer.

cap1

method cap1 : string ();

The alias for $re->captures->[1].

This method is deprecated and will be removed.

cap2

method cap2 : string ();

The alias for $re->captures->[2].

This method is deprecated and will be removed.

cap3

method cap3 : string ();

The alias for $re->captures->[3].

This method is deprecated and will be removed.

cap4

method cap4 : string ();

The alias for $re->captures->[4].

This method is deprecated and will be removed.

cap5

method cap5 : string ();

The alias for $re->captures->[5].

This method is deprecated and will be removed.

cap6

method cap6 : string ();

The alias for $re->captures->[6].

This method is deprecated and will be removed.

cap7

method cap7 : string ();

The alias for $re->captures->[7].

This method is deprecated and will be removed.

cap8

method cap8 : string ();

The alias for $re->captures->[8].

This method is deprecated and will be removed.

cap9

method cap9 : string ();

The alias for $re->captures->[9].

This method is deprecated and will be removed.

cap10

method cap10 : string ();

The alias for $re->captures->[10].

This method is deprecated and will be removed.

cap11

method cap11 : string ();

The alias for $re->captures->[11].

This method is deprecated and will be removed.

cap12

method cap12 : string ();

The alias for $re->captures->[12].

This method is deprecated and will be removed.

cap13

method cap13 : string ();

The alias for $re->captures->[13].

This method is deprecated and will be removed.

cap14

method cap14 : string ();

The alias for $re->captures->[14].

This method is deprecated and will be removed.

cap15

method cap15 : string ();

The alias for $re->captures->[15].

This method is deprecated and will be removed.

cap16

method cap16 : string ();

The alias for $re->captures->[16].

This method is deprecated and will be removed.

cap17

method cap17 : string ();

The alias for $re->captures->[17].

This method is deprecated and will be removed.

cap18

method cap18 : string ();

The alias for $re->captures->[18].

This method is deprecated and will be removed.

cap19

method cap19 : string ();

The alias for $re->captures->[19].

This method is deprecated and will be removed.

cap20

method cap20 : string ();

The alias for $re->captures->[20].

This method is deprecated and will be removed.

Repository

SPVM::Regex - GitHub

Author

Yuki Kimoto

Contributors

Copyright & License

Copyright (c) 2023 Yuki Kimoto

MIT License