NAME

MarpaX::ESLIF::RegexCallout - ESLIF Regex Callout

VERSION

version 6.0.11

SYNOPSIS

package MyRecognizerInterface;
use Data::Dumper;

sub new                    { bless { data => $_[1] }, $_[0] }
sub read                   { 1 }
sub isEof                  { 1 }
sub isCharacterStream      { 1 }
sub encoding               { }
sub data                   { $_[0]->{data} }
sub isWithDisableThreshold { 1 }
sub isWithExhaustion       { 0 }
sub isWithNewline          { 1 }
sub isWithTrack            { 0 }

sub do_regexCallout {
    my ($self, $regexCallout) = @_;
    print STDERR "Regex callout: " . Dumper($regexCallout);
    return 0;
}

1;

package MyValueInterface;

sub new                    { bless { result => undef }, $_[0] }
sub isWithHighRankOnly     { 1 }
sub isWithOrderByRank      { 1 }
sub isWithAmbiguous        { 0 }
sub isWithNull             { 0 }
sub maxParses              { 0 }
sub setResult              { $_[0]->{result} = $_[1] }
sub getResult              { $_[0]->{result} }

1;

package main;
use MarpaX::ESLIF;

my $eslif = MarpaX::ESLIF->new();
my $data = do { local $/; <DATA> };
my $eslifGrammar = MarpaX::ESLIF::Grammar->new($eslif, $data);
foreach (qw/123XX XX/) {
    my $recognizerInterface = MyRecognizerInterface->new($_);
    my $valueInterface      = MyValueInterface->new();
    $eslifGrammar->parse($recognizerInterface, $valueInterface);
    print STDERR "Value: " . $valueInterface->getResult() . "\n";
}

__DATA__
#
# This is an example of a grammar using regex callouts
#
:default ::= regex-action  => do_regexCallout

topRule ::= /[\d]+(?C"Digits")(.*)(?C"Rest")/
topRule ::= /X+(?C123)/

DESCRIPTION

ESLIF Regex Callout.

Regular expression callbacks receive one argument, a hash blessed into this package. The callback is a grammar setting, declared with the :default meta rule, e.g.:

:default ::= regex-action  => do_regexCallout

where the callback, do_regexCallout here, must reside in the recognizer interface. Callouts are written using PCRE2 syntax, e.g.:

someRule ::= /[\d]+(?C"CallbackIdentifier as a string")/
someRule ::= /X(?C123)/

The return value of the callout function is interpreted as an integer, whose meaning conforms to the PCRE2 specification at https://www.pcre.org/current/doc/html/pcre2callout.html:

If the value is zero, matching proceeds as normal.
If the value is greater than zero, matching fails at the current point, but the testing of other matching possibilities goes ahead, just as if a lookahead assertion had failed.
If the value is less than zero, the match is abandoned, and the matching function returns the negative value.
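
As a minimal sketch, a callback might map callout identifiers to these three kinds of return value as follows. My::FakeCallout and My::Recognizer are hypothetical stand-ins so that the snippet runs without ESLIF; in a real parse the callout object is supplied by ESLIF and the method lives in the recognizer interface.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the object ESLIF passes to the callback;
# it only mimics the two accessors used below.
package My::FakeCallout;
sub new              { my ($class, %args) = @_; return bless {%args}, $class }
sub getCalloutNumber { $_[0]->{number} }
sub getCalloutString { $_[0]->{string} }

# Hypothetical recognizer interface, reduced to the callback alone.
package My::Recognizer;
sub new { return bless {}, $_[0] }

# Callback named by "regex-action => do_regexCallout" in the grammar.
sub do_regexCallout {
    my ($self, $regexCallout) = @_;
    my $string = $regexCallout->getCalloutString;
    my $number = $regexCallout->getCalloutNumber;

    return 0  if defined($string) && $string eq 'Digits';  # proceed as normal
    return 1  if defined($string) && $string eq 'Rest';    # fail here, try alternatives
    return -1 if defined($number) && $number == 123;       # abandon (PCRE2_ERROR_NOMATCH)
    return 0;                                              # default: do not interfere
}

package main;

my $recognizer = My::Recognizer->new;
my @results = map { $recognizer->do_regexCallout($_) } (
    My::FakeCallout->new(string => 'Digits'),
    My::FakeCallout->new(string => 'Rest'),
    My::FakeCallout->new(number => 123),
);
print "@results\n";
```

The identifiers 'Digits', 'Rest' and 123 are the ones used in the SYNOPSIS grammar; the return values chosen for them here are arbitrary and only illustrate the three cases.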

ESLIF does not allow a negative value lower than the most negative meaningful value. The exhaustive list of meaningful values, as of PCRE2 version 10.33, is:

MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_NOMATCH (-1)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_PARTIAL (-2)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR1 (-3)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR2 (-4)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR3 (-5)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR4 (-6)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR5 (-7)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR6 (-8)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR7 (-9)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR8 (-10)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR9 (-11)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR10 (-12)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR11 (-13)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR12 (-14)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR13 (-15)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR14 (-16)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR15 (-17)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR16 (-18)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR17 (-19)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR18 (-20)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR19 (-21)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR20 (-22)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF8_ERR21 (-23)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF16_ERR1 (-24)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF16_ERR2 (-25)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF16_ERR3 (-26)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF32_ERR1 (-27)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UTF32_ERR2 (-28)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADDATA (-29)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_MIXEDTABLES (-30)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADMAGIC (-31)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADMODE (-32)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADOFFSET (-33)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADOPTION (-34)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADREPLACEMENT (-35)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADUTFOFFSET (-36)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_CALLOUT (-37)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_DFA_BADRESTART (-38)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_DFA_RECURSE (-39)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_DFA_UCOND (-40)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_DFA_UFUNC (-41)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_DFA_UITEM (-42)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_DFA_WSSIZE (-43)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_INTERNAL (-44)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_JIT_BADOPTION (-45)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_JIT_STACKLIMIT (-46)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_MATCHLIMIT (-47)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_NOMEMORY (-48)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_NOSUBSTRING (-49)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_NOUNIQUESUBSTRING (-50)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_NULL (-51)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_RECURSELOOP (-52)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_DEPTHLIMIT (-53)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_RECURSIONLIMIT (-53)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UNAVAILABLE (-54)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_UNSET (-55)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADOFFSETLIMIT (-56)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADREPESCAPE (-57)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_REPMISSINGBRACE (-58)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADSUBSTITUTION (-59)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADSUBSPATTERN (-60)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_TOOMANYREPLACE (-61)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_BADSERIALIZEDDATA (-62)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_HEAPLIMIT (-63)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_CONVERT_SYNTAX (-64)
MarpaX::ESLIF::RegexCallout::PCRE2_ERROR_INTERNAL_DUPMATCH (-65)

Any value lower than PCRE2_ERROR_INTERNAL_DUPMATCH makes ESLIF emit a warning and replace the value with PCRE2_ERROR_CALLOUT.
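
The rule can be pictured with the sketch below. It only mimics the documented behaviour and is not ESLIF's actual code: normalize_callout_result is a hypothetical helper, and the two constants are redefined locally with the values from the list so that the snippet runs without ESLIF.

```perl
use strict;
use warnings;

# Values copied from the list above; redefined locally for the demo.
use constant PCRE2_ERROR_CALLOUT           => -37;
use constant PCRE2_ERROR_INTERNAL_DUPMATCH => -65;

# Illustration only: values below the most negative meaningful code
# trigger a warning and are replaced by PCRE2_ERROR_CALLOUT.
sub normalize_callout_result {
    my ($value) = @_;
    if ($value < PCRE2_ERROR_INTERNAL_DUPMATCH) {
        warn "callout returned $value, using " . PCRE2_ERROR_CALLOUT . " instead\n";
        return PCRE2_ERROR_CALLOUT;
    }
    return $value;
}

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
my @normalized = do {
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    map { normalize_callout_result($_) } (0, 5, -1, -65, -66, -1000);
};
print "@normalized\n";
print scalar(@warnings), " warning(s)\n";
```

Only -66 and -1000 fall below the threshold here, so only those two are rewritten; -65 itself is still a meaningful value and passes through unchanged.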

METHODS

$self->getCalloutNumber

Returns callout number or undef

$self->getCalloutString

Returns callout string or undef

$self->getSubject

Returns current subject. Always undef unless ESLIF is compiled in trace mode.

$self->getPattern

Returns pattern. Always undef unless ESLIF is compiled in trace mode.

$self->getCaptureTop

Returns the capture_top value reported by PCRE2, which tracks the highest numbered capture so far

$self->getCaptureLast

Returns the most recently closed capture

$self->getOffsetVector

Returns a reference to an array containing offsets

$self->getMark

Returns the most recently passed NAME of a (*MARK:NAME), (*PRUNE:NAME) or (*THEN:NAME) item in the match, undef if none.

$self->getStartMatch

Returns the offset in the subject at which the current match attempt started

$self->getCurrentPosition

Returns the current subject offset

$self->getNextItem

Returns the next item in the pattern

$self->getGrammarLevel

Returns the current grammar level

$self->getSymbolId

Returns the current symbol id
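
As a sketch of how these accessors might be used together, the hypothetical helper below renders one line per accessor. Several methods can return undef (getSubject and getPattern always do unless ESLIF is compiled in trace mode), so every value is checked before being printed. My::FakeCallout is a stand-in with made-up values, used only so that the snippet runs without ESLIF.

```perl
use strict;
use warnings;

# The accessor names documented above.
my @ACCESSORS = qw(
    getCalloutNumber getCalloutString getSubject getPattern
    getCaptureTop getCaptureLast getOffsetVector getMark
    getStartMatch getCurrentPosition getNextItem
    getGrammarLevel getSymbolId
);

# Hypothetical helper: one "name = value" line per accessor, with
# "undef" for missing values and a bracketed list for array references.
sub describe_callout {
    my ($regexCallout) = @_;
    my @lines;
    for my $accessor (@ACCESSORS) {
        my $value = $regexCallout->$accessor;
        my $text  = !defined($value)        ? 'undef'
                  : ref($value) eq 'ARRAY'  ? '[' . join(', ', @$value) . ']'
                  :                           $value;
        push @lines, sprintf '%-18s = %s', $accessor, $text;
    }
    return join("\n", @lines) . "\n";
}

# Stand-in object for the demo: any accessor returns the hash entry of
# the same name, or undef when there is none. The values are made up.
package My::FakeCallout {
    our $AUTOLOAD;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub AUTOLOAD {
        my ($self) = @_;
        (my $name = $AUTOLOAD) =~ s/.*:://;
        return if $name eq 'DESTROY';
        return $self->{$name};
    }
}

my $report = describe_callout(My::FakeCallout->new(
    getCalloutString   => 'Digits',
    getOffsetVector    => [0, 3],
    getCurrentPosition => 3,
));
print $report;
```

In a real callback the same helper could be called as print STDERR describe_callout($regexCallout), in place of the Data::Dumper call shown in the SYNOPSIS.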

SEE ALSO

PCRE2 Callout Specification: https://www.pcre.org/current/doc/html/pcre2callout.html

AUTHOR

Jean-Damien Durand <jeandamiendurand@free.fr>

COPYRIGHT AND LICENSE

This software is copyright (c) 2017 by Jean-Damien Durand.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.