NAME
Finance::MICR::LineParser - validate and parse a check MICR string
SYNOPSIS
use Finance::MICR::LineParser;
my $micr = Finance::MICR::LineParser->new({ string => $string });
print "Is this a MICR code? ". $micr->valid;
Imagine you scanned a check with a standard scanner and then ran some OCR software to extract the text from it. The result could have myriad problems (garble and so on), but it's what we have to work with. So let's create a small CLI script that takes a potentially garbled string and tells us whether a MICR code is present, and something about it.
micrline.pl:
#!/usr/bin/perl
use strict;
use warnings;
use Finance::MICR::LineParser;
# The candidate MICR string is the first command-line argument.
my $string = $ARGV[0];
defined $string or die "missing arg\n";
my $micr = Finance::MICR::LineParser->new({ string => $string });
if ($micr->valid) {
    # All expected fields were found, in the right order.
    print "A valid MICR line is present: " . $micr->micr . "\n";
    print "The type of check is: " . $micr->get_check_type . "\n";
    print "The routing number is: " . $micr->routing_number . "\n";
    print "The check number is: " . $micr->check_number . "\n";
    print "Status: " . $micr->status . "\n";
}
elsif ($micr->is_unknown_check) {
    # Some fields matched, but not a complete line.
    print "I don't see a full valid MICR line here, but this is what I can match up "
        . "if this is a business check: " . $micr->micr . "\n";
    print "Status: " . $micr->status . "\n";
}
else {
    print "This is garble to me.\n";
    print "Status: " . $micr->status . "\n";
}
Now in your terminal:
# perl ./micrline.pl U2323424U_T234244T_2342424U
DESCRIPTION
Parse a MICR line into its parts. The module can also tell you whether a garbled string contains a MICR code. If you have a string and want to parse it as a check's MICR line, this is useful.
I am presently using this module to let the office scan in documents. Using gocr, I get a string out of the scanned check image. Then with this module I parse the MICR line, if one is there. I name the documents for archiving after the MICR code.
The MICR symbols have no plain ASCII equivalents and OCR output rarely preserves them, so various companies substitute alphabetic counterparts for the symbols. This module accepts symbols that are more than one character long. This is because gocr can't group something like '||"' into one character. You may have trained your OCR software to replace those with something like Tt (transit, which looks like |:) and UUu (on us, which looks like ||"). This module can be told at instantiation that the symbols are something other than the defaults. For example, I trained my gocr to change ||" to CCc and |: to Aa, so I create an object instance like so:
my $micr = new Finance::MICR::LineParser({
string => $string_from_gocr,
on_us_symbol => 'CCc',
transit_symbol => 'Aa',
dash_symbol => 'DDd',
ammount_symbol => 'XxX',
});
By default, these are normalized to:
Transit Symbol: T
Amount Symbol: X
On-Us Symbol: U
Dash Symbol: D
That is, when you query methods such as $micr->on_us, the on-us symbol in the returned value is U and not CCc.
As of this time, if you want to change the symbols back to something else, it's up to you to handle the output.
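Handling the output yourself could look like the following minimal sketch. The restore_symbols() helper is hypothetical and not part of this module; it assumes the returned string uses only the default letters T, U, D and X for symbols.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Finance::MICR::LineParser.
# Maps the single-letter default symbols back to caller-chosen symbols.
sub restore_symbols {
    my ($micr_string, $symbols) = @_;
    # One pass over the string: replaced text is not rescanned, so a
    # custom symbol such as 'XxX' is not expanded a second time.
    (my $out = $micr_string)
        =~ s/([TUDX])/ exists $symbols->{$1} ? $symbols->{$1} : $1 /ge;
    return $out;
}

my %my_symbols = (T => 'Aa', U => 'CCc', D => 'DDd', X => 'XxX');
print restore_symbols('U0000011135U_T052000113T_984U0837166', \%my_symbols), "\n";
# prints CCc0000011135CCc_Aa052000113Aa_984CCc0837166
```

Digits, underscores and any other characters pass through unchanged.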
METHODS
new()
Argument is an anonymous hash. Croaks if no argument is provided. Right now it takes a string and tries to find the MICR parts.
my $m = new Finance::MICR::LineParser ({ string => 'U2323424U_T234244T_2342424U' });
Constructor Arguments:
string: the string that you think *is*, or may *contain*, a MICR string.
valid()
Ask if the MICR code is valid. Returns true or undef. Valid means the string argument was matched as a business check MICR or a personal check MICR. That is, the fields are there and in the *right order*. NOTE that if your code is deemed invalid, you *may* still get field values, but your string as a whole should be considered invalid. You should always call valid() before taking the output as gospel.
status()
Returns a summary string including the original string argument to the constructor, the "clean_run" pass count for the string, the string after those clean runs, whether the module gave up, etc. Useful for logging and for finding out if you have any problems.
Typical usage:
$micr->valid or print STDERR $micr->status;
is_business_check()
Returns true or false. Presently a business check has the fields:
AUXILIARY_ON_US TRANSIT ON_US
In that order. Furthermore, the check number is extracted from AUXILIARY_ON_US.
is_personal_check()
Returns true or false. Presently a personal check has the fields:
TRANSIT ON_US
In that order. Furthermore, the check number is extracted from ON_US: it is the digits after the on us symbol.
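The two layouts above differ only in whether an auxiliary on-us field precedes the transit field. As a rough illustration of that field-order test, here is a minimal sketch. It is not the module's implementation, and guess_check_type() is a hypothetical name. It assumes the string is already free of garble, uses the default symbols T, U and D, and contains exactly one on us symbol in the on-us field.

```perl
use strict;
use warnings;

# Hypothetical sketch, NOT the module's own matching logic.
# Classifies an already-normalized MICR string by field order.
sub guess_check_type {
    my ($s) = @_;
    $s =~ s/[\s_]+//g;                    # ignore spaces and underscores
    my $transit = qr/T\d{9}T/;            # nine digits between transit symbols
    my $on_us   = qr/[\dD]*U[\dD]*/;      # digits/dashes around one on us symbol
    return 'b' if $s =~ /\A U\d+U $transit $on_us \z/x;   # AUXILIARY_ON_US TRANSIT ON_US
    return 'p' if $s =~ /\A $transit $on_us \z/x;         # TRANSIT ON_US
    return 'u';                                           # anything else
}

print guess_check_type('U0000011135U_T052000113T_984U0837166'), "\n";  # prints b
print guess_check_type('T052000113T_984U0837166'), "\n";               # prints p
```

Real OCR output needs the cleaning passes the module performs before any such pattern can match.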
is_unknown_check()
Means that we have matched one or more main fields (aux on us, on us, transit) but some are missing or in an unexpected order. This should be taken *very* seriously. It means any string for which is_unknown_check() returns true *must* be checked for correctness.
get_check_type()
Returns (u)nknown, (b)usiness, (p)ersonal, or undef.
clean_runs()
How many times the string has been "cleaned". This does not tell you that the string is a valid MICR code, just how many times it was cleaned. The higher the number, the more the output should be inspected by a human being.
original_string()
String passed to constructor.
micr()
MICR string without spaces or extraneous garble. If you passed a MICR string *with* garble, this is different from the original_string(). Returns undef if the string is invalid. NOTE: a string which is not valid() will not return a micr() code.
micr_pretty()
Returns the micr() code somewhat formatted for human eyes. That is, if your original string argument to the constructor is
3 12 U0000011135U T052000113T 984U0837166 _ 23 1
Then this returns something like
U0000011135U_T052000113T_984U0837166
NOTE: a string which is not valid() will not return a micr_pretty() code.
giveup()
Returns true or false. Think of this also as 'gave up'. NOTE: A string that ended up not valid() could still return 0 here. This is because by default, Finance::MICR::LineParser attempts to match at least one of the main MICR fields before giving up.
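The methods above can be combined into a simple accept/review/reject decision. The following is a sketch only: triage() is a hypothetical helper, the threshold of 2 clean runs is an arbitrary value chosen for illustration, and $micr is assumed to be any object that provides valid(), is_unknown_check() and clean_runs() as described in this section.

```perl
use strict;
use warnings;

# Hypothetical triage helper, NOT part of Finance::MICR::LineParser.
# $max_runs is an arbitrary illustrative threshold, not a module default.
sub triage {
    my ($micr, $max_runs) = @_;
    $max_runs = 2 unless defined $max_runs;
    if ($micr->valid) {
        # Valid, but heavy cleaning suggests a human should look at it.
        return (($micr->clean_runs || 0) > $max_runs) ? 'review' : 'accept';
    }
    # Partial match: some fields found, order or completeness in doubt.
    return 'review' if $micr->is_unknown_check;
    return 'reject';
}

# Usage, given an object from the constructor:
#   my $decision = triage($micr);
#   print STDERR $micr->status unless $decision eq 'accept';
```

Logging status() for anything not accepted keeps a record of what the parser saw.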
MICR SPECIFIC METHODS
There are five major fields on a MICR line. Two of the five major fields (transit and "on us") are broken into multiple fields, here called "sub fields". First are the five major fields.
auxiliary_on_us()
Contains the check number if present; bracketed by 'on us' symbols. Returns undef if not found.
epc()
One character located to the left of the transit field, if present. Returns undef if not found. This needs work.
transit()
Always 9 digits including the check digit. Opens and closes with a transit symbol. (Some papers refer to this field as having 11 chars because they are counting the open and close symbols as characters.) Returns undef if not found.
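This documentation does not say whether the module verifies the check digit. If you want to verify it yourself, the standard ABA routing number checksum weights the nine digits 3, 7, 1, 3, 7, 1, 3, 7, 1 and requires the weighted sum to be a multiple of 10. A minimal sketch, where aba_checksum_ok() is a hypothetical helper and not a method of this module:

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Finance::MICR::LineParser.
# Accepts the transit field with or without its enclosing symbols.
sub aba_checksum_ok {
    my ($transit) = @_;
    return 0 unless defined $transit;
    (my $digits = $transit) =~ s/\D//g;     # drop symbols, keep digits
    return 0 unless length($digits) == 9;
    my @d = split //, $digits;
    my @w = (3, 7, 1) x 3;                  # weights 3,7,1 repeated three times
    my $sum = 0;
    $sum += $d[$_] * $w[$_] for 0 .. 8;
    return $sum % 10 == 0 ? 1 : 0;
}

print aba_checksum_ok('T052000113T') ? "checksum ok\n" : "checksum failed\n";
# prints "checksum ok" (weighted sum is 50)
```

A failed checksum on OCR output usually means a misread digit, so it is a cheap extra filter before trusting a transit value.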
on_us()
Variable length, 19 digits max; located between the transit and amount fields (to the right of transit). Returns undef if not found.
ammount()
10 digits, zero filled; bracketed by two amount symbols. Returns undef if not found. This needs work.
TRANSIT SUB FIELD METHODS
Transit has 9 digits. It is broken into multiple fields:
routing_number()
Returns the routing number (digits 1-4). Returns undef if not found.
bank_number()
Returns the bank number (digits 5-8). Returns undef if not found.
check_digit()
Returns the check digit (one digit, digit 9). Returns undef if not found.
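The 4/4/1 split described above can be illustrated with a single pattern match. This is a sketch of the layout only, not the module's code; split_transit() and its hash keys are hypothetical names chosen to mirror the method names.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Finance::MICR::LineParser.
# Splits nine transit digits into the three sub fields.
sub split_transit {
    my ($digits) = @_;
    my ($routing, $bank, $check) = $digits =~ /\A(\d{4})(\d{4})(\d)\z/
        or return;                          # not exactly nine digits
    return {
        routing_number => $routing,         # digits 1-4
        bank_number    => $bank,            # digits 5-8
        check_digit    => $check,           # digit 9
    };
}

my $parts = split_transit('052000113');
print join(' ', @{$parts}{qw(routing_number bank_number check_digit)}), "\n";
# prints 0520 0011 3
```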
ON US SUB FIELD METHODS
check_number()
Returns the check number, which can be located in various places in the on us field. Returns undef if not found.
tpc()
Max 6 characters; located to the right of the account number. Returns undef if not found. TODO: This needs some thought. On a personal check this would be the check number, so what gives?
account_number()
Variable length; always followed by the On Us symbol. Returns undef if not found.
BUGS
Please report bugs to the developer (see AUTHOR).
Notice: this module is under development. It is being used for production, but it *is* under development.
Please get in touch with any questions or concerns. I've seen very little on MICR and open source out there. If you have any recommendations, please don't hesitate to let me know how to make this module better.
This module helps me a lot, and I am hoping it may be of use to others and they may contribute criticism, patches, suggestions, etc.
TODO
Not yet implemented:
If you want to get *your* symbols output back, here's an example:
my $micr = new Finance::MICR::LineParser({
string => $string_from_gocr,
on_us_symbol => 'CCc',
transit_symbol => 'Aa',
dash_symbol => 'DDd',
ammount_symbol => 'XxX',
return_my_symbols=>1,
});
BUGS
Address bug reports and comments to AUTHOR
SEE ALSO
http://en.wikipedia.org/wiki/Magnetic_ink_character_recognition
AUTHOR
Leo Charre leocharre at cpan dot org
COPYRIGHT
Copyright (c) 2009 Leo Charre. All rights reserved.
LICENSE
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, i.e., under the terms of the "Artistic License" or the "GNU General Public License".
DISCLAIMER
This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the "GNU General Public License" for more details.