NAME
Language::FormulaEngine - Parser/Interpreter/Compiler for simple spreadsheet formula language
VERSION
version 0.07
SYNOPSIS
my $vars= { foo => 1, bar => 3.14159265358979, baz => 42 };
my $engine= Language::FormulaEngine->new();
$engine->evaluate( 'if(foo, round(bar, 3), baz*100)', $vars );
# or for more speed on repeat evaluations
my $formula= $engine->compile( 'if(foo, round(bar, 3), baz*100)' );
print $formula->($vars);
package MyNamespace {
use Moo;
extends 'Language::FormulaEngine::Namespace::Default';
sub fn_customfunc { print "arguments are ".join(', ', @_)."\n"; }
};
my $engine= Language::FormulaEngine->new(namespace => MyNamespace->new);
my $formula= $engine->compile( 'CustomFunc(baz,2,3)' );
$formula->($vars); # prints "arguments are 42, 2, 3\n"
DESCRIPTION
This set of modules implement a parser, evaluator, and optional code generator for a simple expression language similar to those used in spreadsheets. The intent of this module is to help you add customizable behavior to your applications that an "office power-user" can quickly learn and use, while also not opening up security holes in your application.
In a typical business application, there will always be another few use cases that the customer didn't know about or think to tell you about, and adding support for these use cases can result in a never-ending expansion of options and chekboxes and dropdowns, and a lot of time spent deciding the logical way for them to all interact. One way to solve this is to provide some scripting support for the customer to use. However, you want to make the language easy to learn, "nerfed" enough for them to use safely, and prevent security vulnerabilities. The challenge is finding a language that they find familiar, that is easy to write correct programs with, and that dosn't expose any peice of the system that you didn't intend to expose. I chose "spreadsheet formula language" for a project back in 2012 and it worked out really well, so I decided to give it a makeover and publish it.
The default syntax is pure-functional, in that each operation has exactly one return value, and cannot modify variables; in fact none of the default functions have any side-effects. There is no assignment, looping, or nested data structures. The language does have a bit of a Perl twist to it's semantics, like throwing exceptions rather than returning #VALUE!
, fluidly interpreting values as strings or integers, and using DateTime instead of days-since-1900 numbers for dates, but most users probably won't mind. And, all these decisions are fairly easy to change with a subclass. (but if you want big changes, you should review your options to make sure you're starting with the right module.)
The language is written with security in mind, and (until you start making changes) should be safe for most uses, since the functional design promotes O(1) complexity and shouldn't have side effects on the data structures you expose to the user. The optional "compile" method does use eval
though, so you should do an audit for yourself if you plan to use it where security is a concern.
Features:
Standard design with scanner/parser, syntax tree, namespaces, and compiler.
Can compile to perl coderefs for fast repeated execution
Provides metadata about what it compiled
Designed for extensibility
Light-weight, few dependencies, clean code
Recursive-descent parse, which is easier to work with and gives helpful error messages, though could get a bit slow if you extend the grammar too much. (for simple grammars like this, it's pretty fast)
ATTRIBUTES
parser
A parser for the language. Responsible for tokenizing the input and building the parse tree.
Defaults to an instance of Language::FormulaEngine::Parser. You can initialize this attribute with an object instance, a class name, or arguments for the default parser.
namespace
A namespace for looking up functions or constants. Also determines some aspects of how the language works, and responsible for providing the perl code when compiling expressions.
Defaults to an instance of Language::FormulaEngine::Namespace::Default. You can initialize this with an object instance, class name, version number for the default namespace, or hashref of arguments for the constructor.
compiler
A compiler for the parse tree. Responsible for generating Perl coderefs, though the Namespace does most of the perl code generation.
Defaults to an instance of Language::FormulaEngine::Compiler. You can initialize this attribute with a class instance, a class name, or arguments for the default compiler.
METHODS
parse
my $formula= $fe->parse( $formula_text, \$error );
Return a Language::FormulaEngine::Formula representing the expression. Dies if it can't parse the expression, unless you supply $error
then the error is stores in that scalarref and the methods returns undef
.
evaluate
my $value= $fe->evaluate( $formula_text, \%variables );
This method creates a new namespace from the default plus the supplied variables, parses the formula, then evaluates it in a recursive interpreted manner, returning the result. Exceptions may be thrown during parsing or execution.
compile
my $coderef= $fe->compile( $formula_text );
Parses and then compiles the $formula_text
, returning a coderef. Exceptions may be thrown during parsing or execution.
CUSTOMIZING THE LANGUAGE
The module is called "FormulaEngine" in part because it is designed to be customized. The functions are easy to extend, the variables are somewhat easy to extend, the compilation can be extended after a little study of the API, and the grammar itself can be extended with some effort.
If you are trying to addd lots of functionality, you might be starting with the wrong module. See the notes in the "SEE ALSO" section.
Adding Functions
The easiest thing to extend is the namespace of available functions. Just subclass Language::FormulaEngine::Namespace and add the functions you want starting with the prefix fn_
.
Complex Variables
The default implementation of Namespace requires all variables to be stored in a single hashref. This default is safe and fast. If you want to traverse nested data structures or call methods, you also need to subclass "get_value" in Language::FormulaEngine::Namespace.
Changing Semantics
The namespace is also in control of the behavior of the functions and operators (which are themselves just functions). It controls both the way they are evaluated and the perl code they generate if compiled.
Adding New Operators
If you want to make small changes to the grammar, such as adding new prefix/suffix/infix operators, this can be accomplished fairly easily by subclassing the Parser. The parser just returns trees of functions, and if you look at the pattern used in the recursive descent parse_*
methods it should be easy to add some new ones.
Bigger Grammar Changes
Any customization involving bigger changes to the grammar, like adding assignments or multi- statement blocks or map/reduce, would require a bigger rewrite. Consider starting with a different more powerful parsing system for that.
SEE ALSO
- Language::Expr
-
A bigger more featureful expression language; perl-like syntax and data structures. Also much more complicated and harder to customize. Can also compile to Javascript!
- Math::Expression
-
General-purpose language, including variable assignment and loops, arrays, and with full attention to security. However, grammar is not customizable at all, and math-centric.
- Math::Expression::Evaluator
-
Very similar to this module, but no string support and not very customizable. Supports assignments, and compilation.
- Math::Expr
-
Similar expression parser, but without string support. Supports arbitrary customization of operators. Not suitable for un-trusted strings, according to BUGS documentation.
AUTHOR
Michael Conrad <mconrad@intellitree.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2023 by Michael Conrad, IntelliTree Solutions llc.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.