
Package com.goldencode.expr

Provides a cached, high performance (dynamically compiled) expression processing engine for the generic processing of expressions that can return objects or perform assignment operations.

Author: Eric Faulhaber
Modification Date: July 4, 2005
Access Control: CONFIDENTIAL

Contents

Introduction
Why Compile Expressions?
The Trade-Offs
Expression Types
General Syntax
Variables
User Functions
Method Invocation
Supported Operators
Symbol Resolution
Scope
Callback Libraries
Constants
Security
Resolution Error Handling
Debugging
Other Options
Test Driver Example
Just-In-Time Compilation
Known Limitations

Introduction

The expr package was created to address performance issues with often-evaluated runtime expressions. The expression engine implemented in this package dynamically compiles a runtime expression's logic directly into Java bytecode instructions to create a Java object which is well suited to execute the expression repeatedly in performance critical situations.  The object uses callback libraries and variables to interact with client code and to allow client code to provide extended services to the user via the expression engine.  Client code must supply a symbol resolver implementation for this purpose.
 
Each expression is compiled into a discrete class. This is done by assembling a Java class in memory, structured according to the Java class file format (see The Java™ Virtual Machine Specification, Second Edition by Tim Lindholm, Frank Yellin - ISBN 0-201-43294-3). The class includes the Java bytecode instructions necessary to execute the expression efficiently and quickly, as well as the minimal infrastructure necessary for a valid and verifiable class file. The class is loaded directly from memory into the JVM via a custom class loader. It is not stored in the file system, except when the expression is compiled in debug mode in a special developer build of the expression engine.  Once compiled and loaded into the virtual machine, expression classes are never explicitly unloaded.

The semantics of compiling an expression are encapsulated in the Expression class, and are transparent to client code.  Client code simply prepares an instance of this class using an expression string in infix notation and a symbol resolver object, then invokes a method to execute the expression.  Internally, the infix expression is submitted to the expression compiler for "just-in-time" compilation the first time the expression is to be executed.  The compiler parses the expression, assembles the proper bytecode instructions, compiles a Java class, loads it into the current Java virtual machine, instantiates it, and executes it using a well-known method invocation convention.  The compiled expression instance is cached such that subsequent requests for an instance of that expression object can skip the parsing, assembly, compilation, class loading and instantiation steps entirely and simply return the existing instance immediately.
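
The compile-once, reuse-thereafter behavior described above can be sketched roughly as follows. This is only an illustration: ExpressionCacheSketch, Compiled, and compile are hypothetical names, and the "compilation" step here is a stand-in lambda rather than real bytecode generation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names are part of the expr package.
class ExpressionCacheSketch {
    /** Stand-in for a compiled expression instance. */
    interface Compiled {
        Object execute();
    }

    private final Map<String, Compiled> cache = new ConcurrentHashMap<>();

    /** Counts how often the expensive path is taken. */
    final AtomicInteger compileCount = new AtomicInteger();

    /** Returns the cached instance, compiling only on the first request. */
    Compiled get(String infix) {
        return cache.computeIfAbsent(infix, this::compile);
    }

    // Placeholder for parse + assemble + class load + instantiate.
    private Compiled compile(String infix) {
        compileCount.incrementAndGet();
        return () -> "result of " + infix;
    }
}
```

Requesting the same expression string twice returns the same instance, and the placeholder compile step runs only once.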

Why Compile Expressions?

Expressions are compiled primarily as a performance optimization.  A secondary reason to compile expressions is that certain convenience features, such as auto-boxing, automatic type conversion, automatic null checking, and shorthand notation are built into the expression compiler.  These features allow very robust expression logic to be written in a compact and generally safe manner.

First, let us consider the performance implications.  Traditionally, runtime expressions are evaluated using some variant of the following approach:
  1. Lexically analyze an infix notation expression (e.g., '(A + B == C * 2) OR (A < D)') into tokens.
  2. Parse the tokens into a postfix (a.k.a. reverse Polish) notation (e.g., 'A B + C 2 * == A D < OR'), or into a similarly structured tree.
  3. Evaluate the modified expression completely:  the result of each subexpression determines the input to the next higher subexpression, until the entire expression is evaluated and the result is returned.
This approach is effective and elegant in its simplicity.  However, much time (relatively speaking) is spent lexing and parsing; probably more than is spent actually evaluating the result.  The actual execution of the expression involves the additional overhead of at least one runtime data structure (typically a stack) to manage pushing and popping the operator and operand objects (and casting them to the appropriate data types).  In practical terms, this is not concerning for expressions which must only be evaluated a few times.  However, for expressions that must be evaluated many times quickly, this overhead is considerable.
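
For comparison, step 3 of the traditional approach might look like the sketch below. It is not code from this package: PostfixEvalSketch is a hypothetical name, only the operators needed for the sample expression are handled, and operands are restricted to doubles for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Illustrative only: a naive stack-based postfix evaluator.
class PostfixEvalSketch {
    static Object evaluate(String postfix, Map<String, Double> vars) {
        Deque<Object> stack = new ArrayDeque<>();
        for (String tok : postfix.trim().split("\\s+")) {
            switch (tok) {
                case "+": case "*": case "==": case "<": {
                    double right = (Double) stack.pop();   // a cast on every pop
                    double left = (Double) stack.pop();
                    if (tok.equals("+")) stack.push(left + right);
                    else if (tok.equals("*")) stack.push(left * right);
                    else if (tok.equals("==")) stack.push(left == right);
                    else stack.push(left < right);
                    break;
                }
                case "OR": {
                    // Both operands were already computed: no short-circuiting.
                    boolean right = (Boolean) stack.pop();
                    boolean left = (Boolean) stack.pop();
                    stack.push(left || right);
                    break;
                }
                default:
                    // Operand: a known variable name, otherwise a numeric literal.
                    Double value = vars.get(tok);
                    stack.push(value != null ? value : Double.valueOf(tok));
            }
        }
        return stack.pop();
    }
}
```

Every evaluation repeats the token walk, the stack traffic, and the casts, which is the overhead the compiled approach aims to remove.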

Also, in a compound logical expression which uses the AND or OR conjunction operators, the above approach does not optimize out unnecessary processing if a result can be determined early in the expression's execution.  Every sub-operation of the expression is evaluated to determine the final result.  Consider that the right side of a compound expression using OR can be ignored completely if the left side evaluates to true, and that the right side of a compound expression using AND can be ignored if the left side evaluates to false.

When an expression is compiled, steps 1 and 2 above are performed only once, the first time the expression is encountered.  Thereafter, the expression class is cached.  Optimizations are performed to convert string constants to the literals they represent.  The bytecode instructions are optimized for compound, logical expressions, so that as little of the expression is processed as is necessary to reach a definitive result.
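
The difference between evaluating every sub-operation and stopping as soon as the result is known can be illustrated in plain Java. The class and method names below are hypothetical, and the counter exists only to make the skipped evaluation visible.

```java
import java.util.function.BooleanSupplier;

// Illustrative only: contrasts eager evaluation with short-circuit evaluation.
class ShortCircuitSketch {
    static int evaluations = 0;

    /** Wraps a value so that each evaluation is counted. */
    static BooleanSupplier counted(boolean value) {
        return () -> {
            evaluations++;
            return value;
        };
    }

    /** Eager OR: both sides are always evaluated. */
    static boolean eagerOr(BooleanSupplier left, BooleanSupplier right) {
        boolean l = left.getAsBoolean();
        boolean r = right.getAsBoolean();
        return l | r;
    }

    /** Short-circuit OR: the right side is skipped when the left side is true. */
    static boolean shortCircuitOr(BooleanSupplier left, BooleanSupplier right) {
        return left.getAsBoolean() || right.getAsBoolean();
    }
}
```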

Next, let us consider the convenience aspects of compiled expressions:
  1. Auto-boxing.  This feature allows seamless integration within an expression between primitive data types and their associated Java wrapper objects.  For instance, a variable of type java.lang.Integer can be assigned an int value directly in an expression;  a method which requires a primitive boolean parameter may be invoked from an expression which passes a java.lang.Boolean object as that parameter.  The expression compiler detects these conditions, determines that these fixups are required, and compiles instructions accordingly to perform the proper wrapping and unwrapping of data at expression execution time.
  2. Automatic Type Conversion.  Similar to auto-boxing, the compiler will detect when a provided data type must (and can) be converted to a required type at expression execution time.  This feature will perform both widening numeric and object reference conversions, as well as narrowing conversions.  Note that this is a double-edged sword;  truncation and data/precision loss may occur in a narrowing conversion.  Thus, some of the safety of strict typing may be lost with this convenience, as the compiler assumes that the author of an expression knows best.
  3. Automatic Null Checking.  In (and only in) expressions which evaluate to a boolean result or which contain sub-expressions which evaluate to a boolean result, object references retrieved at runtime are checked for null before they are dereferenced by the expression, to prevent the expression from throwing a NullPointerException.  If an object reference involved in some boolean operation evaluates to null, that operation will always evaluate to false.  On the other hand, if an object reference is passed as a parameter to a method, it is not first checked against null, as the compiler cannot presume in this case that a null parameter to a method is invalid.
  4. Shorthand Notation.  Properties of an object accessible to an expression, which are exposed via a bean-like API, can be referenced from an expression using a shorthand notation which eliminates the semantics of method invocation.  For instance, if an object foo has methods int getBar() and void setBar(int), they can be referred to in expressions, respectively, as follows:  foo.bar > 10 and foo.bar = 55.
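
To make the shorthand notation concrete, the sketch below maps a property name onto bean-style accessors using reflection. It is an assumption about how such a mapping could work, not a description of the compiler, which resolves symbols at compile time and emits bytecode. ShorthandSketch and its helper methods are hypothetical; Foo mirrors the example in the text.

```java
import java.lang.reflect.Method;

// Illustrative only: one way foo.bar could be mapped to getBar()/setBar(int).
class ShorthandSketch {
    /** Example bean with the accessors named in the text. */
    static class Foo {
        private int bar;
        public int getBar() { return bar; }
        public void setBar(int bar) { this.bar = bar; }
    }

    // Builds "getBar" or "setBar" from a prefix and the property name "bar".
    private static String accessor(String prefix, String property) {
        return prefix + Character.toUpperCase(property.charAt(0)) + property.substring(1);
    }

    /** Reads target.property by invoking the matching getter. */
    static Object get(Object target, String property) {
        try {
            Method getter = target.getClass().getMethod(accessor("get", property));
            return getter.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes target.property by invoking the matching int setter. */
    static void set(Object target, String property, int value) {
        try {
            Method setter = target.getClass().getMethod(accessor("set", property), int.class);
            setter.invoke(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here set(foo, "bar", 55) plays the role of foo.bar = 55, and get(foo, "bar") supplies the value compared in foo.bar > 10.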

The Trade-Offs

From a performance perspective, the following trade-offs should be considered.  The first time an expression is encountered, it must be lexed and parsed, just as with the traditional approach.  Additionally, compilation and class loading must be performed.  This front-loads the performance cost.  For expressions that must be evaluated only once or a handful of times, the break-even point for that performance penalty may not be reached.  Memory consumption should also be considered when evaluating performance.  Each unique runtime expression necessitates the creation and loading of a new class.  By contrast, evaluating an expression using the traditional approach creates no lasting increase in memory footprint.  Though the compiled expression classes are quite small, there is some amount of redundant overhead in each class, due to the structure of the Java class file format.

From a convenience perspective, the trade-offs to be considered are the reduction in strict type safety which comes with auto-boxing and auto-conversion.  In some cases, these conversion features may cause ambiguity during method and user function resolution at expression compile time.  However, these issues are usually remedied with well-placed typecasts.  Finally, debugging may require additional effort to understand the compiler's default conversion behaviors, especially since source code level debugging is not possible inside of a compiled expression.  This is because no intermediate Java source code is ever generated or stored;  instead, the compiler accepts an infix notation expression as input and emits an in-memory Java class file image as output.

Expression Types

Two types of expressions are supported:
An assignment expression is one whose top level operation stores a literal, a variable value, or the result of a calculation or method invocation into a variable.  This expression's top level operator is the assignment operator (=).  This expression always returns null.  Thus, inline or nested assignment operations are not possible.  Examples of an assignment expression are:

myVar = 10

myVar = a + b

myVar = null

myVar = foo.getBar()

myVar = a > b

A non-assignment expression is one which returns some object, but which does not assign it to a variable.  Any object type supported by Java may be returned.  It is up to the application or framework which uses the expression to capture the object and act upon it or analyze it according to the needs of the application or framework.  Examples of a non-assignment expression are:

7

true

myVar == a + b

a > b

a * b + 5

foo.getBar()

General Syntax

The syntax supported by the expression engine is largely the same as that supported by the Java language itself.  All symbols are case sensitive.  In addition to basic, scalar expressions, method invocation using familiar Java syntax is supported.  Where the default type conversion behavior is insufficient, type casting is also possible.  However, there are a number of differences and exceptions to the expression engine's support for Java syntax:

Variables

Variables are containers which associate a symbol name, a data type, and a particular symbol resolution scope with an object reference (the referent).  Variables are supported directly by the expression engine as first class objects.  Variables must be declared to store a particular type of referent;  this can be any non-primitive data type.  A variable must be registered with the expression engine before it may be referenced by an expression;  otherwise, the referencing expression will fail to compile.  Variables are segregated into variable pools by associating a scope with a variable during registration.

A variable can be initialized by an arbitrary expression.  This expression may refer to other variables which share scope with this variable, and which have been registered before it.  Circular variable references within initializer expressions are not permitted.  The initializer expression is executed whenever the variable's reset method is invoked (including during Variable construction), and the expression's result becomes the new referent of the variable.  If no initializer expression is provided, the variable is initialized to null.

Variables are represented by instances of the Variable class.  An instance of this class is created only when a variable is registered with the symbol resolver;  this class cannot be instantiated directly.  Variables optionally may be registered as read-only.  In this mode, expressions which reference a variable may access the referent of a variable, but may not change it.  This restriction holds only for variable access from within expressions, but not for programmatic manipulation of a Variable object.  The latter is always permitted, regardless of the read-only state of the variable.
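
The initializer and read-only rules described above can be summarized in a small stand-in class. This is a hypothetical sketch, not the Variable class itself: the constructor, the Supplier-based initializer, and the setFromExpression method are all assumptions made for illustration.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the behavior described above; not the Variable class.
class VariableSketch<T> {
    private final Supplier<T> initializer;   // null means no initializer
    private final boolean readOnly;
    private T referent;

    VariableSketch(Supplier<T> initializer, boolean readOnly) {
        this.initializer = initializer;
        this.readOnly = readOnly;
        reset();                              // the initializer runs during construction
    }

    /** Re-runs the initializer; without one, the referent becomes null. */
    final void reset() {
        referent = (initializer == null) ? null : initializer.get();
    }

    T get() {
        return referent;
    }

    /** Programmatic assignment: always permitted. */
    void set(T value) {
        referent = value;
    }

    /** Assignment requested from within an expression: refused when read-only. */
    void setFromExpression(T value) {
        if (readOnly) {
            throw new IllegalStateException("variable is read-only to expressions");
        }
        referent = value;
    }
}
```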

Variable Syntax

A variable is used in an expression with Java-like syntax.  For example, assuming a variable myVar has been registered as type Bar, the expression

myVar = foo.getBar()

represents an assignment to myVar.  The expression

myVar.doSomething()

invokes the method doSomething() on myVar's referent, an instance of Bar.  Note that any assignment, access, comparison, invocation, etc. against a variable always is applied against the variable's referent.  Because of the automatic null checking, auto-boxing, and automatic type conversion features of the expression engine, variables which represent primitive values using wrapper objects can interact naturally with primitive literals.  For example, given a variable num, registered as type java.lang.Integer, the following expression is perfectly valid, despite the apparent type mismatch:

num >= 100.5

In this case, num's referent is checked against null (auto null checking), then unwrapped to an int (auto-boxing), then widened to a double (auto-conversion) before the comparison takes place (in the event num's referent is null, this expression would return false).
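
Written out by hand in Java, the three steps applied to num >= 100.5 would look roughly like this. The class and method names are hypothetical; the compiler emits equivalent bytecode rather than source code.

```java
// Illustrative only: the hand-written equivalent of the steps described above.
class NullSafeCompareSketch {
    static boolean greaterOrEqual(Integer num, double literal) {
        if (num == null) {                 // automatic null check
            return false;                  // a null referent makes the comparison false
        }
        int unwrapped = num.intValue();    // unwrap the Integer to an int
        double widened = unwrapped;        // widen the int to a double
        return widened >= literal;
    }
}
```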

User Functions

User functions are a means of accessing application-provided functionality from within expressions.  User functions are exposed by a callback library which is registered with the symbol resolver.  The callback library defines the method which provides the backing implementation of a user function.  A user function can provide any capability required by the application;  there are effectively no functional limits.  A user function can accept any number of arguments (including a variable argument list), or none at all.  It can return any object or primitive value, or void.

User Function Syntax

A user function is invoked from an expression using Java-like syntax.  For example:

someUserFunction(myVar, 39, constant)

It is recommended to qualify the invocation with the name of the callback library which implements the user function as follows:

myLib.someUserFunction(myVar, 39, constant)

Doing so ensures no ambiguity in resolving the correct function if more than one callback library implements a method with the same name and parameter signature.  In the case where this occurs and there is no disambiguating qualifier, an AmbiguousSymbolException is thrown at expression compile time.

Argument types are somewhat flexible, within the bounds of the auto-boxing and automatic type conversion features the expression engine provides.  For example, a user function which is defined to accept a parameter of type int can in fact accept any numeric primitive or numeric primitive wrapper object.  Primitive unwrapping operations and narrowing conversions will occur as necessary to allow the user function to resolve at expression compile time.  Note that this behavior may result in ambiguity among user function alternatives, which may require a typecast to eliminate.  Consider, for instance, two alternatives of a method within callback library myLib which implement different user functions:
and the expression:

myLib.foo(25)

This expression creates ambiguity, because both alternatives would be candidates for a match during user function resolution.  The solution to this problem is to use an explicit typecast to disambiguate, as in:

myLib.foo(#(long) 25)

which forces a widening conversion of the literal 25 to a long, or

myLib.foo(#(java.lang.Integer) 25)

which forces the literal 25 to be wrapped into an instance of java.lang.Integer.  Note that where there is no such ambiguity, the typecast is unnecessary.
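
The two method alternatives themselves are not reproduced in this text. One pair that would be consistent with the typecasts shown above is sketched below; the parameter types (long and java.lang.Integer), the return values, and the class name MyLibSketch are assumptions, not taken from the original.

```java
// Hypothetical callback library: both foo signatures are assumed for illustration.
class MyLibSketch {
    /** Candidate reached from the literal 25 by a widening conversion. */
    public String foo(long value) {
        return "foo(long)";
    }

    /** Candidate reached from the literal 25 by wrapping. */
    public String foo(Integer value) {
        return "foo(Integer)";
    }
}
```

Note that the Java compiler itself would not report an ambiguity for foo(25) here, because it prefers widening over boxing and selects foo(long). According to the text above, the expression engine treats both alternatives as candidates, which is why the explicit typecast is needed.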

Variable Length Argument Lists

Variable length argument lists (varargs) are supported by the expression engine, even when using JVM releases prior to J2SE 5.0.  To implement backing support for a user function, a callback library method is defined to accept an array of java.lang.Objects as its final parameter.  For instance:

public void myVarArgFunction(Object[] args)
{
    // process variable arguments
    for (int i = 0; i < args.length; i++)
    {
       ...
    }
}

No parameters should come after the Object array in the method definition.  If any parameters are required, these must be explicitly listed preceding the Object array, as in:

public void myOtherVarArgFunction(int len, String text, Object[] args)
{
    // process required arguments 'len' and 'text'
    ...

    // process variable arguments
    for (int i = 0; i < args.length; i++)
    {
       ...
    }
}

The array of Objects passed to such methods is guaranteed to be non-null.  In the event an expression invokes such a user function, but passes no parameters for the variable portion of the parameter list, the Object array will be an array of size zero.  Invocation of the above user functions is straightforward.  The following expressions are all valid;  note, however, that NullPointerException or ClassCastException may be thrown by the backing method if a null argument is dereferenced or an argument is assumed to be an incorrect type, respectively:
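
The example expressions are not reproduced in this text. As a rough Java analogue of the guarantee described above, the sketch below packages whatever follows the required arguments into a non-null Object array. VarArgSketch, describe, and invoke are hypothetical names; describe follows the parameter layout of myOtherVarArgFunction but returns a string so the effect is visible.

```java
import java.util.Arrays;

// Illustrative only: packaging trailing arguments into a non-null Object array.
class VarArgSketch {
    /** Backing method laid out like myOtherVarArgFunction; the name is hypothetical. */
    public static String describe(int len, String text, Object[] args) {
        return len + ":" + text + ":" + args.length;
    }

    /** Splits a flat argument list into two required arguments plus the rest. */
    static String invoke(Object... all) {
        int len = (Integer) all[0];       // ClassCastException if the type is wrong
        String text = (String) all[1];
        // copyOfRange yields a zero-length array when nothing follows.
        Object[] rest = Arrays.copyOfRange(all, 2, all.length);
        return describe(len, text, rest);
    }
}
```

Individual elements of the array may still be null, so a backing method should check each element before dereferencing it.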

Method Invocation

The expression engine supports the direct invocation of arbitrary methods against an object reference using the dot (.) operator.  This concept differs from user function invocation in that it requires no explicit registration of a callback library of backing methods with the symbol resolver, but it always requires an object instance upon which to apply the invocation (even for static methods).  Thus, an unqualified method invocation will fail to compile, since unlike Java, the expression engine provides no implicit this reference.  Likewise, a static method invocation qualified by a class name will fail to compile, since there is no implicit class resolution.

Method Invocation Syntax

Syntax for method invocation in an expression is quite similar to method invocation in Java, considering the caveats mentioned above and in the General Syntax section.  It takes the form:

<object reference>.<method name>([param1 [, ...]])

For instance, given a variable string1 of type java.lang.String, initialized to "Hello World", the following represents a valid invocation of a java.lang.String method:

string1.indexOf('He')

which would return 0 upon execution.

Variable Length Argument Lists

Varargs are supported as they are with user functions.  Please refer to the discussion of this topic in the User Functions section for details.

Symbol Resolution

Variables, user functions, methods, and constants are all represented as symbols within an expression.  In order for an expression to compile correctly, each symbol it references must be resolved to a backing method or literal value.  All symbol resolution occurs at expression compile time.  Contrary to previous implementations of this framework, no symbol resolution is deferred to expression execution time.  This provides for both enhanced performance and enhanced security, because expensive lookups and permission checks are performed only once, at compile time.  If they fail, the expression fails to compile and can never be executed.

Scope

Callback Libraries

Constants

Security

Resolution Error Handling

Supported Operators

The set of operators which may be used within expressions is listed in the table below. Operators in this table are listed in order of their precedence, from those evaluated first to those evaluated last. Operators which have the same precedence are grouped together. When evaluating operations whose operators have the same precedence, operations are performed in the order in which they appear, from left to right. Parentheses (()) may be used to group operations which must be evaluated in a different order.

Precedence  Symbol      Type        Unary/Binary  Operation Performed
1           ! or not    Logical     Unary         Logical complement
2           ~           Bitwise     Unary         Bitwise complement
3           -           Arithmetic  Unary         Negation
4           *           Arithmetic  Binary        Multiplication
            /           Arithmetic  Binary        Division
            %           Arithmetic  Binary        Remainder
5           +           Arithmetic  Binary        Addition
            -           Arithmetic  Binary        Subtraction
6           <<          Bitwise     Binary        Left shift
            >>          Bitwise     Binary        Right shift w/ sign extension
            >>>         Bitwise     Binary        Right shift w/ zero extension
7           <           Logical     Binary        Is less than
            <=          Logical     Binary        Is less than or equal to
            >           Logical     Binary        Is greater than
            >=          Logical     Binary        Is greater than or equal to
8           ==          Logical     Binary        Is equal to
            !=          Logical     Binary        Is not equal to
9           &           Bitwise     Binary        Bitwise AND
10          ^           Bitwise     Binary        Bitwise XOR
11          |           Bitwise     Binary        Bitwise OR
12          && or and   Logical     Binary        Conditional AND
13          || or or    Logical     Binary        Conditional OR
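
The relative order of the binary operators in the table (levels 4 through 13) is the same as in the Java language, so plain Java can be used to illustrate how an unparenthesized expression is grouped. The class and method names below are hypothetical, and the sketch says nothing about how the expression engine parses its input.

```java
// Illustrative only: Java groups these binary operators in the same relative order.
class PrecedenceSketch {
    /** Grouped as ((a + b) == (c * 2)) || (a < d). */
    static boolean sample(int a, int b, int c, int d) {
        return a + b == c * 2 || a < d;
    }

    /** Addition (level 5) binds tighter than shift (level 6): 1 << (2 + 1). */
    static int shiftAfterAdd() {
        return 1 << 2 + 1;
    }

    /** Bitwise AND (level 9) binds tighter than bitwise OR (level 11): 6 | (3 & 1). */
    static int orAfterAnd() {
        return 6 | 3 & 1;
    }
}
```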

Resolving Symbols (TBD - update)

All but the most trivial expressions will contain symbols which have some application-specific meaning. These represent placeholders in the expression for values to be substituted when the expression is executed (or when it is compiled). To enable client code to provide these value substitutions at the appropriate times, compiled expression objects use a callback model. This model is described by the SymbolResolver interface.

When submitting an expression for compilation, client code must provide a SymbolResolver object. This object will be called:
  1. by the expression parser during the parsing phase, when it encounters a string constant which must be resolved into a literal;
  2. by the expression compiler during the compilation phase, when it encounters one of the following constructs in the expression:
    • a variable for which it requires data type information;
    • a user function for which it requires return type information;
  3. by the compiled expression object itself during the runtime execution phase, when it needs to:
    • retrieve the current value of a variable;
    • execute a user function.
The SymbolResolver implementation supplied by client code is responsible for providing the correct substitution values based upon the context of the application at the time it executes an expression; the expression object itself has no awareness of the application's current state. The TestDriver class is a sample implementation of the SymbolResolver interface. It is discussed in greater detail below.

How Resolution Errors are Handled

If a string constant cannot be resolved during expression parsing, an error is reported to stderr, followed by an ExpressionException thrown at compile time. Client code generally should not throw an exception from the SymbolResolver resolveXXX methods; the subclasses of CompiledExpression do not have exception handlers, as these are expensive constructs. Any exception thrown by client code during symbol resolution will propagate up the call stack back to the client code which called ArithmeticExpression.compute() or LogicalExpression.evaluate().

Instead, if no suitable variable substitution exists, null should be returned from a SymbolResolver resolveXXX method. A null return is handled differently by ArithmeticExpression and LogicalExpression objects, as described below.

Arithmetic Expressions

An unresolved variable is a fatal condition for an arithmetic expression, since it cannot compute a result if a component value is missing. If null is returned from a variable resolver callback method, ArithmeticExpression throws an UnresolvedSymbolException.

Logical Expressions

Logical expressions are more lenient to null values returned by the variable resolver. However, the implication of this leniency is that unexpected results may occur. Consider the following expression:
MYVAR == 10
This expression is interpreted internally by the expression engine as
MYVAR != null and MYVAR == 10
Thus, if MYVAR cannot be resolved to a substitute value, then this expression would evaluate to false. This treatment of the above expression seems fairly intuitive. However, it should be noted that the expression
MYVAR != 10
also will evaluate to false if the variable MYVAR cannot be resolved at evaluation time. This may not be as immediately intuitive as the first example, but this behavior is consistent. This is because the latter expression is interpreted internally by the expression engine as
MYVAR != null and MYVAR != 10
Finally it should be noted that the expressions:
MYVAR != 10
and
!(MYVAR == 10)
are not equivalent. The former will return false if MYVAR cannot be resolved (as discussed above), while the latter will return true in the same circumstance. This is because the latter expression is interpreted internally by the expression engine as
!(MYVAR != null and MYVAR == 10)
Distributing the negation operator (!) across both subexpressions inside the parentheses yields
MYVAR == null or MYVAR != 10
Thus, the left side of the expression evaluates to true if MYVAR cannot be resolved.  In this event, the right side of the expression is ignored and the overall expression returns true. It is important to keep in mind the subtleties of how unresolved variables are handled when considering input expressions.
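
The three interpretations discussed above can be written out by hand in Java to check the results for an unresolved (null) MYVAR. The class and method names are hypothetical; an Integer parameter stands in for the resolved variable.

```java
// Illustrative only: hand-expanded forms of the three expressions discussed above.
class NullLenientSketch {
    /** MYVAR == 10, interpreted as MYVAR != null and MYVAR == 10. */
    static boolean equalsTen(Integer myVar) {
        return myVar != null && myVar == 10;
    }

    /** MYVAR != 10, interpreted as MYVAR != null and MYVAR != 10. */
    static boolean notEqualsTen(Integer myVar) {
        return myVar != null && myVar != 10;
    }

    /** !(MYVAR == 10), interpreted as !(MYVAR != null and MYVAR == 10). */
    static boolean notOfEqualsTen(Integer myVar) {
        return !(myVar != null && myVar == 10);
    }
}
```

With a null argument the first two methods return false and the third returns true, matching the behavior described above.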

Test Driver Example (TBD - obsolete)

The TestDriver class represents a trivial application which loads records from a simple database (implemented as a properties file) and allows expressions to be executed against these records. Either arithmetic or logical expressions can be evaluated against any valid range of records defined in the properties.

The program interprets the basic data types used by compiled expression objects. In addition, it processes one higher level data type -- a date -- which allows it to use the parsing capabilities of java.text.SimpleDateFormat to process date and time string constants. Date and time string constants are converted by the application to Long values (number of milliseconds since midnight, Jan. 1, 1970) for use in expressions.

The TestDriver class serves as an illustrative example of a VariableResolver implementation and of an ExpressionCompiler client. It can be launched from the command line, or may be used programmatically as a test harness. See TestDriver's main method for usage syntax and its class description for a sample properties format.

In its default properties configuration, the program manages the following information about a set of hypothetical employees:

Field Name  Application-Defined  Compiled Expression  Notes
            Data Type            Data Type
name        string               string               Employee name
dob         date                 long                 Employee date of birth (translates to #millis since 01/01/1970)
overtime    double               double               Overtime hours for the current period
city        string               string               City in which employee works
union       boolean              long                 Employee's union affiliation
begin       date                 long                 Time employee's regular work shift begins (translates to #millis since 01/01/1970)
end         date                 long                 Time employee's regular work shift ends (translates to #millis since 01/01/1970)

The following boolean expression determines who is affiliated with a union:

union

This is the equivalent:

union == true

Running either of these expressions against all default records should produce the following results:

[1] true
[2] true
[3] false


The following arithmetic expression determines the approximate age in years of each employee (recall that time values are measured in milliseconds):

(@now() - dob) / 86400000 / 365.25

This results in:

[1] 54.908003341413796
[2] 44.55904144317058
[3] 59.077750090215986

Notice the use of the @now() user function in the above expression. From the compiled expression's point of view, this symbol is simply a variable reference to be resolved to a numeric value. It is recognized by the application code (in the implementation of the resolveToLong method) as having special meaning. As a result, special processing is invoked to resolve this variable to the current date and time (as a millisecond value).
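
The arithmetic in the age expression above divides a millisecond difference by the number of milliseconds in a day (86400000) and then by the average number of days in a year (365.25). A plain Java equivalent, with hypothetical class and method names, is:

```java
// Illustrative only: the same arithmetic as (@now() - dob) / 86400000 / 365.25.
class AgeSketch {
    static final double MILLIS_PER_DAY = 86_400_000.0;   // 24 * 60 * 60 * 1000
    static final double DAYS_PER_YEAR = 365.25;

    /** Approximate age in years, given two epoch millisecond values. */
    static double ageInYears(long nowMillis, long dobMillis) {
        return (nowMillis - dobMillis) / MILLIS_PER_DAY / DAYS_PER_YEAR;
    }
}
```

In application code, nowMillis would typically come from System.currentTimeMillis().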

String constants may be used in comparisons; they must be enclosed in single quotes:

name == 'Larry' or city == 'St. Louis'

This produces the following results with the default data records:

[1] true
[2] true
[3] false

Omitting the single quotes from a string constant will result in either an ExpressionException being thrown, or in a potentially incorrect result. The unquoted symbol will not be recognized as a string constant, but will instead be treated as a variable, or as an unexpected token, depending upon the contents of the string constant. For instance, leaving the quotes off 'Larry' in the example above yields a valid expression, but Larry is treated as a variable at runtime and resolves to null. Compilation and execution both succeed, but the results are probably not what was intended:

[1] false
[2] true
[3] false

On the other hand, if the quotes are instead removed from 'St. Louis' in the same expression, the compiler treats this condition as a fatal error, since the string Louis is now interpreted as an unexpected, extra token, which invalidates the expression.

The TestDriver application should be explored and tested with a variety of expressions, to determine the capabilities (and limitations) of expression processing with the expr package.

Expression Compilation (TBD - update)

Compilation of an expression into Java classes is a complex process. This section explains the flow of this process in some detail, but it is not intended to be exhaustive.

The first step of expression compilation is a combined lexing/parsing of the expression string from infix notation to a list of tokens in postfix notation. To extend the TestDriver example above, the following expression in infix notation:

union and begin < '08:00:00'

determines which employees have a union affiliation and work a shift which begins earlier than 8 o'clock in the morning. This expression string is converted in this initial stage into the following token stream in postfix order:
  union       begin       '08:00:00'   <            and
  (variable   (variable   (constant    (binary      (binary
  operand)    operand)    operand)     operator)    operator)

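
One common way to perform this infix-to-postfix conversion is the shunting-yard algorithm, sketched below. The source does not say which algorithm the parser actually uses, so this is an assumed technique shown for illustration. InfixToPostfixSketch is a hypothetical name, only four operators are included, tokens must be separated by whitespace (so quoted constants may not contain spaces), and parentheses are assumed to be balanced.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Illustrative only: shunting-yard is an assumption, not the package's parser.
class InfixToPostfixSketch {
    // Precedence levels from the operator table (a lower number binds tighter).
    private static final Map<String, Integer> LEVEL =
        Map.of("<", 7, "==", 8, "and", 12, "or", 13);

    static List<String> toPostfix(String infix) {
        List<String> output = new ArrayList<>();
        Deque<String> operators = new ArrayDeque<>();
        for (String tok : infix.trim().split("\\s+")) {
            if (tok.equals("(")) {
                operators.push(tok);
            } else if (tok.equals(")")) {
                while (!operators.peek().equals("(")) {
                    output.add(operators.pop());
                }
                operators.pop();   // discard the "("
            } else if (LEVEL.containsKey(tok)) {
                // Left-associative: flush operators that bind at least as tightly.
                while (!operators.isEmpty() && LEVEL.containsKey(operators.peek())
                        && LEVEL.get(operators.peek()) <= LEVEL.get(tok)) {
                    output.add(operators.pop());
                }
                operators.push(tok);
            } else {
                output.add(tok);   // operand: a variable or a quoted constant
            }
        }
        while (!operators.isEmpty()) {
            output.add(operators.pop());
        }
        return output;
    }
}
```

Applied to the sample expression, the sketch produces the tokens union, begin, '08:00:00', <, and in that order, matching the postfix stream shown above.
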
The expression compiler creates a skeleton representation of a Java class file in memory. It generates a unique name for the new class, and sets its superclass to ArithmeticExpression for numeric expressions and to LogicalExpression for boolean expressions. It creates a default constructor and a stub execution method (compute() for numeric, evaluate() for boolean) for the new class. It is into this method that the expression's logic will be distilled as Java bytecode instructions.

Now that the expression's tokens are arranged in postfix order, they are iterated from left to right. If an operand is encountered, it is pushed onto a compile-time operand stack. Variable operands are pushed directly onto the stack as strings; they must be resolved into replacement values at runtime. The compiler calls back to the variable resolver to request the data type of each variable. This allows the compiler to select an appropriate callback method for use at runtime, to minimize the amount of type casting necessary.

When the iteration encounters an operator, the operand stack is popped to retrieve the operand(s) to which the operator will be applied: one operand is popped for a unary operator, two for a binary operator. The first operator encountered in the example above is the less-than (<) operator, which pops the constant string '08:00:00' and the variable begin. Once the operator has the needed operand(s), it examines them to determine what Java bytecode instructions are required to handle each at runtime.

If the operand is a constant string, the compiler will ask the variable resolver to resolve it into a simpler representation. For instance, the constant string '08:00:00' represents a time, which this application chooses to resolve to a number of milliseconds since midnight. When the compiler encounters this token, it calls TestDriver's resolveConstant method, which returns a Long with an internal value of 46800000. It is this simpler numeric representation which is compiled into the class as a constant, avoiding the need for the application to interpret the string '08:00:00' each time the expression is executed.

If the operand is a variable, the compiler first assembles bytecode instructions to access the expression's variable resolver and to make the appropriate type of callback. Instructions for a null check on the callback result are then appended. Finally, whether the operand is a variable or a constant, instructions are appended to push the resolved value onto the JVM's runtime operand stack.

Once each operand's code has been assembled, bytecode instructions are added to pop the corresponding runtime operands off the JVM's runtime stack and to perform the operator's actual runtime function. The assembled code unit for the sub-expression (in this case, begin < '08:00:00') is then pushed back onto the compile-time operand stack.

When the logical, binary operator and is encountered, the compiler pops this code unit and the union variable operand off the stack. It recognizes that the code unit operand has already been processed and only assembles bytecode instructions for the union operand and for the logical and operation. As an optimization, the algorithm which assembles instructions for the logical and operation ensures that if the union variable resolves to false at runtime, it will jump directly to the end of the method to return false, skipping the evaluation of the begin < '08:00:00' portion of the expression entirely.
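
The walk over the postfix tokens described above can be sketched as follows. This is a simplified illustration rather than the compiler's actual code: the class name OperandStackSketch is hypothetical, only binary operators are handled, and each "code unit" is represented by a parenthesized string instead of assembled bytecode.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;

class OperandStackSketch {

    // Only the two binary operators used by the example expression are modelled.
    private static final Set<String> BINARY_OPERATORS = Set.of("<", "and");

    /** Folds a postfix token list into a single code unit, shown here as text. */
    static String fold(List<String> postfix) {
        Deque<String> operands = new ArrayDeque<>();

        for (String token : postfix) {
            if (!BINARY_OPERATORS.contains(token)) {
                // Operands (variables and constants) go onto the compile-time stack.
                operands.push(token);
                continue;
            }
            // A binary operator pops its right operand first, then its left operand.
            String right = operands.pop();
            String left = operands.pop();
            // The combined unit is pushed back so an enclosing operator can consume it.
            operands.push("(" + left + " " + token + " " + right + ")");
        }
        return operands.pop();
    }

    public static void main(String[] args) {
        System.out.println(fold(List.of("union", "begin", "'08:00:00'", "<", "and")));
    }
}
```

In this sketch the unit pushed for the less-than operator is consumed unchanged as the right operand of and, which mirrors how the compiler recognizes an already processed code unit.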

Finally, bytecode instructions to return the expression's overall result (in this case true or false) are assembled for the end of the method. Branching logic code fixups and bytecode offset fixups are applied, the class's constant pool is indexed, and the finished class file is written to a byte array in memory for further class loading and caching. If the expression was compiled in debug mode, the byte array is written to the file system as a class file, at a location specified during construction of the ExpressionCompiler object.
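
The idea of assembling a class file in a byte array and loading it directly from memory can be shown with a much smaller class than the one the expression compiler produces. The sketch below is an assumption-laden illustration, not the package's implementation: InMemoryClassSketch, ByteArrayClassLoader, and the generated class name Tiny are hypothetical, and the generated class contains only a static evaluate() method which returns true.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class InMemoryClassSketch {

    /** Hypothetical loader which defines a class straight from a byte array. */
    static final class ByteArrayClassLoader extends ClassLoader {
        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    private static void utf8(DataOutputStream out, String text) throws IOException {
        out.writeByte(1);           // CONSTANT_Utf8 tag
        out.writeUTF(text);         // u2 length followed by the bytes
    }

    private static void classRef(DataOutputStream out, int nameIndex) throws IOException {
        out.writeByte(7);           // CONSTANT_Class tag
        out.writeShort(nameIndex);
    }

    /** Assembles: public class Tiny { public static boolean evaluate() { return true; } } */
    static byte[] assemble() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0xCAFEBABE);   // magic number
        out.writeShort(3);          // minor version
        out.writeShort(45);         // major version (45.3, as in the listing below)
        out.writeShort(8);          // constant pool count: 7 entries + 1
        utf8(out, "Tiny");              // 1
        classRef(out, 1);               // 2: this class
        utf8(out, "java/lang/Object");  // 3
        classRef(out, 3);               // 4: super class
        utf8(out, "evaluate");          // 5
        utf8(out, "()Z");               // 6
        utf8(out, "Code");              // 7
        out.writeShort(0x0001);     // class access flags: ACC_PUBLIC
        out.writeShort(2);          // this class
        out.writeShort(4);          // super class
        out.writeShort(0);          // interface count
        out.writeShort(0);          // field count
        out.writeShort(1);          // method count
        out.writeShort(0x0009);     // method access flags: ACC_PUBLIC | ACC_STATIC
        out.writeShort(5);          // method name
        out.writeShort(6);          // method descriptor
        out.writeShort(1);          // method attribute count
        out.writeShort(7);          // attribute name: Code
        out.writeInt(14);           // attribute length: 12 bytes of header/trailer + 2 of code
        out.writeShort(1);          // max stack
        out.writeShort(0);          // max locals
        out.writeInt(2);            // code length
        out.writeByte(0x04);        // iconst_1
        out.writeByte(0xAC);        // ireturn
        out.writeShort(0);          // exception table length
        out.writeShort(0);          // Code attribute count
        out.writeShort(0);          // class attribute count
        return bytes.toByteArray();
    }

    /** Loads the assembled class from memory and invokes its static evaluate() method. */
    static boolean loadAndEvaluate() {
        try {
            Class<?> tiny = new ByteArrayClassLoader().define("Tiny", assemble());
            return (Boolean) tiny.getMethod("evaluate").invoke(null);
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndEvaluate());
    }
}
```

Nothing is written to the file system in this sketch; the class exists only as a byte array until the loader defines it.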

The following output displays the internal representation of the Java class which is created for the above expression:
Class filename:  LE0.class
Magic Number : 0xCAFEBABE
Version : 45.3
This Class : com/goldencode/expr/LE0
Super Class : com/goldencode/expr/LogicalExpression
Access Flags : ACC_PUBLIC

Constant Count: 0x24 (36 dec)
1: <String> begin
2: <String> union
3: <Class> com/goldencode/expr/CompiledExpression
4: <Class> com/goldencode/expr/LE0
5: <Class> com/goldencode/expr/LogicalExpression
6: <Class> com/goldencode/expr/VariableResolver
7: <Class> java/lang/Long
8: <Field> com/goldencode/expr/CompiledExpression.resolverLcom/goldencode/expr/VariableResolver;
9: <Method> com/goldencode/expr/LogicalExpression.<init>()V
A: <Method> java/lang/Long.longValue()J
B: <InterfaceMethod> com/goldencode/expr/VariableResolver.resolveToLong(Ljava/lang/String;)Ljava/lang/Long;
C: <Double> 4.68E7
E: <NameAndType> <init>()V
F: <NameAndType> longValue()J
10: <NameAndType> resolveToLong(Ljava/lang/String;)Ljava/lang/Long;
11: <NameAndType> resolverLcom/goldencode/expr/VariableResolver;
12: <Utf8> ()J
13: <Utf8> ()V
14: <Utf8> ()Z
15: <Utf8> (Ljava/lang/String;)Ljava/lang/Long;
16: <Utf8> <init>
17: <Utf8> Code
18: <Utf8> Lcom/goldencode/expr/VariableResolver;
19: <Utf8> begin
1A: <Utf8> com/goldencode/expr/CompiledExpression
1B: <Utf8> com/goldencode/expr/LE0
1C: <Utf8> com/goldencode/expr/LogicalExpression
1D: <Utf8> com/goldencode/expr/VariableResolver
1E: <Utf8> evaluate
1F: <Utf8> java/lang/Long
20: <Utf8> longValue
21: <Utf8> resolveToLong
22: <Utf8> resolver
23: <Utf8> union

Interface Count: 0

Field Count : 0

Method Count : 2
0: <Method> ACC_PUBLIC <init>()V
<Code> Max stack: 1, Max locals: 1
5 bytes of code:
0000 0x2A <aload_0 >
0001 0xB7 <invokespecial > 0009 [<Method> com/goldencode/expr/LogicalExpression.<init>()V]
0004 0xB1 <return >
1: <Method> ACC_PUBLIC evaluate()Z
<Code> Max stack: 4, Max locals: 4
65 bytes of code:
0000 0x2A <aload_0 >
0001 0xB4 <getfield > 0008 [<Field> com/goldencode/expr/CompiledExpression.resolverLcom/goldencode/expr/VariableResolver;]
0004 0x4C <astore_1 >
0005 0x2B <aload_1 >
0006 0x12 <ldc > 02 [<String> union]
0008 0x4D <astore_2 >
0009 0x2C <aload_2 >
000A 0xB9 <invokeinterface> 000B [<InterfaceMethod> com/goldencode/expr/VariableResolver.resolveToLong(Ljava/lang/String;)Ljava/lang/Long;] 02 00
000F 0x4E <astore_3 >
0010 0x2D <aload_3 >
0011 0xC7 <ifnonnull > 0006 [dest:0017]
0014 0xA7 <goto > 0029 [dest:003D]
0017 0x2D <aload_3 >
0018 0xB6 <invokevirtual > 000A [<Method> java/lang/Long.longValue()J]
001B 0x88 <l2i >
001C 0x99 <ifeq > 0021 [dest:003D]
001F 0x2B <aload_1 >
0020 0x12 <ldc > 01 [<String> begin]
0022 0x4D <astore_2 >
0023 0x2C <aload_2 >
0024 0xB9 <invokeinterface> 000B [<InterfaceMethod> com/goldencode/expr/VariableResolver.resolveToLong(Ljava/lang/String;)Ljava/lang/Long;] 02 00
0029 0x4E <astore_3 >
002A 0x2D <aload_3 >
002B 0xC7 <ifnonnull > 0006 [dest:0031]
002E 0xA7 <goto > 000F [dest:003D]
0031 0x2D <aload_3 >
0032 0xB6 <invokevirtual > 000A [<Method> java/lang/Long.longValue()J]
0035 0x8A <l2d >
0036 0x14 <ldc2_w > 000C [<Double> 4.68E7]
0039 0x98 <dcmpg >
003A 0x9B <iflt > 0005 [dest:003F]
003D 0x03 <iconst_0 >
003E 0xAC <ireturn >
003F 0x04 <iconst_1 >
0040 0xAC <ireturn >

Attribute Count: 0
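
For readers who prefer source code to bytecode, the evaluate() method in the listing above corresponds roughly to the hand-written Java below. This is an approximation written for illustration, not output of the expression compiler: the class name LE0Sketch is hypothetical, and the nested VariableResolver interface is a stand-in which models only the resolveToLong method referenced in the listing.

```java
import java.util.HashMap;
import java.util.Map;

class LE0Sketch {

    /** Stand-in for the resolver interface; only the method seen in the listing is modelled. */
    interface VariableResolver {
        Long resolveToLong(String name);
    }

    private final VariableResolver resolver;

    LE0Sketch(VariableResolver resolver) {
        this.resolver = resolver;
    }

    boolean evaluate() {
        // Offsets 0005-0014: resolve union; a null result branches to "return false".
        Long value = resolver.resolveToLong("union");
        if (value == null) {
            return false;
        }
        // Offsets 0017-001C: l2i, ifeq. A zero value short-circuits the logical and.
        if ((int) value.longValue() == 0) {
            return false;
        }
        // Offsets 001F-002E: resolve begin, with the same null check.
        value = resolver.resolveToLong("begin");
        if (value == null) {
            return false;
        }
        // Offsets 0031-003A: l2d, ldc2_w 4.68E7, dcmpg, iflt.
        return (double) value.longValue() < 4.68E7;
    }

    public static void main(String[] args) {
        Map<String, Long> row = new HashMap<>();
        row.put("union", 1L);
        row.put("begin", 1000L);
        System.out.println(new LE0Sketch(row::get).evaluate());
    }
}
```

Note that when union resolves to zero, the resolver is never asked for begin, which is the short-circuit behaviour described earlier.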

Known Limitations (TBD - update)

The expr package has a number of known limitations. Some are the result of conscious design decisions, since the overriding concern in the development of this package was high performance. Others are bugs or unwelcome side effects of the implementation which may be addressed with future development.

Compiled expression objects not thread-safe

Compiled expression objects should not be used simultaneously from multiple threads. The ArithmeticExpression.compute() and LogicalExpression.evaluate() methods are not synchronized, as synchronization is expensive and is not needed in many use cases. In any case, synchronization only at the level of these methods would not be enough to ensure thread safety, since the VariableResolver itself is the most important component whose state must be synchronized to ensure the integrity of an expression result. Where it is necessary at all, synchronization must be done at the application level.
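
One possible shape for application-level synchronization is sketched below. It is only an illustration under stated assumptions: SynchronizedEvaluationSketch and its nested LogicalExpression interface are hypothetical stand-ins, a lambda takes the place of a compiled expression, and a map plays the role of the resolver's state. The point is that a single lock covers both the update of the resolver's state and the evaluation which reads it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class SynchronizedEvaluationSketch {

    /** Stand-in for a compiled logical expression. */
    interface LogicalExpression {
        boolean evaluate();
    }

    private final Object lock = new Object();

    // Plays the role of the variable resolver's mutable state.
    private final Map<String, Long> currentRow = new HashMap<>();

    // Plays the role of the compiled expression; it reads the shared state.
    private final LogicalExpression expression =
        () -> currentRow.get("union") != 0L && currentRow.get("begin") < 46_800_000L;

    /** Updates the resolver state and evaluates under one lock. */
    boolean evaluateFor(long union, long begin) {
        synchronized (lock) {
            currentRow.put("union", union);
            currentRow.put("begin", begin);
            return expression.evaluate();
        }
    }

    /** Runs concurrent evaluations and counts results which differ from the expected value. */
    static int countMismatches(int threadCount, int iterations) {
        SynchronizedEvaluationSketch shared = new SynchronizedEvaluationSketch();
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            long union = i % 2;
            boolean expected = union != 0;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    if (shared.evaluateFor(union, 0L) != expected) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println(countMismatches(4, 1000));
    }
}
```

Synchronizing only the evaluate() call would not be sufficient here, because another thread could replace the row between the update and the evaluation.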

This limitation is the result of a conscious design decision; it will not be removed in future development of the package.
Copyright (c) 2004-2017, Golden Code Development Corporation.
ALL RIGHTS RESERVED. Use is subject to license terms.