Expression Syntax
The expressions used to define
Synthetic Columns and some
Row Subsets are actually expressions in
the Java language, which are compiled into Java bytecode before
evaluation. However, this does not mean that you need to be a
Java programmer to write them; the syntax is much like that of C.
The following explanation gives
some guidance and examples for writing these expressions.
A complete tutorial on writing Java expressions is beyond
the scope of this document, but the explanation should provide enough
information for even a novice to write useful expressions.
Referencing cell values
To create a useful expression for a cell in a column, you will usually
need to refer to the cells in other columns of the same table row.
You can do this in two ways:
- By Name
- The Name of the column may be used if it is unique (no other column in
the table has the same name) and if it has a suitable form.
This means that it must have the form of a Java variable - basically
starting with a letter and continuing with
letters or numbers. In particular it cannot have any spaces in it.
The underscore and currency symbols count as
letters for this purpose.
Column names are treated case-insensitively.
- By $ID
- The "$ID" identifier of the column may always be used to refer to it.
This is just a "$" sign followed by a unique integer assigned by the
viewer to each column when it is first encountered.
You can find out the $ID identifier by looking in the
Column Metadata Window.
There is a special column whose name is "Index" and whose ID is "$0".
The value of this is the same as the row number in the unsorted table
(the grey numbers on the left of the grid in the main browser window).
The value of a variable referenced in either of these ways will be a
primitive (boolean, byte, short, char, int, long, float, double) if the
column contains one of the corresponding types. Otherwise it will
be an Object of the type held by the column.
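The "suitable form" rule for column names can be approximated in plain Java with the standard Character identifier tests. The sketch below is only an illustration: the class and method names are hypothetical, it does not check uniqueness within the table, and it does not reject Java reserved words, so the viewer's own check may differ.

```java
// Hypothetical helper, not part of the viewer: tests whether a column
// name has the form of a Java variable name.
final class ColumnNameCheck {

    /** Returns true if the name starts with a Java identifier start
     *  character and continues only with Java identifier characters. */
    static boolean isUsableByName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        // Letters, the underscore and currency symbols may start a name.
        if (!Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        // Later characters may also be digits; spaces are never allowed.
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUsableByName("B_MAG"));    // true
        System.out.println(isUsableByName("flux$"));    // true
        System.out.println(isUsableByName("B mag"));    // false (contains a space)
        System.out.println(isUsableByName("2MASS_ID")); // false (starts with a digit)
    }
}
```

A column whose name fails this kind of test can still be referenced by its $ID identifier.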
Referencing Row Subset flags
If you have any Row Subsets defined, you can also access the
boolean (true/false) flag that indicates whether the current row
is in each subset. Again there are two ways of doing this:
- By Name
- The name assigned to the subset when it was created can be used
if it is unique and if it has a suitable form. The same comments
apply as to column names above.
- By #ID
- The "#ID" identifier of the subset may always be used to refer to it.
As for the "$ID" identifier for columns above, this is a unique
index preceded by a special symbol, "#".
In either case, the value will be a boolean value; these can be useful
in conjunction with the ternary "? :" operator or
when combining existing subsets using logical operators to create
a new subset.
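As a rough illustration of both uses, the sketch below treats subset flags as ordinary Java booleans. The class, method and parameter names are hypothetical stand-ins for subset names or #ID identifiers; in the viewer you would type only the expression shown in each return statement.

```java
// Hypothetical stand-in for expressions that use Row Subset flags.
final class SubsetFlagSketch {

    /** Logical operators: a row belongs to the new subset if it is in
     *  either of two existing subsets. */
    static boolean inEither(boolean bright, boolean nearby) {
        return bright || nearby;
    }

    /** Ternary operator: choose between two column values depending on
     *  whether the row is in a subset. */
    static double pickValue(boolean bright, double ifBright, double otherwise) {
        return bright ? ifBright : otherwise;
    }

    public static void main(String[] args) {
        System.out.println(inEither(false, true));      // true
        System.out.println(pickValue(true, 1.5, 9.0));  // 1.5
        System.out.println(pickValue(false, 1.5, 9.0)); // 9.0
    }
}
```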
Operators
The operators are pretty much the same as in C.
The common ones are:
- Arithmetic
  - + (add)
  - - (subtract)
  - * (multiply)
  - / (divide)
  - % (modulus)
- Logical
  - ! (not)
  - && (and)
  - || (or)
  - == (identity)
  - != (non-identity)
  - < (less than)
  - > (greater than)
  - <= (less than or equal)
  - >= (greater than or equal)
- Other
  - [] (array dereferencing)
  - ?: (conditional switch)
  - instanceof (class membership)
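A few of these operators behave in ways that can surprise newcomers. The sketch below shows their behaviour in plain Java; the class and method names are hypothetical, and this section does not state whether the viewer's evaluator matches plain Java in every detail.

```java
// Hypothetical demonstration of some operator behaviour in plain Java.
final class OperatorSketch {

    /** Dividing two ints discards the fractional part. */
    static int intQuotient(int a, int b) {
        return a / b;
    }

    /** Casting one operand to double gives a floating-point result. */
    static double realQuotient(int a, int b) {
        return (double) a / b;
    }

    /** The modulus operator gives the remainder of the division. */
    static int remainder(int a, int b) {
        return a % b;
    }

    /** Array dereferencing combined with the conditional operator:
     *  the first element, or NaN if the array is empty. */
    static double firstOrNaN(double[] values) {
        return values.length > 0 ? values[0] : Double.NaN;
    }

    /** instanceof tests whether an Object belongs to a class. */
    static boolean isNumber(Object value) {
        return value instanceof Number;
    }

    public static void main(String[] args) {
        System.out.println(intQuotient(7, 2));  // 3
        System.out.println(realQuotient(7, 2)); // 3.5
        System.out.println(remainder(7, 2));    // 1
        System.out.println(isNumber("7"));      // false
    }
}
```

Because == tests identity, two distinct String objects holding the same text are not guaranteed to compare equal with it in plain Java; the equals method is the safer choice for comparing text.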
All the methods of the Math and String
classes are available too,
for instance:
- Math static methods
  - abs(x) (absolute value)
  - cos(x) (cosine)
  - sqrt(x) (square root)
  - max(a,b) (maximum)
  - pow(a,b) (exponentiation)
  - toRadians(deg) (angle conversion)
- String instance methods
  - startsWith(s) (comparison)
  - equals(s) (equality)
  - equalsIgnoreCase(s) (case-insensitive)
  - matches(regex) (regular expression matching)
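The following sketch exercises a few of these methods in plain Java, where the static methods need the "Math." prefix; the examples in this document write them without it. The class and method names, and the catalogue-name pattern, are hypothetical and chosen only for illustration.

```java
// Hypothetical demonstration of some Math and String methods.
final class MethodSketch {

    /** cos() expects radians, so convert from degrees first. */
    static double cosOfDegrees(double degrees) {
        return Math.cos(Math.toRadians(degrees));
    }

    /** matches() succeeds only if the whole string fits the pattern. */
    static boolean looksLikeNgcName(String name) {
        return name.matches("NGC ?[0-9]+");
    }

    /** startsWith() is case-sensitive. */
    static boolean hasNgcPrefix(String name) {
        return name.startsWith("NGC");
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(-2.5));               // 2.5
        System.out.println(Math.max(3, 7));               // 7
        System.out.println(Math.pow(2, 10));              // 1024.0
        System.out.println(looksLikeNgcName("NGC 1234"));  // true
        System.out.println(looksLikeNgcName("NGC 1234A")); // false
        System.out.println(hasNgcPrefix("ngc 1234"));      // false
    }
}
```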
Examples
Here are some examples for synthetic columns:
- Average
( first + second ) * 0.5
- Outlier clipping
min( 1000, max( value, 0 ) )
and here are some examples of boolean expressions that could be used
to define Row Subsets (or to create boolean synthetic columns):
- Within a circle
sqrt( pow($2,2) + pow($3,2) ) < 0.01
- String matching
CONSTELLATION.equalsIgnoreCase( "cygnus" )
- Combining subsets
( #1 && #2 ) && ! #3
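If you want to try out the logic of expressions like these before typing them into the viewer, one option is to wrap each one in a small Java method. This is only a sketch: the class, method and parameter names are hypothetical, the parameters stand in for column names, $ID and #ID references, and the "Math." prefix is required here because this is plain Java rather than the viewer's expression context.

```java
// Hypothetical test harness mirroring the example expressions above.
final class ExampleExpressions {

    /** ( first + second ) * 0.5 */
    static double average(double first, double second) {
        return (first + second) * 0.5;
    }

    /** min( 1000, max( value, 0 ) ) */
    static double clip(double value) {
        return Math.min(1000, Math.max(value, 0));
    }

    /** sqrt( pow($2,2) + pow($3,2) ) < 0.01, with x and y standing in
     *  for the columns $2 and $3. */
    static boolean withinCircle(double x, double y) {
        return Math.sqrt(Math.pow(x, 2) + Math.pow(y, 2)) < 0.01;
    }

    /** CONSTELLATION.equalsIgnoreCase( "cygnus" ) */
    static boolean isCygnus(String constellation) {
        return constellation.equalsIgnoreCase("cygnus");
    }

    /** ( #1 && #2 ) && ! #3, with s1..s3 standing in for the subsets. */
    static boolean combined(boolean s1, boolean s2, boolean s3) {
        return (s1 && s2) && !s3;
    }

    public static void main(String[] args) {
        System.out.println(average(1.0, 3.0));         // 2.0
        System.out.println(clip(1500.0));              // 1000.0
        System.out.println(clip(-5.0));                // 0.0
        System.out.println(withinCircle(0.003, 0.004));
        System.out.println(isCygnus("Cygnus"));        // true
        System.out.println(combined(true, true, false)); // true
    }
}
```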