Class SubsetByExpression

All Implemented Interfaces:
Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler

public class SubsetByExpression extends SimpleBatchFilter
Filters instances according to a user-specified expression.

Grammar:

boolexpr_list ::= boolexpr_list boolexpr_part | boolexpr_part;

boolexpr_part ::= boolexpr:e {: parser.setResult(e); :} ;

boolexpr ::= BOOLEAN
| true
| false
| expr < expr
| expr <= expr
| expr > expr
| expr >= expr
| expr = expr
| ( boolexpr )
| not boolexpr
| boolexpr and boolexpr
| boolexpr or boolexpr
| ATTRIBUTE is STRING
;

expr ::= NUMBER
| ATTRIBUTE
| ( expr )
| opexpr
| funcexpr
;

opexpr ::= expr + expr
| expr - expr
| expr * expr
| expr / expr
;

funcexpr ::= abs ( expr )
| sqrt ( expr )
| log ( expr )
| exp ( expr )
| sin ( expr )
| cos ( expr )
| tan ( expr )
| rint ( expr )
| floor ( expr )
| pow ( expr for base , expr for exponent )
| ceil ( expr )
;
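
The function names in funcexpr line up with methods of java.lang.Math. The sketch below illustrates that correspondence only. It assumes the usual Math semantics (for example, log as the natural logarithm and rint rounding ties to the even neighbour), which this page does not state. The class name FuncExprSketch and its helpers are hypothetical and are not part of Weka.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration: maps the one-argument function names from the
// grammar to java.lang.Math methods. The mapping is an assumption, not code
// taken from the Weka parser.
class FuncExprSketch {
    static final Map<String, DoubleUnaryOperator> UNARY = new HashMap<>();
    static {
        UNARY.put("abs", Math::abs);
        UNARY.put("sqrt", Math::sqrt);
        UNARY.put("log", Math::log);
        UNARY.put("exp", Math::exp);
        UNARY.put("sin", Math::sin);
        UNARY.put("cos", Math::cos);
        UNARY.put("tan", Math::tan);
        UNARY.put("rint", Math::rint);
        UNARY.put("floor", Math::floor);
        UNARY.put("ceil", Math::ceil);
    }

    // Applies a one-argument function by its grammar name.
    static double apply(String name, double x) {
        DoubleUnaryOperator f = UNARY.get(name);
        if (f == null) {
            throw new IllegalArgumentException("unknown function: " + name);
        }
        return f.applyAsDouble(x);
    }

    // pow takes the base first and the exponent second, as in the grammar.
    static double pow(double base, double exponent) {
        return Math.pow(base, exponent);
    }

    public static void main(String[] args) {
        System.out.println(apply("sqrt", 16.0));
        System.out.println(apply("floor", -1.5));
        System.out.println(pow(2.0, 10.0));
    }
}
```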

Notes:
- NUMBER
any integer or floating point number
(but not in scientific notation!)
- STRING
any string surrounded by single quotes;
the string may not contain a single quote though.
- ATTRIBUTE
the following placeholders are recognized for
attribute values:
- CLASS for the class value in case a class attribute is set.
- ATTxyz with xyz a number from 1 to # of attributes in the
dataset, representing the value of indexed attribute.
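
The token rules in the notes above can be approximated with regular expressions. This is a minimal sketch, not the lexer Weka uses: the notes do not say whether a leading sign or a bare leading dot is allowed in NUMBER, or whether placeholders are case-sensitive, so the patterns below assume unsigned numbers and upper-case placeholders. The class TokenSketch and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical token checks that follow the notes; not Weka code.
class TokenSketch {
    // NUMBER: integer or floating point, no exponent part (assumed unsigned).
    private static final Pattern NUMBER = Pattern.compile("\\d+(\\.\\d+)?");
    // STRING: single-quoted, with no single quote inside.
    private static final Pattern STRING = Pattern.compile("'[^']*'");
    // ATTxyz: ATT followed by a 1-based attribute index.
    private static final Pattern ATT = Pattern.compile("ATT(\\d{1,9})");

    static boolean isNumber(String token) {
        return NUMBER.matcher(token).matches();
    }

    static boolean isString(String token) {
        return STRING.matcher(token).matches();
    }

    // CLASS is only meaningful when a class attribute is set; this sketch
    // does not model that condition and accepts CLASS unconditionally.
    static boolean isAttribute(String token, int numAttributes) {
        if (token.equals("CLASS")) {
            return true;
        }
        Matcher m = ATT.matcher(token);
        if (!m.matches()) {
            return false;
        }
        int index = Integer.parseInt(m.group(1));
        return index >= 1 && index <= numAttributes;
    }

    public static void main(String[] args) {
        int numAttributes = 18; // hypothetical attribute count
        System.out.println(isNumber("3.14"));
        System.out.println(isNumber("1e5"));
        System.out.println(isString("'mammal'"));
        System.out.println(isAttribute("ATT14", numAttributes));
    }
}
```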

Examples:
- extracting only mammals and birds from the 'zoo' UCI dataset:
(CLASS is 'mammal') or (CLASS is 'bird')
- extracting only animals with at least 2 legs from the 'zoo' UCI dataset:
(ATT14 >= 2)
- extracting only instances with non-missing 'wage-increase-second-year'
from the 'labor' UCI dataset:
not ismissing(ATT3)
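
To make the meaning of the example expressions concrete, the sketch below restates them as plain Java predicates over a small hand-made list. It does not use Weka. The Row class, the sample data, and the use of NaN to mark a missing value are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical plain-Java restatement of the three example expressions.
// Row is a stand-in for a dataset row, not a Weka class.
class ExampleExpressionsSketch {
    static final class Row {
        final String classLabel; // plays the role of CLASS
        final double value;      // plays the role of a numeric ATTxyz; NaN = missing (assumption)

        Row(String classLabel, double value) {
            this.classLabel = classLabel;
            this.value = value;
        }
    }

    // (CLASS is 'mammal') or (CLASS is 'bird')
    static final Predicate<Row> MAMMAL_OR_BIRD =
        r -> r.classLabel.equals("mammal") || r.classLabel.equals("bird");

    // (ATT14 >= 2)
    static final Predicate<Row> AT_LEAST_TWO = r -> r.value >= 2;

    // not ismissing(ATT3)
    static final Predicate<Row> NOT_MISSING = r -> !Double.isNaN(r.value);

    // Made-up rows for the illustration.
    static List<Row> sample() {
        return Arrays.asList(
            new Row("mammal", 4), new Row("bird", 2),
            new Row("fish", 0), new Row("insect", Double.NaN));
    }

    // Keeps the rows for which the predicate holds, preserving order.
    static List<Row> subset(List<Row> rows, Predicate<Row> keep) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            if (keep.test(r)) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = sample();
        System.out.println(subset(rows, MAMMAL_OR_BIRD).size());
        System.out.println(subset(rows, AT_LEAST_TWO).size());
        System.out.println(subset(rows, NOT_MISSING).size());
    }
}
```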

Valid options are:

 -E <expr>
  The expression to use for filtering
  (default: true).
 -F
  Apply the filter to instances that arrive after the first
  (training) batch. By default the filter is not applied to
  them (i.e. each instance is returned unchanged).
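
The two options can be assembled into an option array of the kind that setOptions(String[]) accepts. The helper below is a minimal sketch in plain Java; the class OptionsSketch and the method buildOptions are hypothetical names and are not part of Weka.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that builds an option array for -E and -F.
class OptionsSketch {
    static String[] buildOptions(String expression, boolean filterAfterFirstBatch) {
        List<String> opts = new ArrayList<>();
        opts.add("-E");
        opts.add(expression); // the expression is passed as a single array element
        if (filterAfterFirstBatch) {
            opts.add("-F");   // a flag: present or absent, no argument
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions("(ATT14 >= 2)", true);
        System.out.println(Arrays.toString(options));
        // With Weka on the classpath, an array like this could be handed to
        // SubsetByExpression.setOptions(options).
    }
}
```

Keeping the expression as one array element avoids shell-style quoting problems with spaces and parentheses.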
Version:
$Revision: 9804 $
Author:
fracpete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • SubsetByExpression

      public SubsetByExpression()
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this filter.
      Specified by:
      globalInfo in class SimpleFilter
      Returns:
      a description of the filter suitable for displaying in the Explorer/Experimenter GUI
    • input

      public boolean input(Instance instance) throws Exception
      Input an instance for filtering. The filter requires all training instances to be read before it produces output; calling batchFinished() makes the data available. If this instance is part of a new batch, m_NewBatch is set to false.
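
The input, batchFinished, output sequence described here can be pictured with a toy model. The class below is not Weka code and models only the first batch: values are buffered by input(), and nothing can be collected until batchFinished() has run. The name BatchProtocolSketch and the use of plain double values in place of instances are assumptions made for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.DoublePredicate;

// Toy model of a batch filter's first batch; not Weka code.
class BatchProtocolSketch {
    private final DoublePredicate keep;
    private final List<Double> buffer = new ArrayList<>();
    private final Queue<Double> outputQueue = new ArrayDeque<>();

    BatchProtocolSketch(DoublePredicate keep) {
        this.keep = keep;
    }

    // Buffers the value; returns false because no output is available yet.
    boolean input(double value) {
        buffer.add(value);
        return false;
    }

    // Filters the buffered values and makes the survivors collectable.
    boolean batchFinished() {
        for (double v : buffer) {
            if (keep.test(v)) {
                outputQueue.add(v);
            }
        }
        buffer.clear();
        return !outputQueue.isEmpty();
    }

    // Returns the next filtered value, or null when none is pending.
    Double output() {
        return outputQueue.poll();
    }

    public static void main(String[] args) {
        BatchProtocolSketch filter = new BatchProtocolSketch(v -> v >= 2);
        filter.input(1);
        filter.input(3);
        filter.input(5);
        System.out.println(filter.output());        // nothing collectable yet
        System.out.println(filter.batchFinished());
        System.out.println(filter.output());
    }
}
```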
      Overrides:
      input in class SimpleBatchFilter
      Parameters:
      instance - the input instance
      Returns:
      true if the filtered instance may now be collected with output().
      Throws:
      IllegalStateException - if no input structure has been defined
      Exception - if something goes wrong
    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class SimpleFilter
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      Valid options are:

       -E <expr>
        The expression to use for filtering
        (default: true).
       -F
        Apply the filter to instances that arrive after the first
        (training) batch. By default the filter is not applied to
        them (i.e. each instance is returned unchanged).
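
How an option array with these two options and their documented defaults might be read can be sketched in plain Java. This is not how Weka parses options internally; the class ParseOptionsSketch and its fields are hypothetical, and rejecting unknown options with IllegalArgumentException is a choice made for the sketch.

```java
// Hypothetical parser for the -E and -F options; not Weka code.
class ParseOptionsSketch {
    String expression = "true";            // documented default for -E
    boolean filterAfterFirstBatch = false; // documented default for -F

    static ParseOptionsSketch parse(String[] options) {
        ParseOptionsSketch parsed = new ParseOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("-E")) {
                if (i + 1 >= options.length) {
                    throw new IllegalArgumentException("-E needs an expression");
                }
                parsed.expression = options[++i]; // consume the argument
            } else if (options[i].equals("-F")) {
                parsed.filterAfterFirstBatch = true;
            } else {
                throw new IllegalArgumentException("unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        ParseOptionsSketch p = parse(new String[] {"-E", "(ATT14 >= 2)", "-F"});
        System.out.println(p.expression);
        System.out.println(p.filterAfterFirstBatch);
    }
}
```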
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class SimpleFilter
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
      See Also:
      • SimpleFilter.reset()
    • getOptions

      public String[] getOptions()
      Gets the current settings of the filter.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class SimpleFilter
      Returns:
      an array of strings suitable for passing to setOptions
    • getCapabilities

      public Capabilities getCapabilities()
      Returns the Capabilities of this filter.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Overrides:
      getCapabilities in class Filter
      Returns:
      the capabilities of this object
    • setExpression

      public void setExpression(String value)
      Sets the expression used for filtering.
      Parameters:
      value - the expression
    • getExpression

      public String getExpression()
      Returns the expression used for filtering.
      Returns:
      the expression
    • expressionTipText

      public String expressionTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • setFilterAfterFirstBatch

      public void setFilterAfterFirstBatch(boolean b)
      Set whether to apply the filter to instances that arrive once the first (training) batch has been seen. The default is to not apply the filter and simply return each input instance unchanged. This is so that, when used in the FilteredClassifier, a test instance does not get "consumed" by the filter and a prediction is always generated.
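
The effect of this flag can be pictured with a toy model in plain Java. The class below is not Weka code: it filters the first batch, then passes later batches through untouched unless the flag is set. The name AfterFirstBatchSketch and the use of plain double values in place of instances are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.DoublePredicate;

// Toy model of the "filter after first batch" switch; not Weka code.
class AfterFirstBatchSketch {
    private final DoublePredicate keep;
    private boolean filterAfterFirstBatch = false; // documented default
    private boolean firstBatchDone = false;

    AfterFirstBatchSketch(DoublePredicate keep) {
        this.keep = keep;
    }

    void setFilterAfterFirstBatch(boolean b) {
        filterAfterFirstBatch = b;
    }

    // The first batch is always filtered; later batches only if the flag is set.
    List<Double> processBatch(List<Double> batch) {
        boolean apply = !firstBatchDone || filterAfterFirstBatch;
        List<Double> out = new ArrayList<>();
        for (double v : batch) {
            if (!apply || keep.test(v)) {
                out.add(v);
            }
        }
        firstBatchDone = true;
        return out;
    }

    public static void main(String[] args) {
        AfterFirstBatchSketch filter = new AfterFirstBatchSketch(v -> v >= 2);
        System.out.println(filter.processBatch(Arrays.asList(1.0, 3.0))); // training batch
        System.out.println(filter.processBatch(Arrays.asList(1.0)));      // later batch
    }
}
```

With the default setting, the value 1.0 in the later batch survives even though it fails the predicate, which mirrors the FilteredClassifier rationale given above.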
      Parameters:
      b - true if the filter should be applied to instances that arrive after the first (training) batch has been processed.
    • getFilterAfterFirstBatch

      public boolean getFilterAfterFirstBatch()
      Get whether to apply the filter to instances that arrive once the first (training) batch has been seen. The default is to not apply the filter and simply return each input instance unchanged. This is so that, when used in the FilteredClassifier, a test instance does not get "consumed" by the filter and a prediction is always generated.
      Returns:
      true if the filter should be applied to instances that arrive after the first (training) batch has been processed.
    • filterAfterFirstBatchTipText

      public String filterAfterFirstBatchTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the Explorer/Experimenter GUI
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Filter
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Main method for running this filter.
      Parameters:
      args - arguments for the filter: use -h for help