Class Dagging

All Implemented Interfaces:
Serializable, Cloneable, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler

This meta classifier creates a number of disjoint, stratified folds out of the data and feeds each chunk of data to a copy of the supplied base classifier. Predictions are made via majority vote, since all the generated base classifiers are put into the Vote meta classifier.
Useful for base classifiers whose training time is quadratic or worse in the number of training instances.
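
The fold construction described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Weka implementation: the class name StratifiedFoldsSketch, the split method, and the round-robin dealing strategy are all hypothetical, chosen only to show what "disjoint, stratified folds" means. Each fold would then be handed to its own copy of the base classifier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Illustrative sketch only; not the Weka implementation. */
final class StratifiedFoldsSketch {

    /**
     * Splits the instance indices 0..labels.length-1 into numFolds disjoint
     * folds. Each class's indices are shuffled and dealt out round-robin, so
     * every fold receives a similar share of each class.
     */
    static List<List<Integer>> split(int[] labels, int numFolds, long seed) {
        if (numFolds < 1) {
            throw new IllegalArgumentException("numFolds must be at least 1");
        }
        // Group instance indices by class label, keeping first-seen class order.
        Map<Integer, List<Integer>> byClass = new LinkedHashMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], k -> new ArrayList<>()).add(i);
        }
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            folds.add(new ArrayList<>());
        }
        Random random = new Random(seed);
        int next = 0; // fold that receives the next index
        for (List<Integer> indices : byClass.values()) {
            Collections.shuffle(indices, random);
            for (int index : indices) {
                folds.get(next).add(index);
                next = (next + 1) % numFolds;
            }
        }
        return folds;
    }

    public static void main(String[] args) {
        // 12 instances: eight of class 0 followed by four of class 1.
        int[] labels = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1};
        List<List<Integer>> folds = split(labels, 4, 1L);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f + ": " + folds.get(f));
        }
    }
}
```

Because every instance lands in exactly one fold, each base classifier copy trains on roughly 1/numFolds of the data, which is where the saving for super-linear learners comes from.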

For more information, see:
Ting, K. M., Witten, I. H.: Stacking Bagged and Dagged Models. In: Fourteenth international Conference on Machine Learning, San Francisco, CA, 367-375, 1997.

BibTeX:

 @inproceedings{Ting1997,
    address = {San Francisco, CA},
    author = {Ting, K. M. and Witten, I. H.},
    booktitle = {Fourteenth international Conference on Machine Learning},
    editor = {D. H. Fisher},
    pages = {367-375},
    publisher = {Morgan Kaufmann Publishers},
    title = {Stacking Bagged and Dagged Models},
    year = {1997}
 }
 

Valid options are:

 -F <folds>
  The number of folds for splitting the training set into
  smaller chunks for the base classifier.
  (default 10)
 -verbose
  Whether to print some more information during building the
  classifier.
  (default is off)
 -S <num>
  Random number seed.
  (default 1)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -W
  Full name of base classifier.
  (default: weka.classifiers.functions.SMO)
 
 Options specific to classifier weka.classifiers.functions.SMO:
 
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -no-checks
  Turns off all checks - use with caution!
  Turning them off assumes that data is purely numeric, doesn't
  contain any missing values, and has a nominal class. Turning them
  off also means that no header information will be stored if the
  machine is linear. Finally, it also assumes that no instance has
  a weight equal to 0.
  (default: checks on)
 -C <double>
  The complexity constant C. (default 1)
 -N
  Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
 -L <double>
  The tolerance parameter. (default 1.0e-3)
 -P <double>
  The epsilon for round-off error. (default 1.0e-12)
 -M
  Fit logistic models to SVM outputs. 
 -V <double>
  The number of folds for the internal
  cross-validation. (default -1, use training data)
 -W <double>
  The random number seed. (default 1)
 -K <classname and parameters>
  The Kernel to use.
  (default: weka.classifiers.functions.supportVector.PolyKernel)
 
 Options specific to kernel weka.classifiers.functions.supportVector.PolyKernel:
 
 -D
  Enables debugging output (if available) to be printed.
  (default: off)
 -no-checks
  Turns off all checks - use with caution!
  (default: checks on)
 -C <num>
  The size of the cache (a prime number), 0 for full cache and 
  -1 to turn it off.
  (default: 250007)
 -E <num>
  The Exponent to use.
  (default: 1.0)
 -L
  Use lower-order terms.
  (default: no)
Options after -- are passed to the designated classifier.
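
To illustrate the "--" convention, the following sketch separates an option array into the part meant for Dagging itself and the part forwarded to the base classifier. The class DaggingOptionsSketch and both helper methods are hypothetical and written only for this example; they are not Weka's own option parser, and the option values shown are arbitrary.

```java
import java.util.Arrays;

/** Illustrative sketch only; these helpers are hypothetical, not Weka's parser. */
final class DaggingOptionsSketch {

    /** Returns the options that follow the first "--", or an empty array if there is none. */
    static String[] baseClassifierOptions(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return Arrays.copyOfRange(options, i + 1, options.length);
            }
        }
        return new String[0];
    }

    /** Returns the options that precede the first "--" (all of them if there is none). */
    static String[] metaOptions(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return Arrays.copyOfRange(options, 0, i);
            }
        }
        return options.clone();
    }

    public static void main(String[] args) {
        String[] options = {
            "-F", "5", "-S", "42",                   // Dagging: folds and seed
            "-W", "weka.classifiers.functions.SMO",  // Dagging: base classifier
            "--",
            "-C", "2.0", "-M"                        // forwarded to SMO
        };
        System.out.println("Dagging options: " + Arrays.toString(metaOptions(options)));
        System.out.println("SMO options:     " + Arrays.toString(baseClassifierOptions(options)));
    }
}
```

Note that -D, -C, -L and -W each appear at more than one level in the list above, so the position of an option relative to "--" decides which component receives it.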

Version:
$Revision: 5306 $
Author:
Bernhard Pfahringer (bernhard at cs dot waikato dot ac dot nz), FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • Dagging

      public Dagging()
      Constructor.
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing the classifier.
      Returns:
      a description suitable for displaying in the Explorer/Experimenter GUI
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class RandomizableSingleClassifierEnhancer
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.

      The valid options are identical to those listed in the class description above. Options after -- are passed to the designated classifier.

      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class RandomizableSingleClassifierEnhancer
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the Classifier.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class RandomizableSingleClassifierEnhancer
      Returns:
      an array of strings suitable for passing to setOptions
    • getNumFolds

      public int getNumFolds()
      Gets the number of folds to use for splitting the training set.
      Returns:
      the number of folds
    • setNumFolds

      public void setNumFolds(int value)
      Sets the number of folds to use for splitting the training set.
      Parameters:
      value - the new number of folds
    • numFoldsTipText

      public String numFoldsTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • setVerbose

      public void setVerbose(boolean value)
      Sets the verbose state.
      Parameters:
      value - the verbose state
    • getVerbose

      public boolean getVerbose()
      Gets the verbose state.
      Returns:
      the verbose state
    • verboseTipText

      public String verboseTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • buildClassifier

      public void buildClassifier(Instances data) throws Exception
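      Dagging method. Builds the classifier by splitting the training data into disjoint, stratified folds and training a copy of the base classifier on each fold.
      Specified by:
      buildClassifier in class Classifier
      Parameters:
      data - the training data to be used for generating the dagged classifier.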
      Throws:
      Exception - if the classifier could not be built successfully
    • distributionForInstance

      public double[] distributionForInstance(Instance instance) throws Exception
      Calculates the class membership probabilities for the given test instance.
      Overrides:
      distributionForInstance in class Classifier
      Parameters:
      instance - the instance to be classified
      Returns:
      predicted class probability distribution
      Throws:
      Exception - if distribution can't be computed successfully
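
As a rough picture of how a majority vote over the fold models can yield class membership probabilities, the sketch below turns one predicted class index per model into a normalized vote distribution. This is an assumption made for illustration: the class MajorityVoteSketch and its methods are hypothetical, and the combination rule actually applied by the Vote meta classifier may differ.

```java
/** Illustrative sketch only; not the Weka Vote implementation. */
final class MajorityVoteSketch {

    /** Converts one predicted class index per fold model into vote fractions per class. */
    static double[] voteDistribution(int[] predictedClasses, int numClasses) {
        if (predictedClasses.length == 0) {
            throw new IllegalArgumentException("at least one prediction is required");
        }
        double[] distribution = new double[numClasses];
        for (int predicted : predictedClasses) {
            distribution[predicted] += 1.0; // one vote per fold model
        }
        for (int c = 0; c < numClasses; c++) {
            distribution[c] /= predictedClasses.length; // normalize so entries sum to 1
        }
        return distribution;
    }

    /** Returns the index of the largest entry; ties go to the lowest index. */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical predictions from five fold models on a three-class problem.
        int[] predictions = {1, 0, 1, 1, 2};
        double[] distribution = voteDistribution(predictions, 3);
        System.out.println("votes per class: " + java.util.Arrays.toString(distribution));
        System.out.println("majority class:  " + argMax(distribution));
    }
}
```
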
    • toString

      public String toString()
      Returns a description of the classifier.
      Overrides:
      toString in class Object
      Returns:
      description of the classifier as a string
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class Classifier
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Main method for testing this class.
      Parameters:
      args - the options