Class DBSCAN

All Implemented Interfaces:
Serializable, Cloneable, Clusterer, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler

public class DBSCAN extends AbstractClusterer implements OptionHandler, TechnicalInformationHandler
Basic implementation of the DBSCAN clustering algorithm. It should *not* be used as a reference for runtime benchmarks: more sophisticated implementations exist! Clustering of new instances is not supported. More info:

Martin Ester, Hans-Peter Kriegel, Joerg Sander, Xiaowei Xu: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: Second International Conference on Knowledge Discovery and Data Mining, 226-231, 1996.

BibTeX:

 @inproceedings{Ester1996,
    author = {Martin Ester and Hans-Peter Kriegel and Joerg Sander and Xiaowei Xu},
    booktitle = {Second International Conference on Knowledge Discovery and Data Mining},
    editor = {Evangelos Simoudis and Jiawei Han and Usama M. Fayyad},
    pages = {226-231},
    publisher = {AAAI Press},
    title = {A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise},
    year = {1996}
 }
 

Valid options are:

 -E <double>
  epsilon (default = 0.9)
 -M <int>
  minPoints (default = 6)
 -I <String>
  index (database) used for DBSCAN (default = weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase)
 -D <String>
  distance-type (default = weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclideanDataObject)
Version:
$Revision: 9434 $
Author:
Matthias Schubert (schubert@dbs.ifi.lmu.de), Zhanna Melnikova-Albrecht (melnikov@cip.ifi.lmu.de), Rainer Holzmann (holzmann@cip.ifi.lmu.de)
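
To make the roles of epsilon and minPoints concrete, here is a minimal plain-Java sketch of the density-based procedure described in the cited paper. It is not the WEKA implementation: the class DbscanSketch and all of its method names are hypothetical. It assumes Euclidean distance, a sequential scan for neighbourhood queries, and that a point counts as its own neighbour.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative DBSCAN sketch; hypothetical class, not part of WEKA. */
final class DbscanSketch {
    static final int UNVISITED = -2;
    static final int NOISE = -1;

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Sequential scan: indices of all points within epsilon of points[p], including p. */
    static List<Integer> regionQuery(double[][] points, int p, double epsilon) {
        List<Integer> neighbours = new ArrayList<>();
        for (int q = 0; q < points.length; q++) {
            if (distance(points[p], points[q]) <= epsilon) {
                neighbours.add(q);
            }
        }
        return neighbours;
    }

    /** Returns one label per point: 0, 1, ... for clusters, NOISE for noise. */
    static int[] cluster(double[][] points, double epsilon, int minPoints) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int clusterId = 0;
        for (int p = 0; p < points.length; p++) {
            if (labels[p] != UNVISITED) {
                continue;
            }
            List<Integer> seeds = regionQuery(points, p, epsilon);
            if (seeds.size() < minPoints) {
                labels[p] = NOISE; // may later be relabelled as a border point
                continue;
            }
            labels[p] = clusterId; // p is a core point and starts a new cluster
            Deque<Integer> queue = new ArrayDeque<>(seeds);
            while (!queue.isEmpty()) {
                int q = queue.poll();
                if (labels[q] == NOISE) {
                    labels[q] = clusterId; // border point reached from a core point
                }
                if (labels[q] != UNVISITED) {
                    continue;
                }
                labels[q] = clusterId;
                List<Integer> qNeighbours = regionQuery(points, q, epsilon);
                if (qNeighbours.size() >= minPoints) {
                    queue.addAll(qNeighbours); // q is also a core point: keep expanding
                }
            }
            clusterId++;
        }
        return labels;
    }
}
```

Calling DbscanSketch.cluster(points, 0.9, 6) would mirror the documented defaults. Each neighbourhood query here is a full scan, so the sketch is quadratic in the number of points, which is one reason a simple implementation is a poor runtime benchmark.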
  • Constructor Details

    • DBSCAN

      public DBSCAN()
  • Method Details

    • getCapabilities

      public Capabilities getCapabilities()
      Returns default capabilities of the clusterer.
      Specified by:
      getCapabilities in interface CapabilitiesHandler
      Specified by:
      getCapabilities in interface Clusterer
      Overrides:
      getCapabilities in class AbstractClusterer
      Returns:
      the capabilities of this clusterer
    • buildClusterer

      public void buildClusterer(Instances instances) throws Exception
      Generates a clustering via DBSCAN.
      Specified by:
      buildClusterer in interface Clusterer
      Specified by:
      buildClusterer in class AbstractClusterer
      Parameters:
      instances - The instances that need to be clustered
      Throws:
      Exception - If clustering was not successful
    • clusterInstance

      public int clusterInstance(Instance instance) throws Exception
      Assigns a given instance to a cluster.
      Specified by:
      clusterInstance in interface Clusterer
      Overrides:
      clusterInstance in class AbstractClusterer
      Parameters:
      instance - The instance to be assigned to a cluster
      Returns:
      int The number of the assigned cluster as an integer
      Throws:
      Exception - If instance could not be clustered successfully
    • numberOfClusters

      public int numberOfClusters() throws Exception
      Returns the number of clusters.
      Specified by:
      numberOfClusters in interface Clusterer
      Specified by:
      numberOfClusters in class AbstractClusterer
      Returns:
      int The number of clusters generated for a training dataset.
      Throws:
      Exception - if number of clusters could not be returned successfully
    • listOptions

      public Enumeration listOptions()
      Returns an enumeration of all the available options.
      Specified by:
      listOptions in interface OptionHandler
      Returns:
      Enumeration An enumeration of all available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e. incremental setting of options is not possible).

      Valid options are:

       -E <double>
        epsilon (default = 0.9)
       -M <int>
        minPoints (default = 6)
       -I <String>
        index (database) used for DBSCAN (default = weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase)
       -D <String>
        distance-type (default = weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclideanDataObject)
      Specified by:
      setOptions in interface OptionHandler
      Parameters:
      options - The list of options as an array of strings
      Throws:
      Exception - If an option is not supported
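
The "set or reset" behaviour described above can be illustrated with a small plain-Java sketch. OptionSketch is a hypothetical class, not WEKA code. It handles only the -E and -M flags, uses the documented defaults, and, unlike the method documented here, silently ignores unknown flags instead of throwing.

```java
/** Hypothetical holder illustrating reset-then-apply option handling; not WEKA code. */
final class OptionSketch {
    double epsilon = 0.9;
    int minPoints = 6;

    /** Resets both settings to the documented defaults, then applies any flags present. */
    void setOptions(String[] options) {
        epsilon = 0.9;
        minPoints = 6;
        for (int i = 0; i + 1 < options.length; i++) {
            if (options[i].equals("-E")) {
                epsilon = Double.parseDouble(options[++i]);
            } else if (options[i].equals("-M")) {
                minPoints = Integer.parseInt(options[++i]);
            }
        }
    }

    /** Returns the current settings in the same flag/value layout. */
    String[] getOptions() {
        return new String[] {"-E", String.valueOf(epsilon), "-M", String.valueOf(minPoints)};
    }
}
```

Because every call starts from the defaults, passing only "-M 10" after an earlier "-E 0.5" leaves epsilon at 0.9 rather than 0.5. That is what "incremental setting of options is not possible" means in practice.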
    • getOptions

      public String[] getOptions()
      Gets the current option settings for the OptionHandler.
      Specified by:
      getOptions in interface OptionHandler
      Returns:
      String[] The list of current option settings as an array of strings
    • databaseForName

      public Database databaseForName(String database_Type, Instances instances)
      Returns a new instance of the specified database class
      Parameters:
      database_Type - Class name of the specified database
      instances - Instances that were delivered from WEKA
      Returns:
      Database New constructed Database
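
Since the database is selected by a fully qualified class name, a lookup like this is commonly done with reflection. The sketch below shows that general pattern in plain Java and should be read as an assumption about the approach, not as the WEKA source. ForNameSketch and its forName method are hypothetical, and java.lang.StringBuilder stands in for a database class that takes one constructor argument.

```java
import java.lang.reflect.Constructor;

/** Hypothetical factory illustrating lookup of a class by its fully qualified name. */
final class ForNameSketch {
    /** Instantiates className via its public one-argument constructor, or returns null on failure. */
    static Object forName(String className, Class<?> argType, Object arg) {
        try {
            Constructor<?> ctor = Class.forName(className).getConstructor(argType);
            return ctor.newInstance(arg);
        } catch (ReflectiveOperationException e) {
            return null; // unknown class, no matching constructor, or constructor failure
        }
    }
}
```

For example, ForNameSketch.forName("java.lang.StringBuilder", String.class, "abc") yields a StringBuilder holding "abc", while an unknown class name yields null.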
    • dataObjectForName

      public DataObject dataObjectForName(String database_distanceType, Instance instance, String key, Database database)
      Returns a new instance of the specified DataObject class
      Parameters:
      database_distanceType - Class name of the specified distance-type
      instance - The original instance to be held by this DataObject
      key - Key for this DataObject
      database - Link to the database
      Returns:
      DataObject New constructed DataObject
    • setMinPoints

      public void setMinPoints(int minPoints)
      Sets a new value for minPoints
      Parameters:
      minPoints - MinPoints
    • setEpsilon

      public void setEpsilon(double epsilon)
      Sets a new value for epsilon
      Parameters:
      epsilon - Epsilon
    • getEpsilon

      public double getEpsilon()
      Returns the value of epsilon
      Returns:
      double Epsilon
    • getMinPoints

      public int getMinPoints()
      Returns the value of minPoints
      Returns:
      int MinPoints
    • getDatabase_distanceType

      public String getDatabase_distanceType()
      Returns the distance-type
      Returns:
      String Distance-type
    • getDatabase_Type

      public String getDatabase_Type()
      Returns the type of the used index (database)
      Returns:
      String Index-type
    • setDatabase_distanceType

      public void setDatabase_distanceType(String database_distanceType)
      Sets a new distance-type
      Parameters:
      database_distanceType - The new distance-type
    • setDatabase_Type

      public void setDatabase_Type(String database_Type)
      Sets a new database-type
      Parameters:
      database_Type - The new database-type
    • epsilonTipText

      public String epsilonTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • minPointsTipText

      public String minPointsTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • database_TypeTipText

      public String database_TypeTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • database_distanceTypeTipText

      public String database_distanceTypeTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • globalInfo

      public String globalInfo()
      Returns a string describing this DataMining-Algorithm
      Returns:
      String Information for the gui-explorer
    • getTechnicalInformation

      public TechnicalInformation getTechnicalInformation()
      Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
      Specified by:
      getTechnicalInformation in interface TechnicalInformationHandler
      Returns:
      the technical information about this class
    • toString

      public String toString()
      Returns a description of the clusterer
      Overrides:
      toString in class Object
      Returns:
      a string representation of the clusterer
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class AbstractClusterer
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Main Method for testing DBSCAN
      Parameters:
      args - Valid parameters are: -E epsilon (default = 0.9); -M minPoints (default = 6); -I index-type (default = weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase); -D distance-type (default = weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclideanDataObject)