Class PriorEstimation

java.lang.Object
weka.associations.PriorEstimation
All Implemented Interfaces:
Serializable, RevisionHandler

public class PriorEstimation extends Object implements Serializable, RevisionHandler
Class implementing the prior estimation of the predictive apriori algorithm for mining association rules. Reference: T. Scheffer (2001). Finding Association Rules That Trade Support Optimally against Confidence. Proc. of the 5th European Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD'01), pp. 424-435. Freiburg, Germany: Springer-Verlag.

Version:
$Revision: 1.7 $
Author:
Stefan Mutter (mutter@cs.waikato.ac.nz)
See Also:
  • Constructor Summary

    Constructors
    Constructor
    Description
    PriorEstimation(Instances instances, int numRules, int numIntervals, boolean car)
    Constructor
  • Method Summary

    Modifier and Type
    Method
    Description
    final RuleItem
    addCons(int[] itemArray)
    generates a class association rule out of a given premise.
    final void
    buildDistribution(double conf, double length)
    updates the distribution of the confidence values.
    final double
    calculatePriorSum(boolean weighted, double mPoint)
    calculates either the numerator or the denominator of the prior equation
    final Hashtable
    estimatePrior()
    Method to estimate the prior probabilities
    final double
    findIntervall(double conf)
    searches the mid point of the interval a given confidence value falls into
    final void
    generateDistribution()
    Calculates the prior distribution.
    final double[]
    getMidPoints()
    returns an ordered array of all mid points
    String
    getRevision()
    Returns the revision string.
    static final double
    logbinomialCoefficient(int upperIndex, int lowerIndex)
    Method that calculates the base 2 logarithm of a binomial coefficient
    double
    midPoint(double size, int number)
    calculates the mid point of an interval
    final void
    midPoints()
    splits the interval [0,1] into a predefined number of intervals and calculates their mid points
    final int[]
    randomCARule(int maxLength, int actualLength, Random randNum)
    Constructs an item set of certain length randomly.
    final int[]
    randomRule(int maxLength, int actualLength, Random randNum)
    Constructs an item set of certain length randomly.
    final RuleItem
    splitItemSet(int premiseLength, int[] itemArray)
    splits an item set into premise and consequence and thereby constructs an association rule.
    final void
    updateCounters(ItemSet itemSet)
    updates the support count of an item set

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • PriorEstimation

      public PriorEstimation(Instances instances, int numRules, int numIntervals, boolean car)
      Constructor
      Parameters:
      instances - the instances to be used for generating the associations
      numRules - the number of random rules used for generating the prior
      numIntervals - the number of intervals to discretise [0,1]
      car - flag indicating whether standard or class association rules are mined
  • Method Details

    • generateDistribution

      public final void generateDistribution() throws Exception
      Calculates the prior distribution.
      Throws:
      Exception - if prior can't be estimated successfully
    • randomRule

      public final int[] randomRule(int maxLength, int actualLength, Random randNum)
      Constructs an item set of certain length randomly. This method is used for standard association rule mining.
      Parameters:
      maxLength - the number of attributes of the instances
      actualLength - the number of attributes that should be present in the item set
      randNum - the random number generator
      Returns:
      a randomly constructed item set in form of an int array
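As a rough illustration of how such an item set could be constructed, the sketch below picks actualLength distinct attribute positions and assigns each a random value index. It is not the Weka implementation. The class name RandomItemSetSketch, the numValues parameter (number of distinct values per attribute) and the use of -1 to mark an absent attribute are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of Weka.
class RandomItemSetSketch {

    // numValues[i] is the assumed number of distinct values of attribute i.
    // Returns an array of length numValues.length in which exactly
    // actualLength positions hold a random value index and all others hold -1.
    static int[] randomItemSet(int[] numValues, int actualLength, Random randNum) {
        int maxLength = numValues.length;
        if (actualLength < 0 || actualLength > maxLength) {
            throw new IllegalArgumentException("actualLength must be in [0, maxLength]");
        }
        int[] items = new int[maxLength];
        Arrays.fill(items, -1); // -1 marks "attribute not in the item set" (assumption)
        int chosen = 0;
        while (chosen < actualLength) {
            int attr = randNum.nextInt(maxLength);
            if (items[attr] == -1) { // skip attributes that were already picked
                items[attr] = randNum.nextInt(numValues[attr]);
                chosen++;
            }
        }
        return items;
    }

    public static void main(String[] args) {
        int[] items = randomItemSet(new int[] {2, 3, 4, 2, 5}, 3, new Random(1));
        System.out.println(Arrays.toString(items));
    }
}
```

Passing a seeded Random, as in main, makes the generated item sets reproducible.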
    • randomCARule

      public final int[] randomCARule(int maxLength, int actualLength, Random randNum)
      Constructs an item set of certain length randomly. This method is used for class association rule mining.
      Parameters:
      maxLength - the number of attributes of the instances
      actualLength - the number of attributes that should be present in the item set
      randNum - the random number generator
      Returns:
      a randomly constructed item set in form of an int array
    • buildDistribution

      public final void buildDistribution(double conf, double length)
      updates the distribution of the confidence values. For every confidence value the interval to which it belongs is searched and the confidence is added to the confidence already found in this interval.
      Parameters:
      conf - the confidence of the randomly created rule
      length - the length of the randomly created rule
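The accumulation described above can be sketched as follows. This is an illustrative stand-in rather than the Weka code: the class name, the equal-width interval lookup and the choice to keep one running sum per (rule length, interval) pair are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical accumulator, not part of Weka.
class ConfidenceDistributionSketch {
    private final int numIntervals;
    // Running confidence sums, keyed by "length:intervalIndex" (assumed layout).
    private final Map<String, Double> sums = new HashMap<>();

    ConfidenceDistributionSketch(int numIntervals) {
        if (numIntervals <= 0) {
            throw new IllegalArgumentException("numIntervals must be positive");
        }
        this.numIntervals = numIntervals;
    }

    // Finds the interval the confidence falls into and adds the confidence
    // to the sum already stored for that interval and rule length.
    void add(double conf, int length) {
        if (conf < 0.0 || conf > 1.0) {
            throw new IllegalArgumentException("conf must be in [0,1]");
        }
        // conf == 1.0 is clamped into the last interval.
        int index = Math.min((int) (conf * numIntervals), numIntervals - 1);
        sums.merge(key(index, length), conf, Double::sum);
    }

    // Returns the accumulated confidence, or 0.0 if nothing was added yet.
    double sum(int index, int length) {
        return sums.getOrDefault(key(index, length), 0.0);
    }

    private static String key(int index, int length) {
        return length + ":" + index;
    }
}
```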
    • findIntervall

      public final double findIntervall(double conf)
      searches the mid point of the interval a given confidence value falls into
      Parameters:
      conf - the confidence of a rule
      Returns:
      the mid point of the interval the confidence belongs to
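One way to perform this lookup, given the ordered mid points, is a binary search followed by choosing the nearer neighbour. The sketch below assumes equal-width intervals, so the nearest mid point identifies the enclosing interval. The class and method names are hypothetical, and the actual Weka implementation may differ.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka.
class IntervalLookupSketch {

    // midPoints must be sorted in ascending order.
    // A value exactly on an interval boundary is assigned to the lower interval.
    static double nearestMidPoint(double[] midPoints, double conf) {
        int pos = Arrays.binarySearch(midPoints, conf);
        if (pos >= 0) {
            return midPoints[pos]; // conf is itself a mid point
        }
        int insertion = -pos - 1; // index of the first mid point greater than conf
        if (insertion == 0) {
            return midPoints[0];
        }
        if (insertion == midPoints.length) {
            return midPoints[midPoints.length - 1];
        }
        double lower = midPoints[insertion - 1];
        double upper = midPoints[insertion];
        return (conf - lower <= upper - conf) ? lower : upper;
    }
}
```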
    • calculatePriorSum

      public final double calculatePriorSum(boolean weighted, double mPoint)
      calculates either the numerator or the denominator of the prior equation, depending on the weighted flag
      Parameters:
      weighted - indicates whether the numerator or the denominator is calculated
      mPoint - the mid point of an interval
      Returns:
      the numerator or denominator of the prior equation
    • logbinomialCoefficient

      public static final double logbinomialCoefficient(int upperIndex, int lowerIndex)
      Method that calculates the base 2 logarithm of a binomial coefficient
      Parameters:
      upperIndex - upper index of the binomial coefficient
      lowerIndex - lower index of the binomial coefficient
      Returns:
      the base 2 logarithm of the binomial coefficient
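Working in the log domain avoids overflow for large coefficients. A minimal sketch, using the identity log2 C(n,k) = sum over i = 1..k of [log2(n - k + i) - log2(i)], is shown below. The class and method names are hypothetical, and this is not claimed to match how Weka computes the value.

```java
// Hypothetical helper, not part of Weka.
class LogBinomialSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Base 2 logarithm of the binomial coefficient "n choose k".
    static double logBinomial(int n, int k) {
        if (k < 0 || k > n) {
            throw new IllegalArgumentException("need 0 <= k <= n");
        }
        k = Math.min(k, n - k); // symmetry: C(n,k) == C(n,n-k), fewer terms to sum
        double result = 0.0;
        for (int i = 1; i <= k; i++) {
            result += log2(n - k + i) - log2(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // C(5,2) = 10, so this prints a value close to log2(10).
        System.out.println(logBinomial(5, 2));
    }
}
```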
    • estimatePrior

      public final Hashtable estimatePrior() throws Exception
      Method to estimate the prior probabilities
      Returns:
      a hashtable containing the prior probabilities
      Throws:
      Exception - if the prior cannot be calculated
    • midPoints

      public final void midPoints()
      splits the interval [0,1] into a predefined number of intervals and calculates their mid points
    • midPoint

      public double midPoint(double size, int number)
      calculates the mid point of an interval
      Parameters:
      size - the size of each interval
      number - the number of the interval. The intervals are numbered from 0 to m_numIntervals.
      Returns:
      the mid point of the interval
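For equal-width intervals, the mid point of interval number k with width size is size * k + size / 2. The sketch below applies this to every interval of [0,1]. It assumes zero-based interval numbers and uses hypothetical class and method names; it is an illustration, not the Weka source.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka.
class MidPointSketch {

    // Mid point of the interval with the given zero-based number (assumption).
    static double midPoint(double size, int number) {
        return size * number + size / 2.0;
    }

    // Splits [0,1] into numIntervals equal parts and returns their mid points
    // in ascending order.
    static double[] midPoints(int numIntervals) {
        if (numIntervals <= 0) {
            throw new IllegalArgumentException("numIntervals must be positive");
        }
        double size = 1.0 / numIntervals;
        double[] result = new double[numIntervals];
        for (int i = 0; i < numIntervals; i++) {
            result[i] = midPoint(size, i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(midPoints(4)));
    }
}
```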
    • getMidPoints

      public final double[] getMidPoints()
      returns an ordered array of all mid points
      Returns:
      an ordered array of doubles containing all mid points
    • splitItemSet

      public final RuleItem splitItemSet(int premiseLength, int[] itemArray)
      splits an item set into premise and consequence and thereby constructs an association rule. The length of the premise is given. The attributes for premise and consequence are chosen randomly. The result is a RuleItem.
      Parameters:
      premiseLength - the length of the premise
      itemArray - a (randomly generated) item set
      Returns:
      a randomly generated association rule stored in a RuleItem
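The random split can be pictured with plain int arrays. The sketch below shuffles the positions that carry an item and moves the first premiseLength of them into the premise and the rest into the consequence. It does not use RuleItem and is not the Weka implementation. The class name, the explicit Random parameter, the int[][] return value and the -1 marker for an absent attribute are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Weka.
class SplitItemSetSketch {

    // Returns {premise, consequence}; both have the same length as itemArray
    // and use -1 for attributes that are not part of them (assumption).
    static int[][] split(int premiseLength, int[] itemArray, Random randNum) {
        List<Integer> present = new ArrayList<>();
        for (int i = 0; i < itemArray.length; i++) {
            if (itemArray[i] != -1) {
                present.add(i);
            }
        }
        if (premiseLength < 0 || premiseLength > present.size()) {
            throw new IllegalArgumentException("premiseLength exceeds item set size");
        }
        Collections.shuffle(present, randNum); // random assignment of attributes
        int[] premise = new int[itemArray.length];
        int[] consequence = new int[itemArray.length];
        Arrays.fill(premise, -1);
        Arrays.fill(consequence, -1);
        for (int j = 0; j < present.size(); j++) {
            int pos = present.get(j);
            if (j < premiseLength) {
                premise[pos] = itemArray[pos];
            } else {
                consequence[pos] = itemArray[pos];
            }
        }
        return new int[][] {premise, consequence};
    }
}
```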
    • addCons

      public final RuleItem addCons(int[] itemArray)
      generates a class association rule out of a given premise. It randomly chooses a class label as consequence.
      Parameters:
      itemArray - the (randomly constructed) premise of the class association rule
      Returns:
      a class association rule stored in a RuleItem
    • updateCounters

      public final void updateCounters(ItemSet itemSet)
      updates the support count of an item set
      Parameters:
      itemSet - the item set
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision