weka.associations
Class CaRuleGeneration

java.lang.Object
  extended by weka.associations.RuleGeneration
      extended by weka.associations.CaRuleGeneration
All Implemented Interfaces:
java.io.Serializable

public class CaRuleGeneration
extends RuleGeneration
implements java.io.Serializable

Class implementing the rule generation procedure of the predictive apriori algorithm for class association rules. For association rules in general the method is described in: T. Scheffer (2001). Finding Association Rules That Trade Support Optimally against Confidence. Proc. of the 5th European Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD'01), pp. 424-435. Freiburg, Germany: Springer-Verlag.

The implementation follows the paper except for the criterion for adding a rule to the output of the n best rules. A rule is added if the expected predictive accuracy of this rule is among the n best and it is not subsumed by a rule with at least the same expected predictive accuracy (this criterion is taken from an unpublished manuscript by T. Scheffer).

Version:
$Revision: 1.1 $
Author:
Stefan Mutter (mutter@cs.waikato.ac.nz)
See Also:
Serialized Form

Constructor Summary
CaRuleGeneration(ItemSet itemSet)
          Constructor
 
Method Summary
static boolean aSubsumesB(RuleItem a, RuleItem b)
          Method that decides whether or not rule a subsumes rule b.
 java.util.TreeSet generateRules(int numRules, double[] midPoints, java.util.Hashtable priors, double expectation, Instances instances, java.util.TreeSet best, int genTime)
          Generates all rules for an item set.
static FastVector singleConsequence(Instances instances)
          Generates a consequence of length 1 for a class association rule.
static FastVector singletons(Instances instances)
          Converts the header info of the given set of instances into a set of item sets (singletons).
 
Methods inherited from class weka.associations.RuleGeneration
binomialDistribution, change, count, expectation, removeRedundant, singleConsequence
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CaRuleGeneration

public CaRuleGeneration(ItemSet itemSet)
Constructor

Parameters:
itemSet - the item set that forms the premise of the rule
Method Detail

generateRules

public java.util.TreeSet generateRules(int numRules,
                                       double[] midPoints,
                                       java.util.Hashtable priors,
                                       double expectation,
                                       Instances instances,
                                       java.util.TreeSet best,
                                       int genTime)
Generates all rules for an item set. The item set is the premise.

Overrides:
generateRules in class RuleGeneration
Parameters:
numRules - the number of association rules the user wants to mine. This number equals the size n of the list of the best rules.
midPoints - the mid points of the intervals
priors - Hashtable that contains the prior probabilities
expectation - the minimum value of the expected predictive accuracy that is needed to get into the list of the best rules
instances - the instances for which association rules are generated
best - the list of the n best rules. The list is implemented as a TreeSet
genTime - the maximum time of generation
Returns:
all the rules with minimum confidence for the given item set
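
The interplay of numRules, expectation and best can be illustrated with a minimal, self-contained sketch. It is not the Weka implementation: ScoredRule, BY_ACCURACY and offer are hypothetical names, the subsumption test is left out, and it is assumed that the threshold stays unchanged until the list holds numRules rules and afterwards equals the accuracy of the worst rule kept.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class BestRulesSketch {

    /** Hypothetical stand-in for a scored rule; not Weka's RuleItem. */
    static final class ScoredRule {
        final String label;     // human-readable rule text
        final double accuracy;  // expected predictive accuracy

        ScoredRule(String label, double accuracy) {
            this.label = label;
            this.accuracy = accuracy;
        }
    }

    // Order by accuracy, then by label, so that two different rules with
    // the same accuracy are not treated as duplicates by the TreeSet.
    static final Comparator<ScoredRule> BY_ACCURACY = (x, y) -> {
        int c = Double.compare(x.accuracy, y.accuracy);
        return c != 0 ? c : x.label.compareTo(y.label);
    };

    /**
     * Offers a candidate to the list of the numRules best rules and returns
     * the updated admission threshold. Assumes numRules >= 1.
     */
    static double offer(TreeSet<ScoredRule> best, ScoredRule candidate,
                        int numRules, double threshold) {
        boolean full = best.size() >= numRules;
        if (!full || candidate.accuracy > threshold) {
            best.add(candidate);
            if (best.size() > numRules) {
                best.pollFirst(); // drop the least accurate rule
            }
        }
        // Once the list is full, the worst kept rule defines the threshold.
        return best.size() >= numRules ? best.first().accuracy : threshold;
    }

    public static void main(String[] args) {
        TreeSet<ScoredRule> best = new TreeSet<>(BY_ACCURACY);
        double threshold = 0.0;
        threshold = offer(best, new ScoredRule("A", 0.6), 2, threshold);
        threshold = offer(best, new ScoredRule("B", 0.9), 2, threshold);
        threshold = offer(best, new ScoredRule("C", 0.7), 2, threshold);
        threshold = offer(best, new ScoredRule("D", 0.5), 2, threshold);
        for (ScoredRule r : best.descendingSet()) {
            System.out.println(r.label + " " + r.accuracy);
        }
        System.out.println("threshold = " + threshold);
    }
}
```

In this sketch rule A is evicted when C arrives, and D is rejected because its accuracy does not exceed the threshold of 0.7 set by C.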

aSubsumesB

public static boolean aSubsumesB(RuleItem a,
                                 RuleItem b)
Method that decides whether or not rule a subsumes rule b. The definition of subsumption is: rule a subsumes rule b if a subsumes b AND a has at least the same expected predictive accuracy as b.

Parameters:
a - an association rule stored as a RuleItem
b - an association rule stored as a RuleItem
Returns:
true if rule a subsumes rule b or false otherwise.
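
One possible reading of this definition is sketched below. The Rule class is a hypothetical stand-in for RuleItem, and several points are assumptions rather than documented behaviour: a premise is encoded as one slot per attribute with -1 marking an unused attribute, "a subsumes b" is taken to mean that every condition in the premise of a also appears in the premise of b, and both rules are required to predict the same class value.

```java
public class SubsumptionSketch {

    /** Hypothetical stand-in for a class association rule; not Weka's RuleItem. */
    static final class Rule {
        final int[] premise;    // one slot per attribute, -1 = attribute unused (assumed encoding)
        final int consequence;  // index of the predicted class value
        final double accuracy;  // expected predictive accuracy

        Rule(int[] premise, int consequence, double accuracy) {
            this.premise = premise.clone();
            this.consequence = consequence;
            this.accuracy = accuracy;
        }
    }

    /**
     * Returns true if a is at least as general as b, predicts the same
     * class value, and has at least the same expected predictive accuracy.
     */
    static boolean aSubsumesB(Rule a, Rule b) {
        if (a.consequence != b.consequence) return false; // assumption: same class required
        if (a.accuracy < b.accuracy) return false;
        if (a.premise.length != b.premise.length) return false;
        for (int k = 0; k < a.premise.length; k++) {
            // Every condition that a imposes must also be imposed by b.
            if (a.premise[k] != -1 && a.premise[k] != b.premise[k]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Rule general = new Rule(new int[] {0, -1, -1}, 1, 0.90);
        Rule specific = new Rule(new int[] {0, 2, -1}, 1, 0.85);
        System.out.println(aSubsumesB(general, specific)); // true
        System.out.println(aSubsumesB(specific, general)); // false
    }
}
```

Under this reading a rule also subsumes itself, so a caller that filters a list of rules would have to avoid comparing a rule with itself.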

singletons

public static FastVector singletons(Instances instances)
                             throws java.lang.Exception
Converts the header info of the given set of instances into a set of item sets (singletons). The ordering of values in the header file determines the lexicographic order.

Parameters:
instances - the set of instances whose header info is to be used
Returns:
a set of item sets, each containing a single item
Throws:
java.lang.Exception - if singletons can't be generated successfully
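
The idea of turning header information into single-item sets can be sketched without Weka as follows. The sketch assumes that the header is reduced to the number of nominal values per attribute and that an item set is an int array with one slot per attribute, where -1 means the attribute is not part of the set. SingletonsSketch is a hypothetical name, and any special treatment of the class attribute is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingletonsSketch {

    /**
     * Builds one single-item set per (attribute, value) pair, in header order.
     * numValues[i] is the number of nominal values of attribute i.
     */
    static List<int[]> singletons(int[] numValues) {
        List<int[]> result = new ArrayList<>();
        for (int attr = 0; attr < numValues.length; attr++) {
            for (int value = 0; value < numValues[attr]; value++) {
                int[] items = new int[numValues.length];
                Arrays.fill(items, -1);   // no attribute is part of the set yet
                items[attr] = value;      // the single item: attribute = value
                result.add(items);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two attributes: the first with 2 values, the second with 3.
        for (int[] itemSet : singletons(new int[] {2, 3})) {
            System.out.println(Arrays.toString(itemSet));
        }
    }
}
```

Because the loops follow the header order, the resulting list is ordered first by attribute index and then by value index, which mirrors the lexicographic order mentioned above.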

singleConsequence

public static FastVector singleConsequence(Instances instances)
Generates a consequence of length 1 for a class association rule.

Parameters:
instances - the instances under consideration
Returns:
FastVector with consequences of length 1