Package weka.associations
Class AprioriItemSet
java.lang.Object
weka.associations.ItemSet
weka.associations.AprioriItemSet
- All Implemented Interfaces:
Serializable, RevisionHandler
Class for storing a set of items. Item sets are stored in a lexicographic
order, which is determined by the header information of the set of instances
used for generating the set of items. All methods in this class assume that
item sets are stored in lexicographic order. The class provides methods that
are used in the Apriori algorithm to construct association rules.
- Version:
- $Revision: 12014 $
- Author:
- Eibe Frank (eibe@cs.waikato.ac.nz), Stefan Mutter (mutter@cs.waikato.ac.nz)
-
Constructor Summary
Constructor | Description
AprioriItemSet(int totalTrans) | Constructor
-
Method Summary
Modifier and Type | Method | Description
static double | confidenceForRule(AprioriItemSet premise, AprioriItemSet consequence) | Outputs the confidence for a rule.
double | convictionForRule(AprioriItemSet premise, AprioriItemSet consequence, int premiseCount, int consequenceCount) | Outputs the conviction for a rule.
ArrayList<Object>[] | generateRules(double minConfidence, ArrayList<Hashtable<ItemSet, Integer>> hashtables, int numItemsInSet) | Generates all rules for an item set.
final ArrayList<Object>[] | generateRulesBruteForce(double minMetric, int metricType, ArrayList<Hashtable<ItemSet, Integer>> hashtables, int numItemsInSet, int numTransactions, double significanceLevel) | Generates all significant rules for an item set.
String | getRevision() | Returns the revision string.
double | leverageForRule(AprioriItemSet premise, AprioriItemSet consequence, int premiseCount, int consequenceCount) | Outputs the leverage for a rule.
double | liftForRule(AprioriItemSet premise, AprioriItemSet consequence, int consequenceCount) | Outputs the lift for a rule.
static ArrayList<Object> | mergeAllItemSets(ArrayList<Object> itemSets, int size, int totalTrans) | Merges all item sets in the set of (k-1)-item sets to create the (k)-item sets and updates the counters.
static ArrayList<Object> | singletons(Instances instances, boolean treatZeroAsMissing) | Converts the header info of the given set of instances into a set of item sets (singletons).
final AprioriItemSet | subtract(AprioriItemSet toSubtract) | Subtracts an item set from another one.
final String | toString() | Returns the contents of an item set as a string.
Methods inherited from class weka.associations.ItemSet
containedBy, containedByTreatZeroAsMissing, counter, deleteItemSets, equals, getHashtable, getItems, getTotalTransactions, hashCode, itemAt, items, pruneItemSets, pruneRules, setCounter, setItem, setItemAt, singletons, support, toString, upDateCounter, upDateCounters, upDateCountersTreatZeroAsMissing, updateCounterTreatZeroAsMissing
-
Constructor Details
-
AprioriItemSet
public AprioriItemSet(int totalTrans)
Constructor
- Parameters:
totalTrans
- the total number of transactions in the data
-
-
Method Details
-
confidenceForRule
public static double confidenceForRule(AprioriItemSet premise, AprioriItemSet consequence)
Outputs the confidence for a rule.
- Parameters:
premise
- the premise of the rule
consequence
- the consequence of the rule
- Returns:
- the confidence on the training data
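The description above does not spell out a formula. As a minimal sketch, assuming the standard definition of confidence (the fraction of transactions containing the premise that also contain the consequence), the value can be computed from two counts. The class, method, and parameter names below are hypothetical and are not part of Weka.

```java
public class ConfidenceSketch {

    /**
     * Confidence of "premise => consequence" under the standard definition
     * (an assumption of this sketch, not taken from the Weka source):
     * count(premise and consequence) / count(premise).
     */
    static double confidence(int ruleCount, int premiseCount) {
        if (premiseCount <= 0) {
            throw new IllegalArgumentException("premiseCount must be positive");
        }
        return (double) ruleCount / premiseCount;
    }

    public static void main(String[] args) {
        // 40 transactions contain the premise; 30 of them also contain the consequence.
        System.out.println(confidence(30, 40)); // prints 0.75
    }
}
```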
-
liftForRule
public double liftForRule(AprioriItemSet premise, AprioriItemSet consequence, int consequenceCount)
Outputs the lift for a rule. Lift is defined as:
confidence / prob(consequence)
- Parameters:
premise
- the premise of the rule
consequence
- the consequence of the rule
consequenceCount
- how many times the consequence occurs independent of the premise
- Returns:
- the lift on the training data
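The lift formula above can be worked through with plain counts. This is a sketch only: it assumes prob(consequence) is estimated as consequenceCount divided by the total number of transactions, and the names LiftSketch, ruleCount, and totalTransactions are hypothetical, not Weka identifiers.

```java
public class LiftSketch {

    /**
     * lift = confidence / prob(consequence), where (by assumption in this sketch)
     * confidence = ruleCount / premiseCount and
     * prob(consequence) = consequenceCount / totalTransactions.
     */
    static double lift(int ruleCount, int premiseCount,
                       int consequenceCount, int totalTransactions) {
        double confidence = (double) ruleCount / premiseCount;
        double probConsequence = (double) consequenceCount / totalTransactions;
        return confidence / probConsequence;
    }

    public static void main(String[] args) {
        // 100 transactions: premise in 40, consequence in 50, both in 30.
        System.out.println(lift(30, 40, 50, 100)); // prints 1.5
    }
}
```

A lift above 1 means the consequence is more frequent among transactions that contain the premise than in the data overall.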
-
leverageForRule
public double leverageForRule(AprioriItemSet premise, AprioriItemSet consequence, int premiseCount, int consequenceCount)
Outputs the leverage for a rule. Leverage is defined as:
prob(premise & consequence) - (prob(premise) * prob(consequence))
- Parameters:
premise
- the premise of the rule
consequence
- the consequence of the rule
premiseCount
- how many times the premise occurs independent of the consequent
consequenceCount
- how many times the consequence occurs independent of the premise
- Returns:
- the leverage on the training data
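As a small worked example of the leverage definition above, the sketch below estimates each probability as a count divided by the total number of transactions. That estimate, and the names LeverageSketch, ruleCount, and totalTransactions, are assumptions made for illustration.

```java
public class LeverageSketch {

    /**
     * leverage = prob(premise & consequence) - prob(premise) * prob(consequence),
     * with every probability estimated as count / totalTransactions
     * (an assumption of this sketch).
     */
    static double leverage(int ruleCount, int premiseCount,
                           int consequenceCount, int totalTransactions) {
        double n = totalTransactions;
        double probBoth = ruleCount / n;
        double probPremise = premiseCount / n;
        double probConsequence = consequenceCount / n;
        return probBoth - probPremise * probConsequence;
    }

    public static void main(String[] args) {
        // 100 transactions: premise in 40, consequence in 50, both in 30.
        // 0.30 - 0.40 * 0.50 is roughly 0.1 (subject to floating-point rounding).
        System.out.println(leverage(30, 40, 50, 100));
    }
}
```

A leverage of zero corresponds to premise and consequence occurring together exactly as often as independence would predict.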
-
convictionForRule
public double convictionForRule(AprioriItemSet premise, AprioriItemSet consequence, int premiseCount, int consequenceCount)
Outputs the conviction for a rule. Conviction is defined as:
prob(premise) * prob(!consequence) / prob(premise & !consequence)
- Parameters:
premise
- the premise of the rule
consequence
- the consequence of the rule
premiseCount
- how many times the premise occurs independent of the consequent
consequenceCount
- how many times the consequence occurs independent of the premise
- Returns:
- the conviction on the training data
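The conviction definition above can be sketched from counts as follows. The sketch assumes that count(premise & !consequence) equals premiseCount minus the number of transactions containing both sides, and it returns positive infinity when the rule is never violated; how Weka itself handles that case is not stated here. All names are hypothetical.

```java
public class ConvictionSketch {

    /**
     * conviction = prob(premise) * prob(!consequence) / prob(premise & !consequence),
     * with probabilities estimated as count / totalTransactions (assumption).
     */
    static double conviction(int ruleCount, int premiseCount,
                             int consequenceCount, int totalTransactions) {
        double n = totalTransactions;
        double probPremise = premiseCount / n;
        double probNotConsequence = (totalTransactions - consequenceCount) / n;
        // Transactions that contain the premise but not the consequence.
        int violations = premiseCount - ruleCount;
        if (violations == 0) {
            // The rule is never violated; this sketch reports infinity.
            return Double.POSITIVE_INFINITY;
        }
        return probPremise * probNotConsequence / (violations / n);
    }

    public static void main(String[] args) {
        // 100 transactions: premise in 40, consequence in 50, both in 30.
        // 0.4 * 0.5 / 0.1 is about 2.0.
        System.out.println(conviction(30, 40, 50, 100));
    }
}
```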
-
generateRules
public ArrayList<Object>[] generateRules(double minConfidence, ArrayList<Hashtable<ItemSet, Integer>> hashtables, int numItemsInSet)
Generates all rules for an item set.
- Parameters:
minConfidence
- the minimum confidence the rules have to have
hashtables
- containing all(!) previously generated item sets
numItemsInSet
- the size of the item set for which the rules are to be generated
- Returns:
- all the rules with minimum confidence for the given item set
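To illustrate what generating all rules for an item set with a minimum confidence involves, here is a brute-force sketch in plain Java. It is not Weka's implementation: it replaces the hashtables parameter with a single hypothetical map from item sets to support counts, represents items as strings, and simply tries every split of the item set into a non-empty premise and consequence.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RuleGenerationSketch {

    /** Returns "premise => consequence" strings whose confidence reaches minConfidence. */
    static List<String> rulesFor(List<String> itemSet,
                                 Map<Set<String>, Integer> supportCounts,
                                 double minConfidence) {
        List<String> rules = new ArrayList<>();
        Integer fullCount = supportCounts.get(new TreeSet<>(itemSet));
        if (fullCount == null) {
            return rules; // the item set itself was never counted
        }
        int n = itemSet.size();
        // Each bit mask selects the premise; masks 0 and 2^n - 1 are skipped
        // so that premise and consequence are both non-empty.
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            Set<String> premise = new TreeSet<>();
            Set<String> consequence = new TreeSet<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    premise.add(itemSet.get(i));
                } else {
                    consequence.add(itemSet.get(i));
                }
            }
            Integer premiseCount = supportCounts.get(premise);
            if (premiseCount == null || premiseCount == 0) {
                continue;
            }
            double confidence = (double) fullCount / premiseCount;
            if (confidence >= minConfidence) {
                rules.add(premise + " => " + consequence);
            }
        }
        return rules;
    }

    /** Hypothetical support counts used by the example below. */
    static Map<Set<String>, Integer> exampleCounts() {
        Map<Set<String>, Integer> counts = new HashMap<>();
        counts.put(new TreeSet<>(List.of("bread")), 6);
        counts.put(new TreeSet<>(List.of("butter")), 4);
        counts.put(new TreeSet<>(List.of("bread", "butter")), 3);
        return counts;
    }

    public static void main(String[] args) {
        // bread => butter has confidence 3/6 = 0.5 and is dropped;
        // butter => bread has confidence 3/4 = 0.75 and is kept.
        System.out.println(rulesFor(List.of("bread", "butter"), exampleCounts(), 0.7));
        // prints [[butter] => [bread]]
    }
}
```

The sketch needs the support count of every candidate premise, which is why the real method is documented as requiring all previously generated item sets.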
-
generateRulesBruteForce
public final ArrayList<Object>[] generateRulesBruteForce(double minMetric, int metricType, ArrayList<Hashtable<ItemSet, Integer>> hashtables, int numItemsInSet, int numTransactions, double significanceLevel) throws Exception
Generates all significant rules for an item set.
- Parameters:
minMetric
- the minimum metric (confidence, lift, leverage, improvement) the rules have to have
metricType
- (confidence=0, lift, leverage, improvement)
hashtables
- containing all(!) previously generated item sets
numItemsInSet
- the size of the item set for which the rules are to be generated
numTransactions
-
significanceLevel
- the significance level for testing the rules
- Returns:
- all the rules with minimum metric for the given item set
- Throws:
Exception
- if something goes wrong
-
subtract
public final AprioriItemSet subtract(AprioriItemSet toSubtract)
Subtracts an item set from another one.
- Parameters:
toSubtract
- the item set to be subtracted from this one
- Returns:
- an item set that contains only the items from this item set that are not contained in toSubtract
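The subtraction described above can be sketched on a simplified representation. The sketch assumes, purely for illustration, that an item set is an int array with one slot per attribute, holding a value index or -1 when the attribute is not part of the set. The class name and the ABSENT constant are hypothetical.

```java
import java.util.Arrays;

public class SubtractSketch {

    /** Marker for "no item at this attribute position" (an assumption of this sketch). */
    static final int ABSENT = -1;

    /** Returns the items of {@code items} that do not also occur in {@code toSubtract}. */
    static int[] subtract(int[] items, int[] toSubtract) {
        int[] result = new int[items.length];
        for (int i = 0; i < items.length; i++) {
            boolean shared = items[i] != ABSENT && items[i] == toSubtract[i];
            // Drop an item when toSubtract holds the same value at the same position.
            result[i] = shared ? ABSENT : items[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] full = {0, 2, ABSENT, 1};
        int[] premise = {ABSENT, 2, ABSENT, ABSENT};
        System.out.println(Arrays.toString(subtract(full, premise)));
        // prints [0, -1, -1, 1]
    }
}
```

In rule generation, subtracting a consequence from the full item set in this way leaves the premise, and vice versa.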
-
toString
public final String toString()
Returns the contents of an item set as a string.
-
singletons
public static ArrayList<Object> singletons(Instances instances, boolean treatZeroAsMissing) throws Exception
Converts the header info of the given set of instances into a set of item sets (singletons). The ordering of values in the header file determines the lexicographic order.
- Parameters:
instances
- the set of instances whose header info is to be used
- Returns:
- a set of item sets, each containing a single item
- Throws:
Exception
- if singletons can't be generated successfully
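A rough sketch of singleton generation follows. It makes several assumptions that are not stated above: the header is reduced to the number of nominal values per attribute, an item set is an int array with -1 for attributes that are not present, and treatZeroAsMissing is taken to mean that the first value (index 0) of each attribute produces no singleton. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingletonsSketch {

    /** Marker for "attribute not in this item set" (an assumption of this sketch). */
    static final int ABSENT = -1;

    /**
     * Builds one single-item set per (attribute, value) pair, in header order.
     * Skipping value index 0 when treatZeroAsMissing is set is an assumed reading
     * of that flag.
     */
    static List<int[]> singletons(int[] numValuesPerAttribute, boolean treatZeroAsMissing) {
        List<int[]> result = new ArrayList<>();
        int firstValue = treatZeroAsMissing ? 1 : 0;
        for (int attr = 0; attr < numValuesPerAttribute.length; attr++) {
            for (int value = firstValue; value < numValuesPerAttribute[attr]; value++) {
                int[] items = new int[numValuesPerAttribute.length];
                Arrays.fill(items, ABSENT);
                items[attr] = value;
                result.add(items);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two attributes: the first has 2 values, the second has 3.
        for (int[] items : singletons(new int[] {2, 3}, false)) {
            System.out.println(Arrays.toString(items));
        }
        // prints [0, -1], [1, -1], [-1, 0], [-1, 1], [-1, 2] on separate lines
    }
}
```

Because attributes and values are visited in header order, the resulting list is already in the lexicographic order that the rest of the class relies on.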
-
mergeAllItemSets
public static ArrayList<Object> mergeAllItemSets(ArrayList<Object> itemSets, int size, int totalTrans)
Merges all item sets in the set of (k-1)-item sets to create the (k)-item sets and updates the counters.
- Parameters:
itemSets
- the set of (k-1)-item sets
size
- the value of (k-1)
totalTrans
- the total number of transactions in the data
- Returns:
- the generated (k)-item sets
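The merge step can be pictured with the standard Apriori join: two (k-1)-item sets that share their first k-2 items are combined into one k-item set. The sketch below shows that join on sorted lists of integer item IDs. It is not Weka's code, it omits counters entirely, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeSketch {

    /**
     * Joins every pair of item sets of length {@code size} (that is, k-1) that
     * agree on their first size-1 items. Each input list is assumed to be sorted,
     * and the input as a whole is assumed to be in lexicographic order.
     */
    static List<List<Integer>> merge(List<List<Integer>> itemSets, int size) {
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < itemSets.size(); i++) {
            List<Integer> first = itemSets.get(i);
            for (int j = i + 1; j < itemSets.size(); j++) {
                List<Integer> second = itemSets.get(j);
                // Only sets with a common prefix of length size-1 can be joined.
                if (!first.subList(0, size - 1).equals(second.subList(0, size - 1))) {
                    continue;
                }
                int a = first.get(size - 1);
                int b = second.get(size - 1);
                if (a == b) {
                    continue; // identical sets, nothing to merge
                }
                List<Integer> merged = new ArrayList<>(first.subList(0, size - 1));
                merged.add(Math.min(a, b));
                merged.add(Math.max(a, b));
                result.add(merged);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> twoItemSets =
            List.of(List.of(1, 2), List.of(1, 3), List.of(2, 3));
        System.out.println(merge(twoItemSets, 2)); // prints [[1, 2, 3]]
    }
}
```

Only {1, 2} and {1, 3} share a prefix here, so a single 3-item candidate is produced; candidates still have to be counted against the data and pruned afterwards.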
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class ItemSet
- Returns:
- the revision
-