public class FPGrowth extends AbstractAssociator implements AssociationRulesProducer, OptionHandler, TechnicalInformationHandler
@inproceedings{Han2000,
   author = {J. Han and J. Pei and Y. Yin},
   booktitle = {Proceedings of the 2000 ACM-SIGMOD International Conference on Management of Data},
   pages = {1-12},
   title = {Mining frequent patterns without candidate generation},
   year = {2000}
}

Valid options are:
-P <attribute index of positive value> Set the index of the attribute value to consider as 'positive' for binary attributes in normal dense instances. Index 2 is always used for sparse instances. (default = 2)
-I <max items> The maximum number of items to include in large item sets (and rules). (default = -1, i.e. no limit.)
-N <require number of rules> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=conviction> The metric by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum metric score of a rule. (default = 0.9)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-S Find all rules that meet the lower bound on minimum support and the minimum metric constraint. Turning this mode on will disable the iterative support reduction procedure to find the specified number of rules.
-transactions <comma separated list of attribute names> Only consider transactions that contain these items (default = no restriction)
-rules <comma separated list of attribute names> Only print rules that contain these items. (default = no restriction)
-use-or Use OR instead of AND for must contain list(s). Use in conjunction with -transactions and/or -rules
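
The four metrics selectable with -T can be illustrated with a small, self-contained sketch. It uses the standard textbook definitions computed from raw transaction counts; this page does not show how the class computes them internally, so the class name `RuleMetricsSketch` and its method signatures are illustrative assumptions, not part of the Weka API.

```java
// Illustrative only: textbook definitions of the four rule metrics for a
// rule A => B, computed from transaction counts. Not Weka's own code.
class RuleMetricsSketch {

    /** confidence = supp(A and B) / supp(A). */
    static double confidence(int both, int premise) {
        return (double) both / premise;
    }

    /** lift = confidence / P(B). */
    static double lift(int both, int premise, int consequence, int total) {
        return confidence(both, premise) / ((double) consequence / total);
    }

    /** leverage = P(A and B) - P(A) * P(B). */
    static double leverage(int both, int premise, int consequence, int total) {
        double n = total;
        return both / n - (premise / n) * (consequence / n);
    }

    /** conviction = P(A) * P(not B) / P(A and not B); infinite if the rule never fails. */
    static double conviction(int both, int premise, int consequence, int total) {
        double n = total;
        double premiseWithoutConsequence = (premise - both) / n;
        if (premiseWithoutConsequence == 0.0) {
            return Double.POSITIVE_INFINITY;
        }
        return (premise / n) * ((total - consequence) / n) / premiseWithoutConsequence;
    }

    public static void main(String[] args) {
        // Toy counts: 100 transactions, A in 40, B in 50, A and B together in 30.
        System.out.println("confidence = " + confidence(30, 40));
        System.out.println("lift       = " + lift(30, 40, 50, 100));
        System.out.println("leverage   = " + leverage(30, 40, 50, 100));
        System.out.println("conviction = " + conviction(30, 40, 50, 100));
    }
}
```

With these toy counts, a rule with confidence 0.75 would not pass the default -C threshold of 0.9 when ranking by confidence.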
Constructor and Description |
---|
FPGrowth() Construct a new FPGrowth object. |
Modifier and Type | Method and Description |
---|---|
void | buildAssociations(Instances data) Method that generates all large item sets with a minimum support, and from these all association rules with a minimum metric. |
boolean | canProduceRules() Returns true if this AssociationRulesProducer can actually produce rules. |
String | deltaTipText() Returns the tip text for this property. |
String | findAllRulesForSupportLevelTipText() Tip text for this property suitable for displaying in the GUI. |
static List<AssociationRule> | generateRulesBruteForce(weka.associations.FPGrowth.FrequentItemSets largeItemSets, DefaultAssociationRule.METRIC_TYPE metricToUse, double metricThreshold, int upperBoundMinSuppAsInstances, int lowerBoundMinSuppAsInstances, int totalTransactions) Generate all association rules, from the supplied frequent item sets, that meet a given minimum metric threshold. |
AssociationRules | getAssociationRules() Gets the list of mined association rules. |
Capabilities | getCapabilities() Returns default capabilities of the associator. |
double | getDelta() Get the value of delta. |
boolean | getFindAllRulesForSupportLevel() Get whether all rules meeting the lower bound on min support and the minimum metric threshold are to be found. |
double | getLowerBoundMinSupport() Get the value of lowerBoundMinSupport. |
int | getMaxNumberOfItems() Gets the maximum number of items to be included in large item sets. |
SelectedTag | getMetricType() Get the metric type to use. |
double | getMinMetric() Get the value of minConfidence. |
int | getNumRulesToFind() Get the number of rules to find. |
String[] | getOptions() Gets the current settings of the associator. |
int | getPositiveIndex() Get the index of the attribute value to consider as positive for binary attributes in normal dense instances. |
String | getRevision() Returns the revision string. |
String[] | getRuleMetricNames() Gets a list of the names of the metrics output for each rule. |
String | getRulesMustContain() Get the comma-separated list of items that rules must contain in order to be output. |
TechnicalInformation | getTechnicalInformation() Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
String | getTransactionsMustContain() Gets the comma-separated list of items that transactions must contain in order to be considered for large item sets and rules. |
double | getUpperBoundMinSupport() Get the value of upperBoundMinSupport. |
boolean | getUseORForMustContainList() Gets whether OR is to be used rather than AND when considering must contain lists. |
String | globalInfo() Returns a string describing this associator. |
String | graph(weka.associations.FPGrowth.FPTreeRoot tree) Assemble a dot graph representation of the FP-tree. |
Enumeration<Option> | listOptions() Returns an enumeration describing the available options. |
String | lowerBoundMinSupportTipText() Returns the tip text for this property. |
static void | main(String[] args) Main method. |
String | maxNumberOfItemsTipText() Tip text for this property suitable for displaying in the GUI. |
String | metricTypeTipText() Tip text for this property suitable for displaying in the GUI. |
String | minMetricTipText() Returns the tip text for this property. |
String | numRulesToFindTipText() Tip text for this property suitable for displaying in the GUI. |
String | positiveIndexTipText() Tip text for this property suitable for displaying in the GUI. |
static List<AssociationRule> | pruneRules(List<AssociationRule> rulesToPrune, ArrayList<Item> itemsToConsider, boolean useOr) |
void | resetOptions() Reset all options to their default values. |
String | rulesMustContainTipText() Returns the tip text for this property. |
void | setDelta(double v) Set the value of delta. |
void | setFindAllRulesForSupportLevel(boolean s) If true then turn off the iterative support reduction method of finding x rules that meet the minimum support and metric thresholds and just return all the rules that meet the lower bound on minimum support and the minimum metric. |
void | setLowerBoundMinSupport(double v) Set the value of lowerBoundMinSupport. |
void | setMaxNumberOfItems(int max) Set the maximum number of items to include in large item sets. |
void | setMetricType(SelectedTag d) Set the metric type to use. |
void | setMinMetric(double v) Set the value of minConfidence. |
void | setNumRulesToFind(int numR) Set the desired number of rules to find. |
void | setOffDiskReportingFrequency(int freq) Set how often to report progress when the data is being read incrementally from disk rather than loaded into memory. |
void | setOptions(String[] options) Parses a given list of options. |
void | setPositiveIndex(int index) Set the index of the attribute value to consider as positive for binary attributes in normal dense instances. |
void | setRulesMustContain(String list) Set the comma-separated list of items that rules must contain in order to be output. |
void | setTransactionsMustContain(String list) Set the comma-separated list of items that transactions must contain in order to be considered for large item sets and rules. |
void | setUpperBoundMinSupport(double v) Set the value of upperBoundMinSupport. |
void | setUseORForMustContainList(boolean b) Set whether to use OR rather than AND when considering must contain lists. |
String | toString() Output the association rules. |
String | transactionsMustContainTipText() Returns the tip text for this property. |
String | upperBoundMinSupportTipText() Returns the tip text for this property. |
String | useORForMustContainListTipText() Returns the tip text for this property. |
Methods inherited from class AbstractAssociator: forName, makeCopies, makeCopy, runAssociator
public static List<AssociationRule> generateRulesBruteForce(weka.associations.FPGrowth.FrequentItemSets largeItemSets, DefaultAssociationRule.METRIC_TYPE metricToUse, double metricThreshold, int upperBoundMinSuppAsInstances, int lowerBoundMinSuppAsInstances, int totalTransactions)
Parameters:
largeItemSets - the set of frequent item sets
metricToUse - the metric to use
metricThreshold - the threshold value that a rule must meet
upperBoundMinSuppAsInstances - the upper bound on the support in order to accept the rule
lowerBoundMinSuppAsInstances - the lower bound on the support in order to accept the rule
totalTransactions - the total number of transactions in the data
public static List<AssociationRule> pruneRules(List<AssociationRule> rulesToPrune, ArrayList<Item> itemsToConsider, boolean useOr)
public Capabilities getCapabilities()
Specified by: getCapabilities in interface Associator
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class AbstractAssociator
See Also: Capabilities
public String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
public void resetOptions()
public String positiveIndexTipText()
public void setPositiveIndex(int index)
Parameters:
index - the index to use for positive values in binary attributes.
public int getPositiveIndex()
public void setNumRulesToFind(int numR)
Parameters:
numR - the number of rules to find.
public int getNumRulesToFind()
public String numRulesToFindTipText()
public void setMetricType(SelectedTag d)
Parameters:
d - the metric type
public void setMaxNumberOfItems(int max)
Parameters:
max - the maximum number of items to include in large item sets.
public int getMaxNumberOfItems()
public String maxNumberOfItemsTipText()
public SelectedTag getMetricType()
public String metricTypeTipText()
public String minMetricTipText()
public double getMinMetric()
public void setMinMetric(double v)
Parameters:
v - Value to assign to minConfidence.
public String transactionsMustContainTipText()
public void setTransactionsMustContain(String list)
Parameters:
list - a comma-separated list of items (empty string indicates no restriction on the transactions).
public String getTransactionsMustContain()
public String rulesMustContainTipText()
public void setRulesMustContain(String list)
Parameters:
list - a comma-separated list of items (empty string indicates no restriction on the rules).
public String getRulesMustContain()
public String useORForMustContainListTipText()
public void setUseORForMustContainList(boolean b)
Parameters:
b - true if OR should be used instead of AND when considering transaction and rules must contain lists.
public boolean getUseORForMustContainList()
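
The difference between AND and OR semantics for a must contain list can be sketched as follows. This is a minimal illustration under stated assumptions, not the class's own filtering code: `MustContainSketch`, `parseList`, and `passes` are hypothetical names, and items are represented as plain strings rather than Weka `Item` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: AND versus OR semantics for a must contain list.
class MustContainSketch {

    /** Splits a comma-separated list; an empty string yields an empty list (no restriction). */
    static List<String> parseList(String list) {
        List<String> names = new ArrayList<>();
        for (String part : list.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    /** Returns true if the item set satisfies the must contain list. */
    static boolean passes(Set<String> items, List<String> mustContain, boolean useOr) {
        if (mustContain.isEmpty()) {
            return true; // no restriction
        }
        if (useOr) {
            for (String name : mustContain) {
                if (items.contains(name)) {
                    return true; // OR: any one listed item is enough
                }
            }
            return false;
        }
        return items.containsAll(mustContain); // AND: every listed item is required
    }

    public static void main(String[] args) {
        Set<String> transaction = new HashSet<>(Arrays.asList("bread", "milk"));
        List<String> required = parseList("bread, eggs");
        System.out.println("AND: " + passes(transaction, required, false));
        System.out.println("OR:  " + passes(transaction, required, true));
    }
}
```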
public String deltaTipText()
public double getDelta()
public void setDelta(double v)
Parameters:
v - Value to assign to delta.
public String lowerBoundMinSupportTipText()
public double getLowerBoundMinSupport()
public void setLowerBoundMinSupport(double v)
Parameters:
v - Value to assign to lowerBoundMinSupport.
public String upperBoundMinSupportTipText()
public double getUpperBoundMinSupport()
public void setUpperBoundMinSupport(double v)
Parameters:
v - Value to assign to upperBoundMinSupport.
public String findAllRulesForSupportLevelTipText()
public void setFindAllRulesForSupportLevel(boolean s)
Parameters:
s - true if all rules meeting the lower bound on the support and minimum metric thresholds are to be found.
public boolean getFindAllRulesForSupportLevel()
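
The iterative support reduction that this flag switches off can be sketched roughly as below: start at the upper bound on minimum support, and lower it by delta until enough rules are found or the lower bound is reached. This is a simplified reading of the option descriptions, not the class's actual loop. `SupportReductionSketch`, `chooseMinSupport`, and the `rulesFoundAt` callback (standing in for a full mining run) are hypothetical.

```java
import java.util.function.DoubleToIntFunction;

// Illustrative only: a simplified view of iterative minimum-support reduction.
class SupportReductionSketch {

    /**
     * Returns the minimum support at which the search stops.
     * rulesFoundAt is a hypothetical stand-in for "mine at this support and
     * count the rules that meet the metric threshold".
     */
    static double chooseMinSupport(DoubleToIntFunction rulesFoundAt,
                                   double upper, double lower, double delta,
                                   int numRules, boolean findAllRules) {
        if (delta <= 0) {
            throw new IllegalArgumentException("delta must be positive");
        }
        if (findAllRules) {
            return lower; // no iteration: mine once at the lower bound
        }
        final double eps = 1e-9; // tolerance for floating-point drift
        double support = upper;
        while (rulesFoundAt.applyAsInt(support) < numRules
                && support - delta >= lower - eps) {
            support -= delta; // not enough rules yet: relax the support threshold
        }
        return support;
    }

    public static void main(String[] args) {
        // Toy miner: pretends 12 rules exist at support <= 0.5, otherwise 3.
        DoubleToIntFunction toyMiner = s -> s <= 0.5 + 1e-9 ? 12 : 3;
        System.out.println(chooseMinSupport(toyMiner, 1.0, 0.1, 0.05, 10, false));
    }
}
```

With the defaults listed for -U, -M, -D and -N, the toy miner above stops at a support of roughly 0.5, the first level at which at least 10 rules are available.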
public void setOffDiskReportingFrequency(int freq)
Parameters:
freq - the frequency with which to print progress.
public AssociationRules getAssociationRules()
Specified by: getAssociationRules in interface AssociationRulesProducer
public String[] getRuleMetricNames()
Specified by: getRuleMetricNames in interface AssociationRulesProducer
public boolean canProduceRules()
Specified by: canProduceRules in interface AssociationRulesProducer
public Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
public void setOptions(String[] options) throws Exception
Specified by: setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
public void buildAssociations(Instances data) throws Exception
Specified by: buildAssociations in interface Associator
Parameters:
data - the instances to be used for generating the associations
Throws:
Exception - if rules can't be built successfully
public String toString()
public String graph(weka.associations.FPGrowth.FPTreeRoot tree)
Parameters:
tree - the root of the FP-tree
public String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class AbstractAssociator
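
As background for graph(FPTreeRoot), the sketch below builds a tiny FP-tree (a prefix tree with a count on each node) and renders it in Graphviz dot syntax. It is an independent illustration: `FpTreeSketch`, `Node`, `insert`, and `toDot` are hypothetical names, and neither the internal `FPTreeRoot` type nor the exact output format of graph() is reproduced here. Transactions are assumed to be pre-sorted by descending item frequency, as in the FP-growth paper cited above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a minimal FP-tree (prefix tree with counts) and a dot rendering.
class FpTreeSketch {

    static final class Node {
        final String item; // null for the root
        int count;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String item) {
            this.item = item;
        }
    }

    /** Inserts one transaction; items are assumed sorted by descending global frequency. */
    static void insert(Node root, List<String> sortedItems) {
        Node current = root;
        for (String item : sortedItems) {
            Node child = current.children.get(item);
            if (child == null) {
                child = new Node(item);
                current.children.put(item, child);
            }
            child.count++; // shared prefixes accumulate counts
            current = child;
        }
    }

    /** Renders the tree as a dot digraph, one node per tree node. */
    static String toDot(Node root) {
        StringBuilder sb = new StringBuilder("digraph FPTree {\n");
        int[] nextId = {1}; // id 0 is reserved for the root
        emit(root, 0, nextId, sb);
        return sb.append("}\n").toString();
    }

    private static void emit(Node node, int id, int[] nextId, StringBuilder sb) {
        String label = node.item == null ? "root" : node.item + " (" + node.count + ")";
        sb.append("  N").append(id).append(" [label=\"").append(label).append("\"];\n");
        for (Node child : node.children.values()) {
            int childId = nextId[0]++;
            sb.append("  N").append(id).append(" -> N").append(childId).append(";\n");
            emit(child, childId, nextId, sb);
        }
    }

    public static void main(String[] args) {
        Node root = new Node(null);
        insert(root, Arrays.asList("a", "b"));
        insert(root, Arrays.asList("a", "c"));
        insert(root, Arrays.asList("a", "b"));
        System.out.print(toDot(root));
    }
}
```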
public static void main(String[] args)
Parameters:
args - the command-line options

Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.