public class Bagging extends RandomizableParallelIteratedSingleClassifierEnhancer implements WeightedInstancesHandler, AdditionalMeasureProducer, TechnicalInformationHandler
@article{Breiman1996,
author = {Leo Breiman},
journal = {Machine Learning},
number = {2},
pages = {123-140},
title = {Bagging predictors},
volume = {24},
year = {1996}
}
Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
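To illustrate the `--` convention described above, here is a minimal sketch that splits an option array into the part meant for Bagging and the part meant for the base classifier. The class `OptionSplitSketch` and its methods are hypothetical names made up for this illustration; this is not Weka's own option parsing code, and it uses only the standard library.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Weka.
class OptionSplitSketch {

    /** Returns the options before the first "--" (intended for the meta classifier). */
    static String[] metaOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? options.clone() : Arrays.copyOfRange(options, 0, sep);
    }

    /** Returns the options after the first "--" (intended for the base classifier). */
    static String[] baseOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? new String[0] : Arrays.copyOfRange(options, sep + 1, options.length);
    }

    public static void main(String[] args) {
        // Option values taken from the defaults listed above.
        String[] options = {
            "-P", "100", "-I", "10", "-S", "1",
            "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "2", "-N", "3"
        };
        System.out.println("Bagging: " + Arrays.toString(metaOptions(options)));
        System.out.println("REPTree: " + Arrays.toString(baseOptions(options)));
    }
}
```

Note that `-S` and `-P` mean different things on each side of the separator: before `--` they are the Bagging seed and bag size, after it they are the REPTree shuffle seed and the no-pruning flag.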
| Constructor and Description |
|---|
| Bagging() - Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| String | bagSizePercentTipText() - Returns the tip text for this property. |
| void | buildClassifier(Instances data) - Bagging method. |
| String | calcOutOfBagTipText() - Returns the tip text for this property. |
| double[] | distributionForInstance(Instance instance) - Calculates the class membership probabilities for the given test instance. |
| Enumeration | enumerateMeasures() - Returns an enumeration of the additional measure names. |
| int | getBagSizePercent() - Gets the size of each bag, as a percentage of the training set size. |
| boolean | getCalcOutOfBag() - Get whether the out of bag error is calculated. |
| double | getMeasure(String additionalMeasureName) - Returns the value of the named measure. |
| String[] | getOptions() - Gets the current settings of the Classifier. |
| String | getRevision() - Returns the revision string. |
| TechnicalInformation | getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| String | globalInfo() - Returns a string describing classifier. |
| Enumeration | listOptions() - Returns an enumeration describing the available options. |
| static void | main(String[] argv) - Main method for testing this class. |
| double | measureOutOfBagError() - Gets the out of bag error that was calculated as the classifier was built. |
| Instances | resampleWithWeights(Instances data, Random random, boolean[] sampled) - Creates a new dataset of the same size using random sampling with replacement according to the given weight vector. |
| void | setBagSizePercent(int newBagSizePercent) - Sets the size of each bag, as a percentage of the training set size. |
| void | setCalcOutOfBag(boolean calcOutOfBag) - Set whether the out of bag error is calculated. |
| void | setOptions(String[] options) - Parses a given list of options. |
| String | toString() - Returns description of the bagged classifier. |
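The idea behind resampleWithWeights(Instances data, Random random, boolean[] sampled), random sampling with replacement according to a weight vector, can be sketched without Weka by working on indices instead of instances. The class `WeightedResampleSketch` is a hypothetical name, and the technique used here (cumulative weights plus a binary search per draw) is a simple stand-in chosen for clarity; it is not claimed to be the algorithm Weka uses internally.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration; works on index positions rather than Weka Instances.
class WeightedResampleSketch {

    /**
     * Draws weights.length indices with replacement, where index i is chosen
     * with probability weights[i] / sum(weights). Sets sampled[i] to true for
     * every index that is drawn at least once.
     */
    static int[] resample(double[] weights, Random random, boolean[] sampled) {
        int n = weights.length;
        if (sampled.length != n) {
            throw new IllegalArgumentException("sampled must have one entry per weight");
        }
        double[] cumulative = new double[n];
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            total += weights[i];
            cumulative[i] = total;
        }
        if (n > 0 && total <= 0.0) {
            throw new IllegalArgumentException("weights must not all be zero");
        }
        int[] drawn = new int[n];
        for (int k = 0; k < n; k++) {
            double u = random.nextDouble() * total; // uniform in [0, total)
            // Binary search for the first index whose cumulative weight exceeds u.
            int lo = 0;
            int hi = n - 1;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (cumulative[mid] > u) {
                    hi = mid;
                } else {
                    lo = mid + 1;
                }
            }
            drawn[k] = lo;
            sampled[lo] = true;
        }
        return drawn;
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 1.0, 2.0, 0.0};
        boolean[] sampled = new boolean[weights.length];
        int[] drawn = resample(weights, new Random(1), sampled);
        System.out.println("drawn indices: " + Arrays.toString(drawn));
        System.out.println("sampled flags: " + Arrays.toString(sampled));
    }
}
```

Indices whose flag in `sampled` stays false were never drawn, which is the information an out-of-bag estimate needs.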
getSeed, seedTipText, setSeed
getNumExecutionSlots, numExecutionSlotsTipText, setNumExecutionSlots
getNumIterations, numIterationsTipText, setNumIterations
classifierTipText, getCapabilities, getClassifier, setClassifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, runClassifier, setDebug

public String globalInfo()

public TechnicalInformation getTechnicalInformation()
getTechnicalInformation in interface TechnicalInformationHandler

public Enumeration listOptions()
listOptions in interface OptionHandler
listOptions in class RandomizableParallelIteratedSingleClassifierEnhancer

public void setOptions(String[] options) throws Exception
setOptions in interface OptionHandler
setOptions in class RandomizableParallelIteratedSingleClassifierEnhancer
options - the list of options as an array of strings
Exception - if an option is not supported

public String[] getOptions()
getOptions in interface OptionHandler
getOptions in class RandomizableParallelIteratedSingleClassifierEnhancer

public String bagSizePercentTipText()

public int getBagSizePercent()
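
As a worked example of the bag size percentage, the sketch below converts a percentage into a number of instances per bag. `BagSizeSketch` is a hypothetical name, and rounding down to a whole instance is an assumption of this sketch rather than documented behavior.

```java
// Hypothetical illustration of the bag size arithmetic; not Weka code.
class BagSizeSketch {

    /**
     * Number of instances per bag for a training set of numInstances,
     * given the bag size as a percentage. Rounds down (an assumption).
     */
    static int bagSize(int numInstances, int bagSizePercent) {
        return (int) ((long) numInstances * bagSizePercent / 100);
    }

    public static void main(String[] args) {
        System.out.println(bagSize(150, 100)); // default -P 100: every bag has 150 instances
        System.out.println(bagSize(150, 50));  // -P 50: every bag has 75 instances
    }
}
```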

public void setBagSizePercent(int newBagSizePercent)
newBagSizePercent - the bag size, as a percentage.

public String calcOutOfBagTipText()

public void setCalcOutOfBag(boolean calcOutOfBag)
calcOutOfBag - whether to calculate the out of bag error

public boolean getCalcOutOfBag()

public double measureOutOfBagError()
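
To make the out of bag error concrete, the following sketch computes it for a nominal class from precomputed per-model predictions: each instance is voted on only by the models whose bag did not contain it. `OutOfBagSketch` and its parameters are hypothetical, and the unweighted majority vote is a simplification assumed here; it is not claimed to match how this class combines predictions or handles instance weights.

```java
// Hypothetical illustration of an out-of-bag error estimate; not Weka code.
class OutOfBagSketch {

    /**
     * @param inBag       inBag[m][i] is true if instance i was drawn into the bag of model m
     * @param predictions predictions[m][i] is the class index model m predicts for instance i
     * @param actual      actual[i] is the true class index of instance i
     * @param numClasses  number of class values
     * @return fraction of misclassified instances among those that were out of
     *         bag for at least one model, or NaN if there is no such instance
     */
    static double outOfBagError(boolean[][] inBag, int[][] predictions,
                                int[] actual, int numClasses) {
        int errors = 0;
        int counted = 0;
        for (int i = 0; i < actual.length; i++) {
            int[] votes = new int[numClasses];
            int voters = 0;
            for (int m = 0; m < inBag.length; m++) {
                if (!inBag[m][i]) { // only models that never saw instance i may vote
                    votes[predictions[m][i]]++;
                    voters++;
                }
            }
            if (voters == 0) {
                continue; // instance was in every bag, so it cannot be scored
            }
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (votes[c] > votes[best]) {
                    best = c; // ties go to the lowest class index
                }
            }
            counted++;
            if (best != actual[i]) {
                errors++;
            }
        }
        return counted == 0 ? Double.NaN : (double) errors / counted;
    }

    public static void main(String[] args) {
        boolean[][] inBag = {{true, false}, {false, true}};
        int[][] predictions = {{0, 1}, {1, 1}};
        int[] actual = {1, 0};
        System.out.println(outOfBagError(inBag, predictions, actual, 2));
    }
}
```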

public Enumeration enumerateMeasures()
enumerateMeasures in interface AdditionalMeasureProducer

public double getMeasure(String additionalMeasureName)
getMeasure in interface AdditionalMeasureProducer
additionalMeasureName - the name of the measure to query for its value
IllegalArgumentException - if the named measure is not supported

public final Instances resampleWithWeights(Instances data, Random random, boolean[] sampled)
data - the data to be sampled from
random - a random number generator
sampled - indicating which instance has been sampled
IllegalArgumentException - if the weights array is of the wrong length or contains negative weights.

public void buildClassifier(Instances data) throws Exception
buildClassifier in interface Classifier
buildClassifier in class ParallelIteratedSingleClassifierEnhancer
data - the training data to be used for generating the bagged classifier.
Exception - if the classifier could not be built successfully

public double[] distributionForInstance(Instance instance) throws Exception
distributionForInstance in interface Classifier
distributionForInstance in class AbstractClassifier
instance - the instance to be classified
Exception - if distribution can't be computed successfully

public String toString()
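
For distributionForInstance(Instance instance), documented above, one plausible way to turn the base classifiers' outputs into class membership probabilities is to sum the per-model distributions and normalize. The sketch below shows that combination step only. `DistributionAverageSketch` is a hypothetical name, and the sum-then-normalize rule is an assumption of this sketch, not a statement of what this class does.

```java
// Hypothetical illustration of combining per-model class distributions; not Weka code.
class DistributionAverageSketch {

    /**
     * Sums the class distributions produced by each model and normalizes the
     * result so that it sums to 1. If every entry is zero, the zeros are returned.
     */
    static double[] combine(double[][] perModel) {
        int numClasses = perModel[0].length;
        double[] sums = new double[numClasses];
        for (double[] dist : perModel) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] += dist[c];
            }
        }
        double total = 0.0;
        for (double s : sums) {
            total += s;
        }
        if (total == 0.0) {
            return sums; // nothing to normalize
        }
        for (int c = 0; c < numClasses; c++) {
            sums[c] /= total;
        }
        return sums;
    }

    public static void main(String[] args) {
        // Three models voting for class 0, one for class 1.
        double[][] perModel = {{1, 0}, {0, 1}, {1, 0}, {1, 0}};
        System.out.println(java.util.Arrays.toString(combine(perModel)));
    }
}
```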

public String getRevision()
getRevision in interface RevisionHandler
getRevision in class AbstractClassifier

public static void main(String[] argv)
argv - the options

Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.