public class Bagging extends RandomizableParallelIteratedSingleClassifierEnhancer implements WeightedInstancesHandler, AdditionalMeasureProducer, TechnicalInformationHandler
@article{Breiman1996,
  author = {Leo Breiman},
  journal = {Machine Learning},
  number = {2},
  pages = {123-140},
  title = {Bagging predictors},
  volume = {24},
  year = {1996}
}

Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
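The flags -P and -S each appear twice in the list above, once for Bagging and once for REPTree, so the position relative to `--` decides which component receives them. The following is a minimal sketch of that split using only the standard library. The class `OptionSplitSketch` and its method `splitAtDoubleDash` are hypothetical names introduced for illustration, not part of the Weka API, and the sketch does not reproduce Weka's own option parsing.

```java
import java.util.Arrays;

class OptionSplitSketch {
    /**
     * Splits an option array at the first "--".
     * Returns {optionsBeforeDashes, optionsAfterDashes}.
     * Hypothetical helper for illustration only.
     */
    static String[][] splitAtDoubleDash(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(options, 0, i),
                    Arrays.copyOfRange(options, i + 1, options.length)
                };
            }
        }
        // No "--" present: everything belongs to the meta classifier.
        return new String[][] { options.clone(), new String[0] };
    }

    public static void main(String[] args) {
        String[] options = {
            "-P", "80", "-I", "25", "-O",
            "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "5", "-L", "4"
        };
        String[][] parts = splitAtDoubleDash(options);
        System.out.println("Bagging options: " + String.join(" ", parts[0]));
        System.out.println("REPTree options: " + String.join(" ", parts[1]));
    }
}
```

In this example the first `-P 80` sets the bag size for Bagging, while `-M 5 -L 4` would be handed to the base classifier named by `-W`.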
Constructor and Description |
---|
Bagging(): Constructor. |
Modifier and Type | Method and Description |
---|---|
String | bagSizePercentTipText(): Returns the tip text for this property. |
void | buildClassifier(Instances data): Bagging method. |
String | calcOutOfBagTipText(): Returns the tip text for this property. |
double[] | distributionForInstance(Instance instance): Calculates the class membership probabilities for the given test instance. |
Enumeration | enumerateMeasures(): Returns an enumeration of the additional measure names. |
int | getBagSizePercent(): Gets the size of each bag, as a percentage of the training set size. |
boolean | getCalcOutOfBag(): Gets whether the out of bag error is calculated. |
double | getMeasure(String additionalMeasureName): Returns the value of the named measure. |
String[] | getOptions(): Gets the current settings of the Classifier. |
String | getRevision(): Returns the revision string. |
TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
String | globalInfo(): Returns a string describing the classifier. |
Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(String[] argv): Main method for testing this class. |
double | measureOutOfBagError(): Gets the out of bag error that was calculated as the classifier was built. |
Instances | resampleWithWeights(Instances data, Random random, boolean[] sampled): Creates a new dataset of the same size using random sampling with replacement according to the given weight vector. |
void | setBagSizePercent(int newBagSizePercent): Sets the size of each bag, as a percentage of the training set size. |
void | setCalcOutOfBag(boolean calcOutOfBag): Sets whether the out of bag error is calculated. |
void | setOptions(String[] options): Parses a given list of options. |
String | toString(): Returns a description of the bagged classifier. |
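The resampleWithWeights entry above describes sampling with replacement according to a weight vector while recording which instances were drawn. A minimal sketch of that idea on plain arrays follows. It assumes weights are given as a `double[]` and returns drawn indices rather than an `Instances` object; the class `WeightedResampleSketch` and its method are hypothetical and are not Weka's implementation.

```java
import java.util.Random;

class WeightedResampleSketch {
    /**
     * Draws weights.length indices with replacement, each index chosen with
     * probability proportional to its weight. sampled[i] is set to true for
     * every index drawn at least once, so unset entries are "out of bag".
     */
    static int[] resample(double[] weights, Random random, boolean[] sampled) {
        int n = weights.length;
        if (sampled.length != n) {
            throw new IllegalArgumentException("sampled array has the wrong length");
        }
        // Build cumulative sums so a uniform draw can be mapped to an index.
        double[] cumulative = new double[n];
        double running = 0;
        for (int i = 0; i < n; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            running += weights[i];
            cumulative[i] = running;
        }
        if (n == 0 || running <= 0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        int[] drawn = new int[n];
        for (int k = 0; k < n; k++) {
            double u = random.nextDouble() * running;
            int i = 0;
            // Advance to the first index whose cumulative weight exceeds u.
            while (i < n - 1 && u >= cumulative[i]) {
                i++;
            }
            drawn[k] = i;
            sampled[i] = true;
        }
        return drawn;
    }

    public static void main(String[] args) {
        boolean[] sampled = new boolean[4];
        int[] drawn = resample(new double[] {1, 1, 1, 1}, new Random(1), sampled);
        System.out.println(java.util.Arrays.toString(drawn));
        System.out.println(java.util.Arrays.toString(sampled));
    }
}
```

With equal weights this reduces to an ordinary bootstrap sample, in which on average about 63% of the distinct instances end up in the bag for large training sets.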
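Methods inherited from class RandomizableParallelIteratedSingleClassifierEnhancer
getSeed, seedTipText, setSeed
Methods inherited from class ParallelIteratedSingleClassifierEnhancer
getNumExecutionSlots, numExecutionSlotsTipText, setNumExecutionSlots
Methods inherited from class IteratedSingleClassifierEnhancer
getNumIterations, numIterationsTipText, setNumIterations
Methods inherited from class SingleClassifierEnhancer
classifierTipText, getCapabilities, getClassifier, setClassifier
Methods inherited from class AbstractClassifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, runClassifier, setDebug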
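The method summary states that distributionForInstance calculates class membership probabilities for a test instance. For a bagged ensemble, a common way to do this is to sum the distributions returned by the individual base models and normalise the result. The sketch below illustrates that approach on plain arrays; it assumes each base model already supplied a `double[]` of class probabilities, and the class name `AverageDistributionSketch` is hypothetical. It is not taken from the Weka source.

```java
import java.util.Arrays;

class AverageDistributionSketch {
    /**
     * Combines per-model class distributions by summing them and
     * normalising so the result sums to 1.
     * memberDistributions[m][c] is model m's probability for class c.
     */
    static double[] average(double[][] memberDistributions) {
        int numClasses = memberDistributions[0].length;
        double[] sums = new double[numClasses];
        for (double[] dist : memberDistributions) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] += dist[c];
            }
        }
        double total = 0;
        for (double s : sums) {
            total += s;
        }
        if (total == 0) {
            return sums; // nothing to normalise
        }
        for (int c = 0; c < numClasses; c++) {
            sums[c] /= total;
        }
        return sums;
    }

    public static void main(String[] args) {
        double[][] members = { {0.9, 0.1}, {0.5, 0.5}, {0.7, 0.3} };
        System.out.println(Arrays.toString(average(members)));
    }
}
```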
public String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation
in interface TechnicalInformationHandler
public Enumeration listOptions()
listOptions
in interface OptionHandler
listOptions
in class RandomizableParallelIteratedSingleClassifierEnhancer
public void setOptions(String[] options) throws Exception
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
setOptions
in interface OptionHandler
setOptions
in class RandomizableParallelIteratedSingleClassifierEnhancer
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported
public String[] getOptions()
getOptions
in interface OptionHandler
getOptions
in class RandomizableParallelIteratedSingleClassifierEnhancer
public String bagSizePercentTipText()
public int getBagSizePercent()
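As a small worked example of what the bag size percentage means, the sketch below converts a percentage into a number of instances per bag. It assumes the product is truncated toward zero, which is an assumption of this sketch rather than something this page states; `BagSizeSketch` and `bagSize` are hypothetical names.

```java
class BagSizeSketch {
    /**
     * Number of instances drawn for each bag, given the training set size
     * and the bag size as a percentage (assumed to truncate toward zero).
     */
    static int bagSize(int numTrainingInstances, int bagSizePercent) {
        return (int) (numTrainingInstances * (bagSizePercent / 100.0));
    }

    public static void main(String[] args) {
        System.out.println(bagSize(150, 100)); // default: bag as large as the training set
        System.out.println(bagSize(150, 50));
    }
}
```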
public void setBagSizePercent(int newBagSizePercent)
Parameters:
newBagSizePercent - the bag size, as a percentage.
public String calcOutOfBagTipText()
public void setCalcOutOfBag(boolean calcOutOfBag)
Parameters:
calcOutOfBag - whether to calculate the out of bag error
public boolean getCalcOutOfBag()
public double measureOutOfBagError()
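The out of bag error is an estimate obtained by testing each training instance only on the base models whose bags did not contain it. The following sketch shows one way to compute such an estimate for a nominal class using a majority vote. It assumes bag membership and per-model predictions are already available as arrays; the class `OutOfBagErrorSketch` and the vote and tie-breaking rules are illustrative assumptions, not Weka's implementation.

```java
class OutOfBagErrorSketch {
    /**
     * inBag[m][i]       true if instance i was drawn into model m's bag
     * predictions[m][i] class index predicted by model m for instance i
     * actual[i]         true class index of instance i
     * Returns the fraction of instances misclassified by the majority vote
     * of the models that did not train on them, or NaN if no instance was
     * out of bag for any model.
     */
    static double outOfBagError(boolean[][] inBag, int[][] predictions,
                                int[] actual, int numClasses) {
        int errors = 0;
        int evaluated = 0;
        for (int i = 0; i < actual.length; i++) {
            int[] votes = new int[numClasses];
            int voters = 0;
            for (int m = 0; m < inBag.length; m++) {
                if (!inBag[m][i]) {
                    votes[predictions[m][i]]++;
                    voters++;
                }
            }
            if (voters == 0) {
                continue; // instance was in every bag, so it cannot be tested
            }
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (votes[c] > votes[best]) {
                    best = c; // ties go to the lower class index
                }
            }
            evaluated++;
            if (best != actual[i]) {
                errors++;
            }
        }
        return evaluated == 0 ? Double.NaN : (double) errors / evaluated;
    }

    public static void main(String[] args) {
        boolean[][] inBag = { {true, false, false}, {false, false, true} };
        int[][] predictions = { {0, 1, 0}, {1, 1, 0} };
        int[] actual = {0, 1, 0};
        System.out.println(outOfBagError(inBag, predictions, actual, 2));
    }
}
```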
public Enumeration enumerateMeasures()
enumerateMeasures
in interface AdditionalMeasureProducer
public double getMeasure(String additionalMeasureName)
getMeasure
in interface AdditionalMeasureProducer
Parameters:
additionalMeasureName - the name of the measure to query for its value
Throws:
IllegalArgumentException - if the named measure is not supported
public final Instances resampleWithWeights(Instances data, Random random, boolean[] sampled)
Parameters:
data - the data to be sampled from
random - a random number generator
sampled - indicating which instance has been sampled
Throws:
IllegalArgumentException - if the weights array is of the wrong length or contains negative weights.
public void buildClassifier(Instances data) throws Exception
buildClassifier
in interface Classifier
buildClassifier
in class ParallelIteratedSingleClassifierEnhancer
Parameters:
data - the training data to be used for generating the bagged classifier.
Throws:
Exception - if the classifier could not be built successfully
public double[] distributionForInstance(Instance instance) throws Exception
distributionForInstance
in interface Classifier
distributionForInstance
in class AbstractClassifier
Parameters:
instance - the instance to be classified
Throws:
Exception - if distribution can't be computed successfully
public String toString()
public String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class AbstractClassifier
public static void main(String[] argv)
Parameters:
argv - the options

Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.