Class RDG1
java.lang.Object
weka.datagenerators.DataGenerator
weka.datagenerators.ClassificationGenerator
weka.datagenerators.classifiers.classification.RDG1
- All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler
A data generator that produces data randomly by building a decision list.
The decision list consists of rules.
Instances are generated randomly one by one. If the decision list fails to classify the current instance, a new rule based on that instance is generated and added to the decision list.
The option -V switches on voting, which means that at the end of the generation all instances are reclassified to the class value that is supported by the most rules.
This data generator can generate 'boolean' attributes (= nominal with the values {true, false}) and numeric attributes. The rules can be 'A' or 'NOT A' for boolean values and 'B < random_value' or 'B >= random_value' for numeric values.
Valid options are:
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S <num> The seed for random function (default 1)
-n <num> The number of examples to generate (default 100)
-a <num> The number of attributes (default 10).
-c <num> The number of classes (default 2)
-R <num> maximum size for rules (default 10)
-M <num> minimum size for rules (default 1)
-I <num> number of irrelevant attributes (default 0)
-N <num> number of numeric attributes (default 0)
-V switch on voting (default is no voting)
The following is an example of a generated dataset:
%
% weka.datagenerators.RDG1 -r expl -a 2 -c 3 -n 4 -N 1 -I 0 -M 2 -R 10 -S 2
%
relation expl
attribute a0 {false,true}
attribute a1 numeric
attribute class {c0,c1,c2}
data
true,0.496823,c0
false,0.743158,c1
false,0.408285,c1
false,0.993687,c2
%
% Number of attributes chosen as irrelevant = 0
%
% DECISIONLIST (number of rules = 3):
% RULE 0: c0 := a1 < 0.986, a0
% RULE 1: c1 := a1 < 0.95, not(a0)
% RULE 2: c2 := not(a0), a1 >= 0.562
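The generation procedure described above (draw an instance, classify it with the first matching rule, and add a new rule when no rule matches) can be sketched in plain Java. This is a simplified illustration and not the RDG1 source: it handles boolean attributes only, omits numeric tests, irrelevant attributes and voting, and the names DecisionListSketch, Rule, classify, generate and newRule are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Simplified decision-list generator for boolean attributes (illustrative only). */
class DecisionListSketch {

    /** A rule fires when every tested attribute has its required value. */
    static final class Rule {
        final int[] attrs;        // indices of the tested attributes
        final boolean[] required; // required value for each tested attribute
        final int label;          // class assigned when all tests pass

        Rule(int[] attrs, boolean[] required, int label) {
            this.attrs = attrs;
            this.required = required;
            this.label = label;
        }

        boolean covers(boolean[] instance) {
            for (int i = 0; i < attrs.length; i++) {
                if (instance[attrs[i]] != required[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    final List<Rule> rules = new ArrayList<>();

    /** Returns the class of the first covering rule, or -1 if no rule covers. */
    int classify(boolean[] instance) {
        for (Rule r : rules) {
            if (r.covers(instance)) {
                return r.label;
            }
        }
        return -1;
    }

    /** Generates n class labels, adding a rule whenever classification fails. */
    int[] generate(int n, int numAttrs, int numClasses,
                   int minSize, int maxSize, Random rnd) {
        int[] labels = new int[n];
        for (int i = 0; i < n; i++) {
            boolean[] inst = new boolean[numAttrs];
            for (int a = 0; a < numAttrs; a++) {
                inst[a] = rnd.nextBoolean();
            }
            int label = classify(inst);
            if (label < 0) {
                Rule r = newRule(inst, numClasses, minSize, maxSize, rnd);
                rules.add(r);
                label = r.label;
            }
            labels[i] = label;
        }
        return labels;
    }

    /** Builds a rule whose tests all agree with the given instance. */
    static Rule newRule(boolean[] inst, int numClasses,
                        int minSize, int maxSize, Random rnd) {
        int size = Math.min(minSize + rnd.nextInt(maxSize - minSize + 1), inst.length);
        int[] idx = new int[inst.length];
        for (int a = 0; a < idx.length; a++) {
            idx[a] = a;
        }
        int[] attrs = new int[size];
        boolean[] required = new boolean[size];
        // Partial Fisher-Yates shuffle selects `size` distinct attributes.
        for (int k = 0; k < size; k++) {
            int j = k + rnd.nextInt(idx.length - k);
            int tmp = idx[k];
            idx[k] = idx[j];
            idx[j] = tmp;
            attrs[k] = idx[k];
            required[k] = inst[idx[k]];
        }
        return new Rule(attrs, required, rnd.nextInt(numClasses));
    }
}
```

Because each new rule is built from the instance that could not be classified, that instance is always covered by the rule it triggers, so every generated instance ends up with a class label.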
- Version:
- $Revision: 10203 $
- Author:
- Gabi Schmidberger (gabi@cs.waikato.ac.nz)
Constructor Summary
RDG1() Initializes the generator with default values.
Method Summary
Modifier and Type, Method, Description
String attList_IrrTipText() Returns the tip text for this property.
Instances defineDataFormat() Initializes the format for the dataset produced.
Instance generateExample() Generates one example of the dataset.
Instances generateExamples() Generates all examples of the dataset.
Instances generateExamples(int num, Random random, Instances format) Generates all examples of the dataset.
String generateFinished() Compiles documentation about the data generation.
String generateStart() Generates a comment string that documents the data generator.
boolean[] getAttList_Irr() Gets the array that defines which of the attributes are seen to be irrelevant.
int getMaxRuleSize() Gets the maximum number of tests in rules.
int getMinRuleSize() Gets the minimum number of tests in rules.
int getNumAttributes() Gets the number of attributes that should be produced.
int getNumClasses() Gets the number of classes the dataset should have.
int getNumIrrelevant() Gets the number of irrelevant attributes.
int getNumNumeric() Gets the number of numerical attributes.
String[] getOptions() Gets the current settings of the data generator RDG1.
String getRevision() Returns the revision string.
boolean getSingleModeFlag() Gets the single mode flag.
boolean getVoteFlag() Gets the vote flag.
String globalInfo() Returns a string describing this data generator.
Enumeration<Option> listOptions() Returns an enumeration describing the available options.
static void main(String[] args) Main method for testing this class.
String maxRuleSizeTipText() Returns the tip text for this property.
String minRuleSizeTipText() Returns the tip text for this property.
String numAttributesTipText() Returns the tip text for this property.
String numClassesTipText() Returns the tip text for this property.
String numIrrelevantTipText() Returns the tip text for this property.
String numNumericTipText() Returns the tip text for this property.
void setAttList_Irr(boolean[] newAttList_Irr) Sets the array that defines which of the attributes are seen to be irrelevant.
void setMaxRuleSize(int newMaxRuleSize) Sets the maximum number of tests in rules.
void setMinRuleSize(int newMinRuleSize) Sets the minimum number of tests in rules.
void setNumAttributes(int numAttributes) Sets the number of attributes the dataset should have.
void setNumClasses(int numClasses) Sets the number of classes the dataset should have.
void setNumIrrelevant(int newNumIrrelevant) Sets the number of irrelevant attributes.
void setNumNumeric(int newNumNumeric) Sets the number of numerical attributes.
void setOptions(String[] options) Parses a list of options for this object.
void setVoteFlag(boolean newVoteFlag) Sets the vote flag.
String voteFlagTipText() Returns the tip text for this property.
Methods inherited from class weka.datagenerators.ClassificationGenerator
getNumExamples, numExamplesTipText, setNumExamples
Methods inherited from class weka.datagenerators.DataGenerator
debugTipText, defaultOutput, enumToVector, formatTipText, getDatasetFormat, getDebug, getEpilogue, getNumExamplesAct, getOutput, getPrologue, getRandom, getRelationName, getSeed, makeData, outputTipText, randomTipText, relationNameTipText, runDataGenerator, seedTipText, setDatasetFormat, setDebug, setOutput, setRandom, setRelationName, setSeed
-
Constructor Details
-
RDG1
public RDG1()
Initializes the generator with default values.
-
-
Method Details
-
globalInfo
Returns a string describing this data generator.
- Returns:
- a description of the data generator suitable for displaying in the explorer/experimenter gui
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class ClassificationGenerator
- Returns:
- an enumeration of all the available options
-
setOptions
Parses a list of options for this object. Valid options are:
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S <num> The seed for random function (default 1)
-n <num> The number of examples to generate (default 100)
-a <num> The number of attributes (default 10).
-c <num> The number of classes (default 2)
-R <num> maximum size for rules (default 10)
-M <num> minimum size for rules (default 1)
-I <num> number of irrelevant attributes (default 0)
-N <num> number of numeric attributes (default 0)
-V switch on voting (default is no voting)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class ClassificationGenerator
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
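Since setOptions takes the options as an array of strings, a command line such as the one in the example dataset header has to be split into flags and values first. The sketch below shows that layout in plain Java. It is a hypothetical helper (Rdg1OptionsSketch, toOptionArray and toMap are not part of Weka) and it does not reproduce Weka's own option parsing; it assumes that -h, -d and -V are the only switches without a value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper showing the flag/value layout of an option array. */
class Rdg1OptionsSketch {

    /** Splits a command-line style string into an array of option tokens. */
    static String[] toOptionArray(String commandLine) {
        return commandLine.trim().split("\\s+");
    }

    /** Maps each flag to its value; switches without a value map to "". */
    static Map<String, String> toMap(String[] options) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            // Assumption: only -h, -d and -V take no argument.
            boolean isSwitch = flag.equals("-h") || flag.equals("-d") || flag.equals("-V");
            if (isSwitch || i + 1 >= options.length) {
                result.put(flag, "");
            } else {
                result.put(flag, options[++i]);
            }
        }
        return result;
    }
}
```

With Weka on the classpath, the array returned by toOptionArray would be the kind of argument that setOptions(String[] options) accepts.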
-
getOptions
Gets the current settings of the data generator RDG1.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class ClassificationGenerator
- Returns:
- an array of strings suitable for passing to setOptions
- See Also:
-
DataGenerator.removeBlacklist(String[])
-
setNumAttributes
public void setNumAttributes(int numAttributes)
Sets the number of attributes the dataset should have.
- Parameters:
numAttributes - the new number of attributes
-
getNumAttributes
public int getNumAttributes()
Gets the number of attributes that should be produced.
- Returns:
- the number of attributes that should be produced
-
numAttributesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumClasses
public void setNumClasses(int numClasses)
Sets the number of classes the dataset should have.
- Parameters:
numClasses - the new number of classes
-
getNumClasses
public int getNumClasses()
Gets the number of classes the dataset should have.
- Returns:
- the number of classes the dataset should have
-
numClassesTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getMaxRuleSize
public int getMaxRuleSize()
Gets the maximum number of tests in rules.
- Returns:
- the maximum number of tests allowed in rules
-
setMaxRuleSize
public void setMaxRuleSize(int newMaxRuleSize)
Sets the maximum number of tests in rules.
- Parameters:
newMaxRuleSize - the new maximum number of tests allowed in rules.
-
maxRuleSizeTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getMinRuleSize
public int getMinRuleSize()
Gets the minimum number of tests in rules.
- Returns:
- the minimum number of tests allowed in rules
-
setMinRuleSize
public void setMinRuleSize(int newMinRuleSize)
Sets the minimum number of tests in rules.
- Parameters:
newMinRuleSize - the new minimum number of tests allowed in rules.
-
minRuleSizeTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getNumIrrelevant
public int getNumIrrelevant()
Gets the number of irrelevant attributes.
- Returns:
- the number of irrelevant attributes
-
setNumIrrelevant
public void setNumIrrelevant(int newNumIrrelevant)
Sets the number of irrelevant attributes.
- Parameters:
newNumIrrelevant - the number of irrelevant attributes.
-
numIrrelevantTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getNumNumeric
public int getNumNumeric()
Gets the number of numerical attributes.
- Returns:
- the number of numerical attributes.
-
setNumNumeric
public void setNumNumeric(int newNumNumeric)
Sets the number of numerical attributes.
- Parameters:
newNumNumeric - the number of numerical attributes.
-
numNumericTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getVoteFlag
public boolean getVoteFlag()
Gets the vote flag.
- Returns:
- voting flag.
-
setVoteFlag
public void setVoteFlag(boolean newVoteFlag)
Sets the vote flag.
- Parameters:
newVoteFlag - boolean with the new setting of the vote flag.
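When the vote flag is set, each instance is reclassified to the class value supported by the most rules. The following plain-Java sketch illustrates that counting step. It is not the RDG1 implementation: VotingSketch, Rule and vote are hypothetical names, and breaking ties in favour of the lowest class index is an assumption made here, since the documentation does not say how ties are resolved.

```java
import java.util.List;
import java.util.function.Predicate;

/** Illustrative majority vote over all rules that cover an instance. */
class VotingSketch {

    /** A rule is a coverage test plus the class it supports (hypothetical type). */
    static final class Rule {
        final Predicate<boolean[]> test;
        final int label;

        Rule(Predicate<boolean[]> test, int label) {
            this.test = test;
            this.label = label;
        }
    }

    /**
     * Returns the class supported by the most covering rules,
     * or -1 if no rule covers the instance.
     * Assumption: ties go to the lowest class index.
     */
    static int vote(List<Rule> rules, boolean[] instance, int numClasses) {
        int[] votes = new int[numClasses];
        boolean covered = false;
        for (Rule r : rules) {
            if (r.test.test(instance)) {
                votes[r.label]++;
                covered = true;
            }
        }
        if (!covered) {
            return -1;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (votes[c] > votes[best]) {
                best = c;
            }
        }
        return best;
    }
}
```

Unlike first-match classification, the vote depends on the complete rule list, which is consistent with the documented restriction that examples cannot be generated one by one when voting is chosen.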
-
voteFlagTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getSingleModeFlag
public boolean getSingleModeFlag()
Gets the single mode flag.
- Specified by:
getSingleModeFlag in class DataGenerator
- Returns:
- true if the method generateExample can be used.
-
getAttList_Irr
public boolean[] getAttList_Irr()
Gets the array that defines which of the attributes are seen to be irrelevant.
- Returns:
- the array that defines the irrelevant attributes
-
setAttList_Irr
public void setAttList_Irr(boolean[] newAttList_Irr)
Sets the array that defines which of the attributes are seen to be irrelevant.
- Parameters:
newAttList_Irr - array that defines the irrelevant attributes.
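The irrelevant attributes are described by a boolean array with one entry per attribute. As a rough illustration of how such an array could be filled for a given number of irrelevant attributes, the sketch below marks randomly chosen positions as true. The class and method names (IrrelevantMaskSketch, randomMask) are hypothetical, and the selection strategy is an assumption; the way RDG1 itself chooses irrelevant attributes is not described here.

```java
import java.util.Random;

/** Illustrative construction of a boolean mask of irrelevant attributes. */
class IrrelevantMaskSketch {

    /** Marks exactly numIrrelevant of numAttributes positions as true. */
    static boolean[] randomMask(int numAttributes, int numIrrelevant, Random rnd) {
        if (numIrrelevant < 0 || numIrrelevant > numAttributes) {
            throw new IllegalArgumentException(
                "numIrrelevant must be between 0 and numAttributes");
        }
        boolean[] mask = new boolean[numAttributes];
        int marked = 0;
        // Draw positions until enough distinct ones have been marked.
        while (marked < numIrrelevant) {
            int i = rnd.nextInt(numAttributes);
            if (!mask[i]) {
                mask[i] = true;
                marked++;
            }
        }
        return mask;
    }
}
```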
-
attList_IrrTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
defineDataFormat
Initializes the format for the dataset produced.
- Overrides:
defineDataFormat in class DataGenerator
- Returns:
- the output data format
- Throws:
Exception - if the data format could not be defined
- See Also:
-
DataGenerator.defaultRelationName()
-
generateExample
Generates one example of the dataset.
- Specified by:
generateExample in class DataGenerator
- Returns:
- the instance generated
- Throws:
Exception - if the format is not defined, or if examples cannot be generated one by one because voting is chosen
-
generateExamples
Generates all examples of the dataset.
- Specified by:
generateExamples in class DataGenerator
- Returns:
- the instances generated
- Throws:
Exception - if the format is not defined, or if examples cannot be generated one by one because voting is chosen
-
generateExamples
Generates all examples of the dataset.
- Parameters:
num - the number of examples to generate
random - the random number generator to use
format - the dataset format
- Returns:
- the instances generated
- Throws:
Exception - if the format is not defined, or if examples cannot be generated one by one because voting is chosen
-
generateStart
Generates a comment string that documents the data generator. By default this string is added at the beginning of the produced ARFF output, immediately after the options.
- Specified by:
generateStart in class DataGenerator
- Returns:
- a string containing information about the generated rules
-
generateFinished
Compiles documentation about the data generation. This is the number of irrelevant attributes and the decision list with all rules. Because the decision list may be extended until the last instance is generated, this method should be called at the end of the data generation process.
- Specified by:
generateFinished in class DataGenerator
- Returns:
- string with additional information about generated dataset
- Throws:
Exception - if no input structure has been defined
-
getRevision
Returns the revision string.
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
args - should contain arguments for the data producer
-