All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler

public class RDG1 extends ClassificationGenerator
A data generator that produces data randomly by producing a decision list.
The decision list consists of rules.
Instances are generated randomly one by one. If the decision list fails to classify the current instance, a new rule based on this instance is generated and added to the decision list.
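
The generation loop described above can be sketched in plain Java. This is a minimal illustration, not the RDG1 source: it handles boolean attributes only, and the class DecisionListSketch, its members, and the way a new rule picks its tests and its class are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of "generate an instance, extend the decision list if needed".
class DecisionListSketch {
    /** A test on one boolean attribute: "A" or, when negated, "NOT A". */
    static final class AttrTest {
        final int attr;
        final boolean negated;
        AttrTest(int attr, boolean negated) { this.attr = attr; this.negated = negated; }
        boolean holds(boolean[] x) { return x[attr] != negated; }
    }

    /** A rule assigns its class when all of its tests hold. */
    static final class Rule {
        final List<AttrTest> tests;
        final int classValue;
        Rule(List<AttrTest> tests, int classValue) { this.tests = tests; this.classValue = classValue; }
        boolean covers(boolean[] x) {
            for (AttrTest t : tests) if (!t.holds(x)) return false;
            return true;
        }
    }

    final List<Rule> rules = new ArrayList<>();
    final Random random;
    final int numAttributes, numClasses, minRuleSize, maxRuleSize;

    DecisionListSketch(int numAttributes, int numClasses, int minRuleSize, int maxRuleSize, long seed) {
        this.numAttributes = numAttributes;
        this.numClasses = numClasses;
        this.minRuleSize = minRuleSize;
        this.maxRuleSize = maxRuleSize;
        this.random = new Random(seed);
    }

    /** Returns the class of the first rule that covers x, or -1 if no rule does. */
    int classify(boolean[] x) {
        for (Rule r : rules) if (r.covers(x)) return r.classValue;
        return -1;
    }

    /** Builds a rule that covers x (assumed strategy: random attributes, random class). */
    Rule newRuleFor(boolean[] x) {
        int size = minRuleSize + random.nextInt(maxRuleSize - minRuleSize + 1);
        size = Math.min(size, numAttributes);
        List<Integer> attrs = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) attrs.add(i);
        Collections.shuffle(attrs, random);
        List<AttrTest> tests = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            int a = attrs.get(i);
            tests.add(new AttrTest(a, !x[a])); // negate when x[a] is false so the test holds for x
        }
        return new Rule(tests, random.nextInt(numClasses));
    }

    /** Fills x with random values and returns its class, adding a rule when none applies. */
    int generate(boolean[] x) {
        for (int i = 0; i < x.length; i++) x[i] = random.nextBoolean();
        int c = classify(x);
        if (c < 0) {
            Rule r = newRuleFor(x);
            rules.add(r);
            c = r.classValue;
        }
        return c;
    }

    public static void main(String[] args) {
        DecisionListSketch g = new DecisionListSketch(10, 2, 1, 10, 1L);
        for (int i = 0; i < 100; i++) g.generate(new boolean[10]);
        System.out.println("rules generated: " + g.rules.size());
    }
}
```

Because a new rule is appended only when no earlier rule covers the instance, re-classifying the instance afterwards returns the class of that new rule.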

The option -V switches on voting, which means that at the end of the generation all instances are reclassified to the class value that is supported by the most rules.
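
As a rough illustration of voting, the sketch below counts, for one instance, how many rules that cover it support each class and returns the class with the most support. The class VotingSketch, the fallback parameter, and the tie-breaking choice (lowest class index wins) are assumptions for this example; the text above does not say how RDG1 resolves ties.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of reclassification by majority vote over covering rules.
class VotingSketch {
    static final class Rule {
        final Predicate<double[]> condition; // true when all tests of the rule hold
        final int classValue;
        Rule(Predicate<double[]> condition, int classValue) {
            this.condition = condition;
            this.classValue = classValue;
        }
    }

    /** Returns the class supported by the most covering rules, or fallback if none covers x. */
    static int vote(List<Rule> rules, double[] x, int numClasses, int fallback) {
        int[] votes = new int[numClasses];
        for (Rule r : rules) {
            if (r.condition.test(x)) votes[r.classValue]++;
        }
        int best = -1;
        for (int c = 0; c < numClasses; c++) {
            // strict '>' keeps the lowest class index on ties (assumed tie-break)
            if (votes[c] > 0 && (best < 0 || votes[c] > votes[best])) best = c;
        }
        return best < 0 ? fallback : best;
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
            new Rule(x -> x[0] < 0.9, 0),
            new Rule(x -> x[0] < 0.8, 1),
            new Rule(x -> x[0] >= 0.1, 1));
        // The first matching rule says class 0, but two of three covering rules support class 1.
        System.out.println(vote(rules, new double[] {0.5}, 2, -1)); // prints 1
    }
}
```

Voting needs the complete decision list, which is only known once the last instance has been generated. This is consistent with generateExample refusing to produce instances one by one when voting is chosen.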

This data generator can generate 'boolean' attributes (= nominal with the values {true, false}) and numeric attributes. The rules can be 'A' or 'NOT A' for boolean values and 'B < random_value' or 'B >= random_value' for numeric values.
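
The four test forms can be modelled with a single small class, as sketched below. RuleTestSketch is a hypothetical name, and two details are assumptions made for the example: boolean values are encoded as 1.0 (true) and 0.0 (false), and random thresholds are drawn from [0, 1).

```java
import java.util.Locale;
import java.util.Random;

// Hypothetical sketch of the test forms 'A', 'NOT A', 'B < v' and 'B >= v'.
class RuleTestSketch {
    final int attr;        // attribute index
    final boolean numeric; // true for a numeric attribute, false for a boolean one
    final boolean negated; // boolean: 'NOT A'; numeric: 'B >= split' instead of 'B < split'
    final double split;    // threshold, used only for numeric attributes

    RuleTestSketch(int attr, boolean numeric, boolean negated, double split) {
        this.attr = attr;
        this.numeric = numeric;
        this.negated = negated;
        this.split = split;
    }

    /** Draws a random test for the given attribute (threshold range is an assumption). */
    static RuleTestSketch random(int attr, boolean numeric, Random rnd) {
        boolean negated = rnd.nextBoolean();
        double split = numeric ? rnd.nextDouble() : Double.NaN;
        return new RuleTestSketch(attr, numeric, negated, split);
    }

    /** Evaluates the test on an instance given as an array of doubles. */
    boolean holds(double[] x) {
        boolean base = numeric ? x[attr] < split : x[attr] == 1.0;
        return base != negated;
    }

    @Override
    public String toString() {
        String name = "a" + attr;
        if (numeric) {
            return String.format(Locale.ROOT, "%s %s %.3f", name, negated ? ">=" : "<", split);
        }
        return negated ? "not(" + name + ")" : name;
    }

    public static void main(String[] args) {
        System.out.println(new RuleTestSketch(1, true, false, 0.986));     // prints a1 < 0.986
        System.out.println(new RuleTestSketch(0, false, true, Double.NaN)); // prints not(a0)
    }
}
```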

Valid options are:

 -h
  Prints this help.
 
 -o <file>
  The name of the output file, otherwise the generated data is
  printed to stdout.
 
 -r <name>
  The name of the relation.
 
 -d
  Whether to print debug information.
 
 -S <num>
  The seed for the random function (default 1)
 
 -n <num>
  The number of examples to generate (default 100)
 
 -a <num>
  The number of attributes (default 10).
 
 -c <num>
  The number of classes (default 2)
 
 -R <num>
  maximum size for rules (default 10)
 
 -M <num>
  minimum size for rules (default 1)
 
 -I <num>
  number of irrelevant attributes (default 0)
 
 -N <num>
  number of numeric attributes (default 0)
 
 -V
  switch on voting (default is no voting)
 
The following is an example of a generated dataset:
 %
 % weka.datagenerators.RDG1 -r expl -a 2 -c 3 -n 4 -N 1 -I 0 -M 2 -R 10 -S 2
 %
 @relation expl
 
 @attribute a0 {false,true}
 @attribute a1 numeric
 @attribute class {c0,c1,c2}
 
 @data
 
 true,0.496823,c0
 false,0.743158,c1
 false,0.408285,c1
 false,0.993687,c2
 %
 % Number of attributes chosen as irrelevant = 0
 %
 % DECISIONLIST (number of rules = 3):
 % RULE 0:   c0 := a1 < 0.986, a0
 % RULE 1:   c1 := a1 < 0.95, not(a0)
 % RULE 2:   c2 := not(a0), a1 >= 0.562
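
To check the example by hand, the three rules listed above can be replayed against the four data rows, taking the first rule whose tests all hold. The class ExampleReplay and its helper types are hypothetical and exist only for this walkthrough; the thresholds and instance values are copied from the output above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical replay of the example decision list (first matching rule wins).
class ExampleReplay {
    interface Condition {
        boolean test(boolean a0, double a1);
    }

    static final class Rule {
        final String label;
        final Condition condition;
        Rule(String label, Condition condition) {
            this.label = label;
            this.condition = condition;
        }
    }

    static final List<Rule> RULES = Arrays.asList(
        new Rule("c0", (a0, a1) -> a1 < 0.986 && a0),    // RULE 0
        new Rule("c1", (a0, a1) -> a1 < 0.95 && !a0),    // RULE 1
        new Rule("c2", (a0, a1) -> !a0 && a1 >= 0.562)); // RULE 2

    /** Returns the label of the first rule covering the instance, or null if none does. */
    static String classify(boolean a0, double a1) {
        for (Rule r : RULES) {
            if (r.condition.test(a0, a1)) return r.label;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, 0.496823));  // c0
        System.out.println(classify(false, 0.743158)); // c1
        System.out.println(classify(false, 0.408285)); // c1
        System.out.println(classify(false, 0.993687)); // c2
    }
}
```

The labels match the class column of the generated data. The last instance fails RULE 0 and RULE 1 because 0.993687 is above both thresholds, so it falls through to RULE 2.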
 
Version:
$Revision: 10203 $
Author:
Gabi Schmidberger (gabi@cs.waikato.ac.nz)
  • Constructor Details

    • RDG1

      public RDG1()
      initializes the generator with default values
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this data generator.
      Returns:
      a description of the data generator suitable for displaying in the Explorer/Experimenter GUI
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class ClassificationGenerator
      Returns:
      an enumeration of all the available options
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a list of options for this object.

      Valid options are:

       -h
        Prints this help.
       
       -o <file>
        The name of the output file, otherwise the generated data is
        printed to stdout.
       
       -r <name>
        The name of the relation.
       
       -d
        Whether to print debug information.
       
       -S <num>
        The seed for the random function (default 1)
       
       -n <num>
        The number of examples to generate (default 100)
       
       -a <num>
        The number of attributes (default 10).
       
       -c <num>
        The number of classes (default 2)
       
       -R <num>
        maximum size for rules (default 10)
       
       -M <num>
        minimum size for rules (default 1)
       
       -I <num>
        number of irrelevant attributes (default 0)
       
       -N <num>
        number of numeric attributes (default 0)
       
       -V
        switch on voting (default is no voting)
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class ClassificationGenerator
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the data generator RDG1.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class ClassificationGenerator
      Returns:
      an array of strings suitable for passing to setOptions
      See Also:
      • DataGenerator.removeBlacklist(String[])
    • setNumAttributes

      public void setNumAttributes(int numAttributes)
      Sets the number of attributes the dataset should have.
      Parameters:
      numAttributes - the new number of attributes
    • getNumAttributes

      public int getNumAttributes()
      Gets the number of attributes that should be produced.
      Returns:
      the number of attributes that should be produced
    • numAttributesTipText

      public String numAttributesTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • setNumClasses

      public void setNumClasses(int numClasses)
      Sets the number of classes the dataset should have.
      Parameters:
      numClasses - the new number of classes
    • getNumClasses

      public int getNumClasses()
      Gets the number of classes the dataset should have.
      Returns:
      the number of classes the dataset should have
    • numClassesTipText

      public String numClassesTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getMaxRuleSize

      public int getMaxRuleSize()
      Gets the maximum number of tests in rules.
      Returns:
      the maximum number of tests allowed in rules
    • setMaxRuleSize

      public void setMaxRuleSize(int newMaxRuleSize)
      Sets the maximum number of tests in rules.
      Parameters:
      newMaxRuleSize - new maximum number of tests allowed in rules.
    • maxRuleSizeTipText

      public String maxRuleSizeTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getMinRuleSize

      public int getMinRuleSize()
      Gets the minimum number of tests in rules.
      Returns:
      the minimum number of tests allowed in rules
    • setMinRuleSize

      public void setMinRuleSize(int newMinRuleSize)
      Sets the minimum number of tests in rules.
      Parameters:
      newMinRuleSize - new minimum number of tests allowed in rules.
    • minRuleSizeTipText

      public String minRuleSizeTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getNumIrrelevant

      public int getNumIrrelevant()
      Gets the number of irrelevant attributes.
      Returns:
      the number of irrelevant attributes
    • setNumIrrelevant

      public void setNumIrrelevant(int newNumIrrelevant)
      Sets the number of irrelevant attributes.
      Parameters:
      newNumIrrelevant - the number of irrelevant attributes.
    • numIrrelevantTipText

      public String numIrrelevantTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getNumNumeric

      public int getNumNumeric()
      Gets the number of numerical attributes.
      Returns:
      the number of numerical attributes.
    • setNumNumeric

      public void setNumNumeric(int newNumNumeric)
      Sets the number of numerical attributes.
      Parameters:
      newNumNumeric - the number of numerical attributes.
    • numNumericTipText

      public String numNumericTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getVoteFlag

      public boolean getVoteFlag()
      Gets the vote flag.
      Returns:
      voting flag.
    • setVoteFlag

      public void setVoteFlag(boolean newVoteFlag)
      Sets the vote flag.
      Parameters:
      newVoteFlag - boolean with the new setting of the vote flag.
    • voteFlagTipText

      public String voteFlagTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • getSingleModeFlag

      public boolean getSingleModeFlag()
      Gets the single mode flag.
      Specified by:
      getSingleModeFlag in class DataGenerator
      Returns:
      true if the method generateExample can be used.
    • getAttList_Irr

      public boolean[] getAttList_Irr()
      Gets the array that defines which of the attributes are seen to be irrelevant.
      Returns:
      the array that defines the irrelevant attributes
    • setAttList_Irr

      public void setAttList_Irr(boolean[] newAttList_Irr)
      Sets the array that defines which of the attributes are seen to be irrelevant.
      Parameters:
      newAttList_Irr - array that defines the irrelevant attributes.
    • attList_IrrTipText

      public String attList_IrrTipText()
      Returns the tip text for this property.
      Returns:
      the tip text for this property, suitable for displaying in the Explorer/Experimenter GUI
    • defineDataFormat

      public Instances defineDataFormat() throws Exception
      Initializes the format for the dataset produced.
      Overrides:
      defineDataFormat in class DataGenerator
      Returns:
      the output data format
      Throws:
      Exception - data format could not be defined
      See Also:
      • DataGenerator.defaultRelationName()
    • generateExample

      public Instance generateExample() throws Exception
      Generates an example of the dataset.
      Specified by:
      generateExample in class DataGenerator
      Returns:
      the instance generated
      Throws:
      Exception - if the format is not defined, or if generating
      examples one by one is not possible because voting is chosen
    • generateExamples

      public Instances generateExamples() throws Exception
      Generates all examples of the dataset.
      Specified by:
      generateExamples in class DataGenerator
      Returns:
      the instances generated
      Throws:
      Exception - if the format is not defined, or if generating
      examples one by one is not possible because voting is chosen
    • generateExamples

      public Instances generateExamples(int num, Random random, Instances format) throws Exception
      Generates all examples of the dataset.
      Parameters:
      num - the number of examples to generate
      random - the random number generator to use
      format - the dataset format
      Returns:
      the instances generated
      Throws:
      Exception - if the format is not defined, or if generating
      examples one by one is not possible because voting is chosen
    • generateStart

      public String generateStart()
      Generates a comment string that documents the data generator. By default, this string is added at the beginning of the ARFF output, directly after the options.
      Specified by:
      generateStart in class DataGenerator
      Returns:
      a string containing information about the generated rules
    • generateFinished

      public String generateFinished() throws Exception
      Compiles documentation about the data generation: the number of irrelevant attributes and the decision list with all its rules. Because the decision list may be extended until the last instance is generated, this method should be called at the end of the data generation process.
      Specified by:
      generateFinished in class DataGenerator
      Returns:
      string with additional information about generated dataset
      Throws:
      Exception - no input structure has been defined
    • getRevision

      public String getRevision()
      Returns the revision string.
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Main method for testing this class.
      Parameters:
      args - should contain arguments for the data producer: