All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler

public class RandomRBF extends ClassificationGenerator
RandomRBF data is generated by first creating a random set of centers for each class. Each center is randomly assigned a weight, a central point per attribute, and a standard deviation. To generate new instances, a center is chosen at random taking the weights of each center into consideration. Attribute values are randomly generated and offset from the center, where the overall vector has been scaled so that its length equals a value sampled randomly from the Gaussian distribution of the center. The particular center chosen determines the class of the instance.
RandomRBF data contains only numeric attributes as it is non-trivial to include nominal values.
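
The generation procedure described above can be sketched in plain Java. This is an illustrative sketch only, not the actual RandomRBF implementation: the class RandomRbfSketch, its nested Center type, and all method names are hypothetical, and several details are assumptions (each center receives a random class label, center coordinates and standard deviations are drawn uniformly from [0, 1), and a center is picked with probability proportional to its weight).

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch of the described procedure; not Weka's implementation. */
final class RandomRbfSketch {

    /** Hypothetical container for one center. */
    static final class Center {
        final double[] point;  // central point, one value per attribute
        final double stdDev;   // standard deviation of the center
        final double weight;   // relative chance of being chosen (assumption)
        final int classLabel;  // class assigned to this center (assumption: random)

        Center(double[] point, double stdDev, double weight, int classLabel) {
            this.point = point;
            this.stdDev = stdDev;
            this.weight = weight;
            this.classLabel = classLabel;
        }
    }

    private final Center[] centers;
    private final double totalWeight;
    private final Random random;

    RandomRbfSketch(int numAttributes, int numClasses, int numCentroids, long seed) {
        random = new Random(seed);
        centers = new Center[numCentroids];
        double sum = 0.0;
        for (int i = 0; i < numCentroids; i++) {
            double[] point = new double[numAttributes];
            for (int j = 0; j < numAttributes; j++) {
                point[j] = random.nextDouble();
            }
            double stdDev = random.nextDouble();
            double weight = random.nextDouble();
            int classLabel = random.nextInt(numClasses);
            centers[i] = new Center(point, stdDev, weight, classLabel);
            sum += weight;
        }
        totalWeight = sum;
    }

    /** Picks a center with probability proportional to its weight. */
    Center chooseCenter() {
        double target = random.nextDouble() * totalWeight;
        double cumulative = 0.0;
        for (Center c : centers) {
            cumulative += c.weight;
            if (target < cumulative) {
                return c;
            }
        }
        return centers[centers.length - 1]; // guards against rounding error
    }

    /** Returns the attribute values followed by the class label. */
    double[] generateExample() {
        Center c = chooseCenter();
        int n = c.point.length;
        double[] offset = new double[n];
        double sumSquares = 0.0;
        for (int j = 0; j < n; j++) {
            offset[j] = random.nextDouble() * 2.0 - 1.0; // random direction
            sumSquares += offset[j] * offset[j];
        }
        double length = Math.sqrt(sumSquares);
        // Desired vector length, sampled from the center's Gaussian;
        // a negative sample simply flips the direction of the offset.
        double desired = random.nextGaussian() * c.stdDev;
        double scale = (length == 0.0) ? 0.0 : desired / length;
        double[] row = new double[n + 1];
        for (int j = 0; j < n; j++) {
            row[j] = c.point[j] + offset[j] * scale;
        }
        row[n] = c.classLabel; // the chosen center determines the class
        return row;
    }

    public static void main(String[] args) {
        RandomRbfSketch sketch = new RandomRbfSketch(10, 2, 50, 1L);
        System.out.println(Arrays.toString(sketch.generateExample()));
    }
}
```

Each returned row holds the numeric attribute values followed by the class label as a number, which mirrors the statement that only numeric attributes are produced.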

Valid options are:

 -h
  Prints this help.
 
 -o <file>
  The name of the output file, otherwise the generated data is
  printed to stdout.
 
 -r <name>
  The name of the relation.
 
 -d
  Whether to print debug information.
 
 -S
  The seed for random function (default 1)
 
 -n <num>
  The number of examples to generate (default 100)
 
 -a <num>
  The number of attributes (default 10).
 
 -c <num>
  The number of classes (default 2)
 
 -C <num>
  The number of centroids to use. (default 50)
 
Version:
$Revision: 10203 $
Author:
Richard Kirkby (rkirkby at cs dot waikato dot ac dot nz), FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • RandomRBF

      public RandomRBF()
      initializes the generator with default values
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing this data generator.
      Returns:
      a description of the data generator suitable for displaying in the explorer/experimenter GUI
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Overrides:
      listOptions in class ClassificationGenerator
      Returns:
      an enumeration of all the available options
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a list of options for this object.

      Valid options are:

       -h
        Prints this help.
       
       -o <file>
        The name of the output file, otherwise the generated data is
        printed to stdout.
       
       -r <name>
        The name of the relation.
       
       -d
        Whether to print debug information.
       
       -S
        The seed for random function (default 1)
       
       -n <num>
        The number of examples to generate (default 100)
       
       -a <num>
        The number of attributes (default 10).
       
       -c <num>
        The number of classes (default 2)
       
       -C <num>
        The number of centroids to use. (default 50)
       
      Specified by:
      setOptions in interface OptionHandler
      Overrides:
      setOptions in class ClassificationGenerator
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
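
The options array expected by setOptions is a flat list of flags and their values. The following is a minimal sketch of how such an array could be assembled; buildOptions and RandomRbfOptionsSketch are hypothetical names, and the sketch assumes that -S takes a numeric value like the other numeric options, even though the option list above shows it without a placeholder.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for assembling an options array; not part of Weka. */
final class RandomRbfOptionsSketch {

    /** Builds flag/value pairs for the numeric options listed above. */
    static String[] buildOptions(int seed, int numExamples, int numAttributes,
                                 int numClasses, int numCentroids) {
        List<String> options = new ArrayList<>();
        options.add("-S"); // assumption: the seed flag is followed by its value
        options.add(Integer.toString(seed));
        options.add("-n");
        options.add(Integer.toString(numExamples));
        options.add("-a");
        options.add(Integer.toString(numAttributes));
        options.add("-c");
        options.add(Integer.toString(numClasses));
        options.add("-C");
        options.add(Integer.toString(numCentroids));
        return options.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Uses the documented default values.
        System.out.println(String.join(" ", buildOptions(1, 100, 10, 2, 50)));
    }
}
```

Note that -c (classes) and -C (centroids) differ only in case, so the flags must be passed exactly as written.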
    • getOptions

      public String[] getOptions()
      Gets the current settings of the data generator.
      Specified by:
      getOptions in interface OptionHandler
      Overrides:
      getOptions in class ClassificationGenerator
      Returns:
      an array of strings suitable for passing to setOptions
      See Also:
      • DataGenerator.removeBlacklist(String[])
    • setNumAttributes

      public void setNumAttributes(int numAttributes)
      Sets the number of attributes the dataset should have.
      Parameters:
      numAttributes - the new number of attributes
    • getNumAttributes

      public int getNumAttributes()
      Gets the number of attributes that should be produced.
      Returns:
      the number of attributes that should be produced
    • numAttributesTipText

      public String numAttributesTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter GUI
    • setNumClasses

      public void setNumClasses(int numClasses)
      Sets the number of classes the dataset should have.
      Parameters:
      numClasses - the new number of classes
    • getNumClasses

      public int getNumClasses()
      Gets the number of classes the dataset should have.
      Returns:
      the number of classes the dataset should have
    • numClassesTipText

      public String numClassesTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter GUI
    • getNumCentroids

      public int getNumCentroids()
      Gets the number of centroids.
      Returns:
      the number of centroids.
    • setNumCentroids

      public void setNumCentroids(int value)
      Sets the number of centroids to use.
      Parameters:
      value - the number of centroids to use.
    • numCentroidsTipText

      public String numCentroidsTipText()
      Returns the tip text for this property.
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter GUI
    • getSingleModeFlag

      public boolean getSingleModeFlag() throws Exception
      Returns whether single mode is set for this data generator. The mode depends on the option settings and/or the generator type.
      Specified by:
      getSingleModeFlag in class DataGenerator
      Returns:
      single mode flag
      Throws:
      Exception - if mode is not set yet
    • defineDataFormat

      public Instances defineDataFormat() throws Exception
      Initializes the format for the dataset produced. Must be called before the generateExample or generateExamples methods are used. Re-initializes the random number generator with the given seed.
      Overrides:
      defineDataFormat in class DataGenerator
      Returns:
      the format for the dataset
      Throws:
      Exception - if the generating of the format failed
    • generateExample

      public Instance generateExample() throws Exception
      Generates one example of the dataset.
      Specified by:
      generateExample in class DataGenerator
      Returns:
      the generated example
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExamples which means in non single mode
    • generateExamples

      public Instances generateExamples() throws Exception
      Generates all examples of the dataset. Re-initializes the random number generator with the given seed, before generating instances.
      Specified by:
      generateExamples in class DataGenerator
      Returns:
      the generated dataset
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExample, which means in single mode
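
Because the random number generator is re-initialized with the configured seed before generation, repeated batch runs with the same seed should yield the same data. The sketch below illustrates that principle with java.util.Random only; ReseedSketch and generateAll are hypothetical names and do not reproduce the RandomRBF code.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrates re-seeding before batch generation; not Weka's implementation. */
final class ReseedSketch {
    private final long seed;
    private Random random;

    ReseedSketch(long seed) {
        this.seed = seed;
        this.random = new Random(seed);
    }

    /** Re-seeds first, so every call starts from the same random sequence. */
    double[] generateAll(int count) {
        random = new Random(seed);
        double[] values = new double[count];
        for (int i = 0; i < count; i++) {
            values[i] = random.nextGaussian();
        }
        return values;
    }

    public static void main(String[] args) {
        ReseedSketch sketch = new ReseedSketch(1L);
        double[] first = sketch.generateAll(5);
        double[] second = sketch.generateAll(5);
        // Both runs start from the same seed, so the arrays match.
        System.out.println(Arrays.equals(first, second));
    }
}
```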
    • generateStart

      public String generateStart()
      Generates a comment string that documents the data generator. By default this string is added at the beginning of the produced output in ARFF format, directly after the options.
      Specified by:
      generateStart in class DataGenerator
      Returns:
      a string containing info about the generated rules
    • generateFinished

      public String generateFinished() throws Exception
      Generates a comment string that documents the data generator. By default this string is added at the end of the produced output in ARFF format.
      Specified by:
      generateFinished in class DataGenerator
      Returns:
      a string containing info about the generated rules
      Throws:
      Exception - if generating the documentation fails
    • getRevision

      public String getRevision()
      Returns the revision string.
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Main method for executing this class.
      Parameters:
      args - should contain arguments for the data producer: