Package weka.core

Class TestInstances

java.lang.Object
weka.core.TestInstances
All Implemented Interfaces:
Serializable, Cloneable, OptionHandler, RevisionHandler

public class TestInstances extends Object implements Cloneable, Serializable, OptionHandler, RevisionHandler
Generates artificial datasets for testing. For multi-instance data, the settings for the number of attributes apply to the data inside the bag. Originally based on code from CheckClassifier.

Valid options are:

 -relation <name>
  The name of the data set.
 
 -seed <num>
  The seed value.
 
 -num-instances <num>
  The number of instances in the datasets (default 20).
 
 -class-type <num>
  The class type, see constants in weka.core.Attribute
  (default 1=nominal).
 
 -class-values <num>
  The number of classes to generate (for nominal classes only)
  (default 2).
 
 -class-index <num>
  The class index, with -1=last, (default -1).
 
 -no-class
  Doesn't include a class attribute in the output.
 
 -nominal <num>
  The number of nominal attributes (default 1).
 
 -nominal-values <num>
  The number of values for nominal attributes (default 2).
 
 -numeric <num>
  The number of numeric attributes (default 0).
 
 -string <num>
  The number of string attributes (default 0).
 
 -words <comma-separated-list>
  The words to use in string attributes.
 
 -word-separators <chars>
  The word separators to use in string attributes.
 
 -date <num>
  The number of date attributes (default 0).
 
 -relational <num>
  The number of relational attributes (default 0).
 
 -relational-nominal <num>
  The number of nominal attributes in a rel. attribute (default 1).
 
 -relational-nominal-values <num>
  The number of values for nominal attributes in a rel. attribute (default 2).
 
 -relational-numeric <num>
  The number of numeric attributes in a rel. attribute (default 0).
 
 -relational-string <num>
  The number of string attributes in a rel. attribute (default 0).
 
 -relational-date <num>
  The number of date attributes in a rel. attribute (default 0).
 
 -num-instances-relational <num>
  The number of instances in relational/bag attributes (default 10).
 
 -multi-instance
  Generates multi-instance data.
 
 -W <classname>
  The Capabilities handler to base the dataset on.
  The other parameters can be used to override the ones
  determined from the handler. Additional parameters for
  handler can be passed on after the '--'.
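
The options above follow the usual OptionHandler convention: each flag is one array element, and its value (if any) is the next element. The following is a minimal sketch of assembling such an array for a handful of the listed options. The class TestInstancesOptionsSketch and its buildOptions method are hypothetical helpers for illustration only and are not part of Weka; only the flag names are taken from the list above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Weka) that assembles an option array. */
final class TestInstancesOptionsSketch {

    /** Builds flag/value pairs for a few of the options listed above. */
    static String[] buildOptions(int seed, int numInstances, int numNominal,
                                 int numNumeric, boolean noClass) {
        List<String> opts = new ArrayList<>();
        opts.add("-seed");
        opts.add(Integer.toString(seed));
        opts.add("-num-instances");
        opts.add(Integer.toString(numInstances));
        opts.add("-nominal");
        opts.add(Integer.toString(numNominal));
        opts.add("-numeric");
        opts.add(Integer.toString(numNumeric));
        if (noClass) {
            opts.add("-no-class"); // a flag that takes no value
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] opts = buildOptions(42, 50, 2, 3, false);
        // prints: -seed 42 -num-instances 50 -nominal 2 -numeric 3
        System.out.println(String.join(" ", opts));
        // With Weka on the classpath, the array could then be applied via
        // the methods documented on this page:
        //   TestInstances ti = new TestInstances();
        //   ti.setOptions(opts);
        //   Instances data = ti.generate();
    }
}
```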
 
Version:
$Revision: 14293 $
Author:
FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • TestInstances

      public TestInstances()
      the default constructor
  • Method Details

    • clone

      public Object clone()
      creates a clone of the current object
      Returns:
      a clone of the current object
    • assign

      public void assign(TestInstances t)
      updates itself with all the settings from the given TestInstances object
      Parameters:
      t - the object to get the settings from
    • listOptions

      public Enumeration<Option> listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Returns:
      an enumeration of all the available options.
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a given list of options.
      Specified by:
      setOptions in interface OptionHandler
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of this object.
      Specified by:
      getOptions in interface OptionHandler
      Returns:
      an array of strings suitable for passing to setOptions
    • setRelation

      public void setRelation(String value)
      sets the name of the relation
      Parameters:
      value - the name of the relation
    • getRelation

      public String getRelation()
      returns the current name of the relation
      Returns:
      the name of the relation
    • setSeed

      public void setSeed(int value)
      sets the seed value for the random number generator
      Parameters:
      value - the seed
    • getSeed

      public int getSeed()
      returns the current seed value
      Returns:
      the seed value
    • setNumInstances

      public void setNumInstances(int value)
      sets the number of instances to produce
      Parameters:
      value - the number of instances
    • getNumInstances

      public int getNumInstances()
      returns the current number of instances to produce
      Returns:
      the number of instances
    • setClassType

      public void setClassType(int value)
      sets the class attribute type
      Parameters:
      value - the class attribute type
    • getClassType

      public int getClassType()
      returns the current class type
      Returns:
      the class attribute type
    • setNumClasses

      public void setNumClasses(int value)
      sets the number of classes
      Parameters:
      value - the number of classes
    • getNumClasses

      public int getNumClasses()
      returns the current number of classes
      Returns:
      the number of classes
    • setClassIndex

      public void setClassIndex(int value)
      sets the class index (0-based)
      Parameters:
      value - the class index
    • getClassIndex

      public int getClassIndex()
      returns the current class index (0-based), -1 is last attribute
      Returns:
      the class index
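
To illustrate the "-1 is last attribute" convention, here is a minimal sketch of how a caller might translate the stored class index into a concrete 0-based position. The class ClassIndexSketch and the method resolveClassIndex are hypothetical and only mirror the convention described here; they are not Weka's implementation.

```java
/** Hypothetical illustration of the "-1 = last attribute" convention. */
final class ClassIndexSketch {

    /**
     * Maps a configured class index to a concrete 0-based position.
     *
     * @param classIndex    the configured index, where -1 means "last"
     * @param numAttributes the total number of attributes in the dataset
     */
    static int resolveClassIndex(int classIndex, int numAttributes) {
        if (numAttributes < 1) {
            throw new IllegalArgumentException("dataset has no attributes");
        }
        if (classIndex == -1) {
            return numAttributes - 1; // -1 selects the last attribute
        }
        if (classIndex < 0 || classIndex >= numAttributes) {
            throw new IllegalArgumentException("class index out of range: " + classIndex);
        }
        return classIndex; // already a valid 0-based index
    }

    public static void main(String[] args) {
        System.out.println(resolveClassIndex(-1, 5)); // prints 4
        System.out.println(resolveClassIndex(2, 5));  // prints 2
    }
}
```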
    • setNoClass

      public void setNoClass(boolean value)
      sets whether to generate data without a class attribute, e.g., for clusterers; if false, the class index is set to the last attribute
      Parameters:
      value - whether to have no class
    • getNoClass

      public boolean getNoClass()
      whether no class attribute is generated
      Returns:
      true if no class attribute is generated
    • setNumNominal

      public void setNumNominal(int value)
      sets the number of nominal attributes
      Parameters:
      value - the number of nominal attributes
    • getNumNominal

      public int getNumNominal()
      returns the current number of nominal attributes
      Returns:
      the number of nominal attributes
    • setNumNominalValues

      public void setNumNominalValues(int value)
      sets the number of values for nominal attributes
      Parameters:
      value - the number of values
    • getNumNominalValues

      public int getNumNominalValues()
      returns the current number of values for nominal attributes
      Returns:
      the number of values
    • setNumNumeric

      public void setNumNumeric(int value)
      sets the number of numeric attributes
      Parameters:
      value - the number of numeric attributes
    • getNumNumeric

      public int getNumNumeric()
      returns the current number of numeric attributes
      Returns:
      the number of numeric attributes
    • setNumString

      public void setNumString(int value)
      sets the number of string attributes
      Parameters:
      value - the number of string attributes
    • getNumString

      public int getNumString()
      returns the current number of string attributes
      Returns:
      the number of string attributes
    • setWords

      public void setWords(String value)
      Sets the comma-separated list of words to use for generating strings. The list must contain at least 2 words, otherwise an exception will be thrown.
      Parameters:
      value - the list of words
      Throws:
      IllegalArgumentException - if not at least 2 words are provided
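
The "at least 2 words" rule can be pictured with a small stand-alone check. This is a sketch under the assumption that the list is split on commas and the resulting word count is validated; WordsSketch and parseWords are hypothetical names, and the exception message is made up for illustration.

```java
/** Hypothetical stand-in for the validation described for setWords. */
final class WordsSketch {

    /** Splits a comma-separated list and rejects lists with fewer than 2 words. */
    static String[] parseWords(String value) {
        String[] words = value.split(",");
        if (words.length < 2) {
            throw new IllegalArgumentException(
                "at least 2 words are required, got: '" + value + "'");
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(parseWords("The,quick,brown").length); // prints 3
        try {
            parseWords("single");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```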
    • getWords

      public String getWords()
      returns the words used for assembling strings in a comma-separated list.
      Returns:
      the words as comma-separated list
    • setWordSeparators

      public void setWordSeparators(String value)
      sets the word separators (chars) to use for assembling strings.
      Parameters:
      value - the characters to use as separators
    • getWordSeparators

      public String getWordSeparators()
      returns the word separators (chars) to use for assembling strings.
      Returns:
      the current separators
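
How words and separator characters might combine into a string value can be sketched as follows. This is an illustrative assumption about the assembly step, not Weka's actual code: StringAssemblySketch and assemble are hypothetical, and the use of a seeded java.util.Random is assumed here only to show why a fixed seed yields reproducible strings.

```java
import java.util.Random;

/** Hypothetical sketch of building a string from words and separator chars. */
final class StringAssemblySketch {

    /**
     * Picks numWords random words and joins them with randomly chosen
     * separator characters.
     */
    static String assemble(String[] words, String separators, int numWords, Random rand) {
        if (words.length == 0 || separators.isEmpty() || numWords < 1) {
            throw new IllegalArgumentException("need words, separators and numWords >= 1");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < numWords; i++) {
            if (i > 0) {
                // one separator character between consecutive words
                sb.append(separators.charAt(rand.nextInt(separators.length())));
            }
            sb.append(words[rand.nextInt(words.length)]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] words = {"The", "quick", "brown", "fox"};
        // The same seed produces the same string on every run.
        System.out.println(assemble(words, " -", 4, new Random(1)));
        System.out.println(assemble(words, " -", 4, new Random(1)));
    }
}
```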
    • setNumDate

      public void setNumDate(int value)
      sets the number of date attributes
      Parameters:
      value - the number of date attributes
    • getNumDate

      public int getNumDate()
      returns the current number of date attributes
      Returns:
      the number of date attributes
    • setNumRelational

      public void setNumRelational(int value)
      sets the number of relational attributes
      Parameters:
      value - the number of relational attributes
    • getNumRelational

      public int getNumRelational()
      returns the current number of relational attributes
      Returns:
      the number of relational attributes
    • setNumRelationalNominal

      public void setNumRelationalNominal(int value)
      sets the number of nominal attributes in a relational attribute
      Parameters:
      value - the number of nominal attributes
    • getNumRelationalNominal

      public int getNumRelationalNominal()
      returns the current number of nominal attributes in a relational attribute
      Returns:
      the number of nominal attributes
    • setNumRelationalNominalValues

      public void setNumRelationalNominalValues(int value)
      sets the number of values for nominal attributes in a relational attribute
      Parameters:
      value - the number of values
    • getNumRelationalNominalValues

      public int getNumRelationalNominalValues()
      returns the current number of values for nominal attributes in a relational attribute
      Returns:
      the number of values
    • setNumRelationalNumeric

      public void setNumRelationalNumeric(int value)
      sets the number of numeric attributes in a relational attribute
      Parameters:
      value - the number of numeric attributes
    • getNumRelationalNumeric

      public int getNumRelationalNumeric()
      returns the current number of numeric attributes in a relational attribute
      Returns:
      the number of numeric attributes
    • setNumRelationalString

      public void setNumRelationalString(int value)
      sets the number of string attributes in a relational attribute
      Parameters:
      value - the number of string attributes
    • getNumRelationalString

      public int getNumRelationalString()
      returns the current number of string attributes in a relational attribute
      Returns:
      the number of string attributes
    • setNumRelationalDate

      public void setNumRelationalDate(int value)
      sets the number of date attributes in a relational attribute
      Parameters:
      value - the number of date attributes
    • getNumRelationalDate

      public int getNumRelationalDate()
      returns the current number of date attributes in a relational attribute
      Returns:
      the number of date attributes
    • setNumInstancesRelational

      public void setNumInstancesRelational(int value)
      sets the number of instances in relational/bag attributes to produce
      Parameters:
      value - the number of instances
    • getNumInstancesRelational

      public int getNumInstancesRelational()
      returns the current number of instances in relational/bag attributes to produce
      Returns:
      the number of instances
    • setMultiInstance

      public void setMultiInstance(boolean value)
      sets whether multi-instance data should be generated (with a fixed data structure)
      Parameters:
      value - whether multi-instance data is generated
    • getMultiInstance

      public boolean getMultiInstance()
      Gets whether multi-instance data (with a fixed structure) is generated
      Returns:
      true if multi-instance data is generated
    • setRelationalFormat

      public void setRelationalFormat(int index, Instances value)
      sets the structure for the bags for the relational attribute
      Parameters:
      index - the index of the relational attribute
      value - the new structure
    • getRelationalFormat

      public Instances getRelationalFormat(int index)
      returns the format for the specified relational attribute, can be null
      Parameters:
      index - the index of the relational attribute
      Returns:
      the current structure
    • setRelationalClassFormat

      public void setRelationalClassFormat(Instances value)
      sets the structure for the relational class attribute
      Parameters:
      value - the structure for the relational attribute
    • getRelationalClassFormat

      public Instances getRelationalClassFormat()
      returns the current structure of the relational class attribute, can be null
      Returns:
      the relational structure of the class attribute
    • getNumAttributes

      public int getNumAttributes()
      returns the overall number of attributes (incl. class, if that is also generated)
      Returns:
      the overall number of attributes
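
As a worked illustration of "overall number of attributes", the sketch below assumes the total is simply the sum of the per-type counts plus one for the class attribute when a class is generated. AttributeCountSketch and totalAttributes are hypothetical; the fixed structure used for multi-instance data is not covered.

```java
/** Hypothetical arithmetic for the overall attribute count. */
final class AttributeCountSketch {

    /** Sums the per-type counts and adds one for the class unless noClass is set. */
    static int totalAttributes(int nominal, int numeric, int string,
                               int date, int relational, boolean noClass) {
        int total = nominal + numeric + string + date + relational;
        return noClass ? total : total + 1;
    }

    public static void main(String[] args) {
        // Documented defaults: 1 nominal attribute, 0 of every other type, class present.
        System.out.println(totalAttributes(1, 0, 0, 0, 0, false)); // prints 2
        // Same counts without a class attribute, e.g., for clusterers.
        System.out.println(totalAttributes(1, 0, 0, 0, 0, true));  // prints 1
    }
}
```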
    • getData

      public Instances getData()
      returns the current dataset, can be null
      Returns:
      the current dataset
    • setHandler

      public void setHandler(CapabilitiesHandler value)
      sets the Capabilities handler to generate the data for
      Parameters:
      value - the handler to generate the data for
    • getHandler

      public CapabilitiesHandler getHandler()
      returns the currently set CapabilitiesHandler to generate the dataset for, can be null
      Returns:
      the handler to generate the data for
    • generate

      public Instances generate() throws Exception
      Generates a new dataset
      Returns:
      the generated data
      Throws:
      Exception - if something goes wrong
    • generate

      public Instances generate(String namePrefix) throws Exception
      Generates a new dataset, using the given prefix for the attribute names.
      Parameters:
      namePrefix - the prefix to add to the name of an attribute
      Returns:
      the generated data
      Throws:
      Exception - if something goes wrong
    • forCapabilities

      public static TestInstances forCapabilities(Capabilities c)
      returns a TestInstances instance already set up for the given capabilities.
      Parameters:
      c - the capabilities to base the TestInstances on
      Returns:
      the configured TestInstances object
    • toString

      public String toString()
      returns a string representation of the object
      Overrides:
      toString in class Object
      Returns:
      a string representation of the object
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision
    • main

      public static void main(String[] args) throws Exception
      for running the class from the command line; prints the generated data to stdout
      Parameters:
      args - the commandline parameters
      Throws:
      Exception - if something goes wrong