Package weka.core
Class TestInstances
java.lang.Object
weka.core.TestInstances
- All Implemented Interfaces:
- Serializable, Cloneable, OptionHandler, RevisionHandler
public class TestInstances
extends Object
implements Cloneable, Serializable, OptionHandler, RevisionHandler
Generates artificial datasets for testing. In the case of multi-instance data, the
settings for the number of attributes apply to the data inside the bag.
Originally based on code from the CheckClassifier.
Valid options are:
-relation <name> The name of the data set.
-seed <num> The seed value.
-num-instances <num> The number of instances in the datasets (default 20).
-class-type <num> The class type, see constants in weka.core.Attribute (default 1=nominal).
-class-values <num> The number of classes to generate (for nominal classes only) (default 2).
-class-index <num> The class index, with -1=last, (default -1).
-no-class Doesn't include a class attribute in the output.
-nominal <num> The number of nominal attributes (default 1).
-nominal-values <num> The number of values for nominal attributes (default 2).
-numeric <num> The number of numeric attributes (default 0).
-string <num> The number of string attributes (default 0).
-words <comma-separated-list> The words to use in string attributes.
-word-separators <chars> The word separators to use in string attributes.
-date <num> The number of date attributes (default 0).
-relational <num> The number of relational attributes (default 0).
-relational-nominal <num> The number of nominal attributes in a rel. attribute (default 1).
-relational-nominal-values <num> The number of values for nominal attributes in a rel. attribute (default 2).
-relational-numeric <num> The number of numeric attributes in a rel. attribute (default 0).
-relational-string <num> The number of string attributes in a rel. attribute (default 0).
-relational-date <num> The number of date attributes in a rel. attribute (default 0).
-num-instances-relational <num> The number of instances in relational/bag attributes (default 10).
-multi-instance Generates multi-instance data.
-W <classname> The Capabilities handler to base the dataset on. The other parameters can be used to override the ones determined from the handler. Additional parameters for handler can be passed on after the '--'.
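The same settings can also be applied through the setters documented further down. The following is a minimal sketch, assuming the Weka jar is on the classpath; the class name `TestInstancesDemo` and the chosen counts are illustrative and not part of the API.

```java
import weka.core.Attribute;
import weka.core.Instances;
import weka.core.TestInstances;

public class TestInstancesDemo {

  /** Builds a small dataset: 2 nominal and 3 numeric attributes plus a nominal class. */
  public static Instances build() throws Exception {
    TestInstances generator = new TestInstances();
    generator.setRelation("demo");               // corresponds to -relation
    generator.setNumInstances(5);                // corresponds to -num-instances
    generator.setNumNominal(2);                  // corresponds to -nominal
    generator.setNumNumeric(3);                  // corresponds to -numeric
    generator.setClassType(Attribute.NOMINAL);   // corresponds to -class-type
    generator.setNumClasses(3);                  // corresponds to -class-values
    return generator.generate();
  }

  public static void main(String[] args) throws Exception {
    // Instances.toString() renders the dataset in ARFF format
    System.out.println(build());
  }
}
```

Because the class index is left at its default (last), the class attribute should end up as the final attribute of the generated dataset.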
- Version:
- $Revision: 14293 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
-
Field Summary
Modifier and Type | Field | Description
static final int | CLASS_IS_LAST | can be used for setting the class attribute index to last
static final String | DEFAULT_SEPARATORS | the default word separators used in strings
static final String[] | DEFAULT_WORDS | the default list of words used in strings
static final int | NO_CLASS | can be used to avoid generating a class attribute
-
Constructor Summary
Constructor | Description
TestInstances() | the default constructor
-
Method Summary
Modifier and Type | Method | Description
void | assign(TestInstances t) | updates itself with all the settings from the given TestInstances object
Object | clone() | creates a clone of the current object
static TestInstances | forCapabilities(Capabilities c) | returns a TestInstances instance already set up for the given capabilities
Instances | generate() | generates a new dataset
Instances | generate(String namePrefix) | generates a new dataset
int | getClassIndex() | returns the current class index (0-based), -1 is last attribute
int | getClassType() | returns the current class type
Instances | getData() | returns the current dataset, can be null
CapabilitiesHandler | getHandler() | returns the currently set CapabilitiesHandler to generate the dataset for, can be null
boolean | getMultiInstance() | gets whether multi-instance data (with a fixed structure) is generated
boolean | getNoClass() | whether no class attribute is generated
int | getNumAttributes() | returns the overall number of attributes (incl. class, if that is also generated)
int | getNumClasses() | returns the current number of classes
int | getNumDate() | returns the current number of date attributes
int | getNumInstances() | returns the current number of instances to produce
int | getNumInstancesRelational() | returns the current number of instances in relational/bag attributes to produce
int | getNumNominal() | returns the current number of nominal attributes
int | getNumNominalValues() | returns the current number of values for nominal attributes
int | getNumNumeric() | returns the current number of numeric attributes
int | getNumRelational() | returns the current number of relational attributes
int | getNumRelationalDate() | returns the current number of date attributes in a relational attribute
int | getNumRelationalNominal() | returns the current number of nominal attributes in a relational attribute
int | getNumRelationalNominalValues() | returns the current number of values for nominal attributes in a relational attribute
int | getNumRelationalNumeric() | returns the current number of numeric attributes in a relational attribute
int | getNumRelationalString() | returns the current number of string attributes in a relational attribute
int | getNumString() | returns the current number of string attributes
String[] | getOptions() | gets the current settings of this object
String | getRelation() | returns the current name of the relation
Instances | getRelationalClassFormat() | returns the current structure of the relational class attribute, can be null
Instances | getRelationalFormat(int index) | returns the format for the specified relational attribute, can be null
String | getRevision() | returns the revision string
int | getSeed() | returns the current seed value
String | getWords() | returns the words used for assembling strings in a comma-separated list
String | getWordSeparators() | returns the word separators (chars) to use for assembling strings
Enumeration<Option> | listOptions() | returns an enumeration describing the available options
static void | main(String[] args) | for running the class from the command line, prints the generated data to stdout
void | setClassIndex(int value) | sets the class index (0-based)
void | setClassType(int value) | sets the class attribute type
void | setHandler(CapabilitiesHandler value) | sets the Capabilities handler to generate the data for
void | setMultiInstance(boolean value) | sets whether multi-instance data should be generated (with a fixed data structure)
void | setNoClass(boolean value) | whether to have no class, e.g., for clusterers; otherwise the class attribute index is set to last
void | setNumClasses(int value) | sets the number of classes
void | setNumDate(int value) | sets the number of date attributes
void | setNumInstances(int value) | sets the number of instances to produce
void | setNumInstancesRelational(int value) | sets the number of instances in relational/bag attributes to produce
void | setNumNominal(int value) | sets the number of nominal attributes
void | setNumNominalValues(int value) | sets the number of values for nominal attributes
void | setNumNumeric(int value) | sets the number of numeric attributes
void | setNumRelational(int value) | sets the number of relational attributes
void | setNumRelationalDate(int value) | sets the number of date attributes in a relational attribute
void | setNumRelationalNominal(int value) | sets the number of nominal attributes in a relational attribute
void | setNumRelationalNominalValues(int value) | sets the number of values for nominal attributes in a relational attribute
void | setNumRelationalNumeric(int value) | sets the number of numeric attributes in a relational attribute
void | setNumRelationalString(int value) | sets the number of string attributes in a relational attribute
void | setNumString(int value) | sets the number of string attributes
void | setOptions(String[] options) | parses a given list of options
void | setRelation(String value) | sets the name of the relation
void | setRelationalClassFormat(Instances value) | sets the structure for the relational class attribute
void | setRelationalFormat(int index, Instances value) | sets the structure for the bags for the relational attribute
void | setSeed(int value) | sets the seed value for the random number generator
void | setWords(String value) | sets the comma-separated list of words to use for generating strings
void | setWordSeparators(String value) | sets the word separators (chars) to use for assembling strings
String | toString() | returns a string representation of the object
-
Field Details
-
CLASS_IS_LAST
public static final int CLASS_IS_LAST
can be used for setting the class attribute index to last
-
NO_CLASS
public static final int NO_CLASS
can be used to avoid generating a class attribute
-
DEFAULT_WORDS
public static final String[] DEFAULT_WORDS
the default list of words used in strings
-
DEFAULT_SEPARATORS
public static final String DEFAULT_SEPARATORS
the default word separators used in strings
-
-
Constructor Details
-
TestInstances
public TestInstances()
the default constructor
-
-
Method Details
-
clone
public Object clone()
creates a clone of the current object
- Returns:
- a clone of the current object
-
assign
public void assign(TestInstances t)
updates itself with all the settings from the given TestInstances object
- Parameters:
- t - the object to get the settings from
-
listOptions
public Enumeration<Option> listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options. The valid options are the same as those listed in the class description above.
- Specified by:
- setOptions in interface OptionHandler
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
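As an illustration of option parsing, the sketch below splits a command-line style string with `weka.core.Utils.splitOptions` and hands the result to `setOptions`. It assumes the Weka jar is on the classpath; the class name `OptionsDemo` and the option values are made up for the example.

```java
import weka.core.TestInstances;
import weka.core.Utils;

public class OptionsDemo {

  /** Configures a generator from a single command-line style option string. */
  public static TestInstances configure(String cmdline) throws Exception {
    TestInstances generator = new TestInstances();
    // splitOptions turns the string into the String[] that setOptions expects
    generator.setOptions(Utils.splitOptions(cmdline));
    return generator;
  }

  public static void main(String[] args) throws Exception {
    TestInstances generator =
        configure("-relation demo -seed 42 -num-instances 8 -nominal 2 -numeric 1");
    System.out.println(generator.generate());
  }
}
```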
-
getOptions
public String[] getOptions()
Gets the current settings of this object.
- Specified by:
- getOptions in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions
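Since the returned array is described as suitable for passing to setOptions, one plausible use is to transfer a configuration from one generator to another. The sketch below assumes the Weka jar is on the classpath; `OptionsRoundTrip` and `copyViaOptions` are hypothetical names introduced for the example.

```java
import weka.core.TestInstances;
import weka.core.Utils;

public class OptionsRoundTrip {

  /** Creates a new generator configured from the option array of another one. */
  public static TestInstances copyViaOptions(TestInstances source) throws Exception {
    TestInstances copy = new TestInstances();
    copy.setOptions(source.getOptions());
    return copy;
  }

  public static void main(String[] args) throws Exception {
    TestInstances source = new TestInstances();
    source.setNumInstances(7);
    source.setSeed(3);
    // joinOptions renders the option array as a single command-line string
    System.out.println(Utils.joinOptions(source.getOptions()));
    System.out.println(copyViaOptions(source).getNumInstances());
  }
}
```

For copying within the same JVM, `clone()` or `assign(TestInstances)` are the more direct route; the option array is mainly useful when the settings have to be stored or passed on as text.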
-
setRelation
public void setRelation(String value)
sets the name of the relation
- Parameters:
- value - the name of the relation
-
getRelation
public String getRelation()
returns the current name of the relation
- Returns:
- the name of the relation
-
setSeed
public void setSeed(int value)
sets the seed value for the random number generator
- Parameters:
- value - the seed
-
getSeed
public int getSeed()
returns the current seed value
- Returns:
- the seed value
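The seed is what makes generated test data reproducible. The sketch below, which assumes the Weka jar is on the classpath and uses the hypothetical class name `SeedDemo`, generates data from two separately constructed generators with the same seed and compares their ARFF output.

```java
import weka.core.TestInstances;

public class SeedDemo {

  /** Generates a dataset with the given seed and returns it as ARFF text. */
  public static String generateWithSeed(int seed) throws Exception {
    TestInstances generator = new TestInstances();
    generator.setSeed(seed);
    generator.setNumNumeric(2);
    return generator.generate().toString();
  }

  public static void main(String[] args) throws Exception {
    // two fresh generators with identical settings and seed
    String first = generateWithSeed(7);
    String second = generateWithSeed(7);
    System.out.println(first.equals(second));
  }
}
```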
-
setNumInstances
public void setNumInstances(int value)
sets the number of instances to produce
- Parameters:
- value - the number of instances
-
getNumInstances
public int getNumInstances()
returns the current number of instances to produce
- Returns:
- the number of instances
-
setClassType
public void setClassType(int value)
sets the class attribute type
- Parameters:
- value - the class attribute type
-
getClassType
public int getClassType()
returns the current class type
- Returns:
- the class attribute type
-
setNumClasses
public void setNumClasses(int value)
sets the number of classes
- Parameters:
- value - the number of classes
-
getNumClasses
public int getNumClasses()
returns the current number of classes
- Returns:
- the number of classes
-
setClassIndex
public void setClassIndex(int value)
sets the class index (0-based)
- Parameters:
- value - the class index
-
getClassIndex
public int getClassIndex()
returns the current class index (0-based), -1 is last attribute
- Returns:
- the class index
-
setNoClass
public void setNoClass(boolean value)
whether to have no class, e.g., for clusterers; otherwise the class attribute index is set to last
- Parameters:
- value - whether to have no class
-
getNoClass
public boolean getNoClass()
whether no class attribute is generated
- Returns:
- true if no class attribute is generated
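For schemes that do not use a class attribute, such as clusterers, the class can be switched off. A minimal sketch, assuming the Weka jar is on the classpath; `NoClassDemo` is a hypothetical class name.

```java
import weka.core.Instances;
import weka.core.TestInstances;

public class NoClassDemo {

  /** Returns a generator configured to produce data without a class attribute. */
  public static TestInstances configure() {
    TestInstances generator = new TestInstances();
    generator.setNumNumeric(2);   // in addition to the default single nominal attribute
    generator.setNoClass(true);   // no class attribute will be generated
    return generator;
  }

  public static void main(String[] args) throws Exception {
    TestInstances generator = configure();
    Instances data = generator.generate();
    // the class does not count towards the overall number of attributes
    System.out.println(generator.getNumAttributes());
    // a negative class index means that no class attribute is set on the dataset
    System.out.println(data.classIndex() < 0);
  }
}
```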
-
setNumNominal
public void setNumNominal(int value)
sets the number of nominal attributes
- Parameters:
- value - the number of nominal attributes
-
getNumNominal
public int getNumNominal()
returns the current number of nominal attributes
- Returns:
- the number of nominal attributes
-
setNumNominalValues
public void setNumNominalValues(int value)
sets the number of values for nominal attributes
- Parameters:
- value - the number of values
-
getNumNominalValues
public int getNumNominalValues()
returns the current number of values for nominal attributes
- Returns:
- the number of values
-
setNumNumeric
public void setNumNumeric(int value)
sets the number of numeric attributes
- Parameters:
- value - the number of numeric attributes
-
getNumNumeric
public int getNumNumeric()
returns the current number of numeric attributes
- Returns:
- the number of numeric attributes
-
setNumString
public void setNumString(int value)
sets the number of string attributes
- Parameters:
- value - the number of string attributes
-
getNumString
public int getNumString()
returns the current number of string attributes
- Returns:
- the number of string attributes
-
setWords
public void setWords(String value)
Sets the comma-separated list of words to use for generating strings. The list must contain at least 2 words, otherwise an exception will be thrown.
- Parameters:
- value - the list of words
- Throws:
- IllegalArgumentException - if not at least 2 words are provided
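The sketch below shows a custom word list used together with string attributes, and the documented rejection of a list with fewer than two words. It assumes the Weka jar is on the classpath; `WordsDemo` and its helper methods are hypothetical names, and the words are arbitrary.

```java
import weka.core.Instances;
import weka.core.TestInstances;

public class WordsDemo {

  /** Generates data with two string attributes assembled from a custom word list. */
  public static Instances build() throws Exception {
    TestInstances generator = new TestInstances();
    generator.setNumString(2);
    generator.setWords("alpha,beta,gamma");   // at least 2 comma-separated words
    generator.setWordSeparators(" ");
    return generator.generate();
  }

  /** Counts the string attributes in a dataset. */
  public static int countStringAttributes(Instances data) {
    int count = 0;
    for (int i = 0; i < data.numAttributes(); i++) {
      if (data.attribute(i).isString()) {
        count++;
      }
    }
    return count;
  }

  /** Returns true if a single-word list is rejected with an IllegalArgumentException. */
  public static boolean rejectsSingleWord() {
    try {
      new TestInstances().setWords("lonely");
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(countStringAttributes(build()));
    System.out.println(rejectsSingleWord());
  }
}
```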
-
getWords
public String getWords()
returns the words used for assembling strings in a comma-separated list.
- Returns:
- the words as comma-separated list
-
setWordSeparators
public void setWordSeparators(String value)
sets the word separators (chars) to use for assembling strings.
- Parameters:
- value - the characters to use as separators
-
getWordSeparators
public String getWordSeparators()
returns the word separators (chars) to use for assembling strings.
- Returns:
- the current separators
-
setNumDate
public void setNumDate(int value)
sets the number of date attributes
- Parameters:
- value - the number of date attributes
-
getNumDate
public int getNumDate()
returns the current number of date attributes
- Returns:
- the number of date attributes
-
setNumRelational
public void setNumRelational(int value)
sets the number of relational attributes
- Parameters:
- value - the number of relational attributes
-
getNumRelational
public int getNumRelational()
returns the current number of relational attributes
- Returns:
- the number of relational attributes
-
setNumRelationalNominal
public void setNumRelationalNominal(int value)
sets the number of nominal attributes in a relational attribute
- Parameters:
- value - the number of nominal attributes
-
getNumRelationalNominal
public int getNumRelationalNominal()
returns the current number of nominal attributes in a relational attribute
- Returns:
- the number of nominal attributes
-
setNumRelationalNominalValues
public void setNumRelationalNominalValues(int value)
sets the number of values for nominal attributes in a relational attribute
- Parameters:
- value - the number of values
-
getNumRelationalNominalValues
public int getNumRelationalNominalValues()
returns the current number of values for nominal attributes in a relational attribute
- Returns:
- the number of values
-
setNumRelationalNumeric
public void setNumRelationalNumeric(int value)
sets the number of numeric attributes in a relational attribute
- Parameters:
- value - the number of numeric attributes
-
getNumRelationalNumeric
public int getNumRelationalNumeric()
returns the current number of numeric attributes in a relational attribute
- Returns:
- the number of numeric attributes
-
setNumRelationalString
public void setNumRelationalString(int value)
sets the number of string attributes in a relational attribute
- Parameters:
- value - the number of string attributes
-
getNumRelationalString
public int getNumRelationalString()
returns the current number of string attributes in a relational attribute
- Returns:
- the number of string attributes
-
setNumRelationalDate
public void setNumRelationalDate(int value)
sets the number of date attributes in a relational attribute
- Parameters:
- value - the number of date attributes
-
getNumRelationalDate
public int getNumRelationalDate()
returns the current number of date attributes in a relational attribute
- Returns:
- the number of date attributes
-
setNumInstancesRelational
public void setNumInstancesRelational(int value)
sets the number of instances in relational/bag attributes to produce
- Parameters:
- value - the number of instances
-
getNumInstancesRelational
public int getNumInstancesRelational()
returns the current number of instances in relational/bag attributes to produce
- Returns:
- the number of instances
-
setMultiInstance
public void setMultiInstance(boolean value)
sets whether multi-instance data should be generated (with a fixed data structure)
- Parameters:
- value - whether multi-instance data is generated
-
getMultiInstance
public boolean getMultiInstance()
Gets whether multi-instance data (with a fixed structure) is generated
- Returns:
- true if multi-instance data is generated
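A minimal sketch of generating multi-instance data, assuming the Weka jar is on the classpath. `MultiInstanceDemo` and `hasRelationalAttribute` are hypothetical names; the attribute counts set here apply to the data inside each bag, as stated in the class description.

```java
import weka.core.Instances;
import weka.core.TestInstances;

public class MultiInstanceDemo {

  /** Generates 4 bags; the numeric attribute setting applies to the bag contents. */
  public static Instances build() throws Exception {
    TestInstances generator = new TestInstances();
    generator.setMultiInstance(true);
    generator.setNumInstances(4);             // number of bags
    generator.setNumInstancesRelational(3);   // instances in relational/bag attributes
    generator.setNumNumeric(2);               // attributes inside the bag
    return generator.generate();
  }

  /** Returns true if the dataset contains at least one relation-valued attribute. */
  public static boolean hasRelationalAttribute(Instances data) {
    for (int i = 0; i < data.numAttributes(); i++) {
      if (data.attribute(i).isRelationValued()) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) throws Exception {
    Instances data = build();
    System.out.println(data.numInstances());
    System.out.println(hasRelationalAttribute(data));
  }
}
```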
-
setRelationalFormat
public void setRelationalFormat(int index, Instances value)
sets the structure for the bags for the relational attribute
- Parameters:
- index - the index of the relational attribute
- value - the new structure
-
getRelationalFormat
public Instances getRelationalFormat(int index)
returns the format for the specified relational attribute, can be null
- Parameters:
- index - the index of the relational attribute
- Returns:
- the current structure
-
setRelationalClassFormat
public void setRelationalClassFormat(Instances value)
sets the structure for the relational class attribute
- Parameters:
- value - the structure for the relational attribute
-
getRelationalClassFormat
public Instances getRelationalClassFormat()
returns the current structure of the relational class attribute, can be null
- Returns:
- the relational structure of the class attribute
-
getNumAttributes
public int getNumAttributes()
returns the overall number of attributes (incl. class, if that is also generated)
- Returns:
- the overall number of attributes
-
getData
public Instances getData()
returns the current dataset, can be null
- Returns:
- the current dataset
-
setHandler
public void setHandler(CapabilitiesHandler value)
sets the Capabilities handler to generate the data for
- Parameters:
- value - the handler to generate the data for
-
getHandler
public CapabilitiesHandler getHandler()
returns the currently set CapabilitiesHandler to generate the dataset for, can be null
- Returns:
- the handler to generate the data for
-
generate
public Instances generate() throws Exception
Generates a new dataset
- Returns:
- the generated data
- Throws:
- Exception - if something goes wrong
-
generate
public Instances generate(String namePrefix) throws Exception
generates a new dataset.
- Parameters:
- namePrefix - the prefix to add to the name of an attribute
- Returns:
- the generated data
- Throws:
- Exception - if something goes wrong
-
forCapabilities
public static TestInstances forCapabilities(Capabilities c)
returns a TestInstances instance already set up for the given capabilities.
- Parameters:
- c - the capabilities to base the TestInstances on
- Returns:
- the configured TestInstances object
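A typical use is to derive test data from what a learning scheme declares it can handle. The sketch below assumes the Weka jar is on the classpath and uses `weka.classifiers.rules.ZeroR` purely as a convenient example scheme; `ForCapabilitiesDemo` is a hypothetical class name.

```java
import weka.classifiers.rules.ZeroR;
import weka.core.Capabilities;
import weka.core.Instances;
import weka.core.TestInstances;

public class ForCapabilitiesDemo {

  /** Generates test data matching the capabilities declared by ZeroR. */
  public static Instances build() throws Exception {
    Capabilities capabilities = new ZeroR().getCapabilities();
    TestInstances generator = TestInstances.forCapabilities(capabilities);
    // settings derived from the capabilities can still be overridden
    generator.setNumInstances(10);
    return generator.generate();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(build());
  }
}
```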
-
toString
public String toString()
returns a string representation of the object
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
public static void main(String[] args) throws Exception
for running the class from the command line, prints the generated data to stdout
- Parameters:
- args - the command-line parameters
- Throws:
- Exception - if something goes wrong
-