Package weka.datagenerators
Class DataGenerator
java.lang.Object
weka.datagenerators.DataGenerator
- All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler
- Direct Known Subclasses:
ClassificationGenerator, ClusterGenerator, RegressionGenerator
public abstract class DataGenerator
extends Object
implements OptionHandler, Randomizable, Serializable, RevisionHandler
Abstract superclass for data generators that generate data for classifiers
and clusterers.
- Version:
- $Revision: 15437 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
- debugTipText: Returns the tip text for this property.
- defaultOutput: Gets the writer that is used for outputting to stdout.
- defineDataFormat: Constructs the Instances object representing the format of the generated data.
- enumToVector(Enumeration<Option> enu): Convenience method.
- formatTipText: Returns the tip text for this property.
- abstract Instance generateExample: Generates one example of the dataset.
- abstract Instances generateExamples: Generates all examples of the dataset.
- abstract String generateFinished: Generates a comment string that documents the data generator.
- abstract String generateStart: Generates a comment string that documents the data generator.
- getDatasetFormat: Gets the format of the dataset that is to be generated.
- boolean getDebug(): Gets the debug flag.
- getEpilogue: Gets the epilogue string.
- int getNumExamplesAct(): Gets the number of examples the dataset should have.
- String[] getOptions: Gets the current settings of the data generator.
- getOutput: Gets the print writer.
- getPrologue: Gets the prologue string.
- getRandom: Gets the random generator.
- getRelationName: Gets the relation name the dataset should have.
- int getSeed(): Gets the random number seed.
- abstract boolean getSingleModeFlag: Returns whether single mode is set for the given data generator. The mode depends on the option settings and/or the generator type.
- listOptions: Returns an enumeration describing the available options.
- static void makeData(DataGenerator generator, String[] options): Calls the data generator.
- outputTipText: Returns the tip text for this property.
- randomTipText: Returns the tip text for this property.
- relationNameTipText: Returns the tip text for this property.
- static void runDataGenerator(DataGenerator datagenerator, String[] options): Runs the data generator instance with the given options.
- seedTipText: Returns the tip text for this property.
- void setDatasetFormat(Instances newFormat): Sets the format of the dataset that is to be generated.
- void setDebug(boolean debug): Sets the debug flag.
- void setOptions(String[] options): Parses a list of options for this object.
- void setOutput(PrintWriter newOutput): Sets the print writer.
- void setRandom: Sets the random generator.
- void setRelationName(String relationName): Sets the relation name the dataset should have.
- void setSeed(int newSeed): Sets the random number seed.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface weka.core.RevisionHandler
getRevision
-
Constructor Details
-
DataGenerator
public DataGenerator()
Initializes with default settings.
Note: default values are set via default<name> methods. These default methods are also used by the listOptions and setOptions methods. This lets derived generators override the return values of the default methods and thereby avoid exceptions.
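The default<name> pattern described in the note can be sketched roughly as follows. This is a minimal illustration, not the Weka implementation: BaseGenerator, CustomGenerator, the -S flag, and the default values 1 and 42 are hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generator base class (not a Weka class).
abstract class BaseGenerator {
    private int seed;

    BaseGenerator() {
        // The constructor asks the overridable hook for the default value.
        seed = defaultSeed();
    }

    // Hook that subclasses may override to supply their own default.
    protected int defaultSeed() {
        return 1;
    }

    // The option description reports the same default the constructor used.
    List<String> listOptions() {
        List<String> help = new ArrayList<>();
        help.add("-S <seed> (default " + defaultSeed() + ")");
        return help;
    }

    // A missing -S option falls back to the default instead of failing.
    void setOptions(String[] options) {
        seed = defaultSeed();
        for (int i = 0; i + 1 < options.length; i++) {
            if (options[i].equals("-S")) {
                seed = Integer.parseInt(options[i + 1]);
            }
        }
    }

    int getSeed() {
        return seed;
    }
}

// Hypothetical subclass that only changes the default.
class CustomGenerator extends BaseGenerator {
    @Override
    protected int defaultSeed() {
        return 42;
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        BaseGenerator g = new CustomGenerator();
        System.out.println(g.getSeed());            // prints 42
        System.out.println(g.listOptions().get(0)); // prints -S <seed> (default 42)
        g.setOptions(new String[] {"-S", "7"});
        System.out.println(g.getSeed());            // prints 7
    }
}
```

Because the constructor, the option listing, and the option parser all read the same hook, a subclass changes the default in one place only.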
-
-
Method Details
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions
in interface OptionHandler
- Returns:
- an enumeration of all the available options
-
enumToVector
Convenience method. Turns the given enumeration of options into a vector.
-
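A conversion of this kind might look like the sketch below. It uses a generic element type so that it compiles without Weka on the classpath. The class name EnumToVectorSketch is hypothetical, and the body is an assumption about what such a helper does, not a copy of the library code.

```java
import java.util.Enumeration;
import java.util.Vector;

class EnumToVectorSketch {
    // Generic stand-in for the conversion; the documented method takes Enumeration<Option>.
    static <T> Vector<T> enumToVector(Enumeration<T> enu) {
        Vector<T> result = new Vector<>();
        while (enu.hasMoreElements()) {
            result.add(enu.nextElement());
        }
        return result;
    }

    public static void main(String[] args) {
        Vector<String> source = new Vector<>();
        source.add("-S");
        source.add("-r");
        // An Enumeration can only be walked once; the Vector copy can be extended.
        Vector<String> copy = enumToVector(source.elements());
        copy.add("-d");
        System.out.println(copy); // prints [-S, -r, -d]
    }
}
```

One plausible use is in a subclass that wants to append its own entries to the options enumerated by its superclass.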
setOptions
Parses a list of options for this object. For a list of valid options, see the class description.
- Specified by:
setOptions
in interface OptionHandler
- Parameters:
options
- the list of options as an array of strings
- Throws:
Exception
- if an option is not supported
-
getOptions
Gets the current settings of the data generator. Blacklisted options must be removed in the derived class that defines the blacklist entry.
- Specified by:
getOptions
in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions
- See Also:
-
removeBlacklist(String[])
-
defineDataFormat
Constructs the Instances object representing the format of the generated data. This default implementation simply returns the Instances object holding the dataset format currently stored in m_DatasetFormat.
- Returns:
- the format for the dataset
- Throws:
Exception
- if generating the format failed
- See Also:
-
defaultRelationName()
-
generateExample
Generates one example of the dataset.
-
generateExamples
Generates all examples of the dataset.
-
generateStart
Generates a comment string that documents the data generator. By default, this string is added at the beginning of the produced ARFF output, directly after the options.
- Returns:
- a string containing information about the generated rules
- Throws:
Exception
- if the generating of the documentation fails
-
generateFinished
Generates a comment string that documents the data generator. By default, this string is added at the end of the produced ARFF output.
- Returns:
- a string containing information about the generated rules
- Throws:
Exception
- if the generating of the documentation fails
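The placement of the two comment strings around the ARFF data can be pictured with the sketch below. It rests on assumptions: ArffCommentSketch, toComment, and assemble are hypothetical helpers, and whether the generator returns text that already carries the ARFF comment marker is not stated here. The only outside fact used is that ARFF comment lines begin with '%'.

```java
class ArffCommentSketch {
    // Prefixes every line of the text with the ARFF comment marker '%'.
    static String toComment(String text) {
        StringBuilder sb = new StringBuilder();
        for (String line : text.split("\n")) {
            sb.append("% ").append(line).append('\n');
        }
        return sb.toString();
    }

    // Places the start comment before the ARFF body and the finish comment after it.
    static String assemble(String start, String arffBody, String finished) {
        return toComment(start) + arffBody + toComment(finished);
    }

    public static void main(String[] args) {
        String body = "@relation demo\n@attribute x numeric\n@data\n1.0\n";
        System.out.print(assemble("info about generated rules", body, "generation finished"));
    }
}
```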
-
getSingleModeFlag
Returns whether single mode is set for the given data generator. The mode depends on the option settings and/or the generator type.
- Returns:
- single mode flag
- Throws:
Exception
- if mode is not set yet
-
setDebug
public void setDebug(boolean debug)
Sets the debug flag.
- Parameters:
debug
- the new debug flag
-
getDebug
public boolean getDebug()
Gets the debug flag.
- Returns:
- the debug flag
-
debugTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
setRelationName
Sets the relation name the dataset should have.
- Parameters:
relationName
- the new relation name
-
getRelationName
Gets the relation name the dataset should have.
- Returns:
- the relation name the dataset should have
-
relationNameTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
getNumExamplesAct
public int getNumExamplesAct()
Gets the number of examples the dataset should have.
- Returns:
- the number of examples the dataset should have
-
setOutput
Sets the print writer.
- Parameters:
newOutput
- the new print writer
-
getOutput
Gets the print writer.
- Returns:
- print writer object
-
defaultOutput
Gets the writer that is used for outputting to stdout. This is a workaround for the problem of stdout being closed when the associated PrintWriter is closed.
- Returns:
- writer object
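The problem mentioned above is that closing a PrintWriter also closes the stream it wraps, which for System.out would silence all later output. One common way around it is sketched below. This is an assumption about a workable technique, not necessarily how this class implements it, and NonClosingOutputSketch and nonClosingWriter are hypothetical names.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

class NonClosingOutputSketch {
    // Wraps a stream so that close() only flushes and leaves the target open.
    static PrintWriter nonClosingWriter(OutputStream target) {
        OutputStream shield = new FilterOutputStream(target) {
            @Override
            public void close() throws IOException {
                flush();
            }
        };
        return new PrintWriter(shield, true);
    }

    public static void main(String[] args) {
        PrintWriter out = nonClosingWriter(System.out);
        out.println("written through the wrapper");
        out.close(); // flushes, but System.out stays open
        System.out.println("stdout is still usable");
    }
}
```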
-
outputTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
setDatasetFormat
Sets the format of the dataset that is to be generated.
- Parameters:
newFormat
- the new format of the dataset
-
getDatasetFormat
Gets the format of the dataset that is to be generated.
- Returns:
- the format of the dataset
-
formatTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
getSeed
public int getSeed()
Gets the random number seed.
- Specified by:
getSeed
in interface Randomizable
- Returns:
- the random number seed.
-
setSeed
public void setSeed(int newSeed)
Sets the random number seed.
- Specified by:
setSeed
in interface Randomizable
- Parameters:
newSeed
- the new random number seed.
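The practical point of a seed is reproducibility: a java.util.Random initialised with the same seed yields the same sequence. The sketch below shows only that property. SeedSketch and draw are hypothetical names, and it is an assumption that a generator's seed is used to initialise a java.util.Random in this way.

```java
import java.util.Arrays;
import java.util.Random;

class SeedSketch {
    // Draws n values from a generator initialised with the given seed.
    static double[] draw(int seed, int n) {
        Random random = new Random(seed);
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        // Same seed, same sequence.
        System.out.println(Arrays.equals(draw(1, 5), draw(1, 5))); // prints true
    }
}
```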
-
seedTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
getRandom
Gets the random generator.
- Returns:
- the random generator
-
setRandom
Sets the random generator.
- Parameters:
newRandom
- the new random generator
-
randomTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the Explorer/Experimenter GUI
-
getPrologue
Gets the prologue string.
- Returns:
- prologue
- Throws:
Exception
-
getEpilogue
Gets the epilogue string.
- Returns:
- epilogue
- Throws:
Exception
-
makeData
Calls the data generator.
- Parameters:
generator
- one of the data generators
options
- options of the data generator
- Throws:
Exception
- if there was an error in the option list
-
runDataGenerator
Runs the data generator instance with the given options.
- Parameters:
datagenerator
- the data generator to run
options
- the command-line options
-