Class Agrawal
java.lang.Object
weka.datagenerators.DataGenerator
weka.datagenerators.ClassificationGenerator
weka.datagenerators.classifiers.classification.Agrawal
- All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
Generates a people database and is based on the
paper by Agrawal et al.:
R. Agrawal, T. Imielinski, A. Swami (1993). Database Mining: A Performance Perspective. IEEE Transactions on Knowledge and Data Engineering. 5(6):914-925. URL http://www.almaden.ibm.com/software/quest/Publications/ByDate.html. BibTeX:
@article{Agrawal1993,
  author = {R. Agrawal and T. Imielinski and A. Swami},
  journal = {IEEE Transactions on Knowledge and Data Engineering},
  note = {Special issue on Learning and Discovery in Knowledge-Based Databases},
  number = {6},
  pages = {914-925},
  title = {Database Mining: A Performance Perspective},
  volume = {5},
  year = {1993},
  URL = {http://www.almaden.ibm.com/software/quest/Publications/ByDate.html},
  PDF = {http://www.almaden.ibm.com/software/quest/Publications/papers/tkde93.pdf}
}

Valid options are:
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S The seed for the random function (default 1)
-n <num> The number of examples to generate (default 100)
-F <num> The function to use for generating the data. (default 1)
-B Whether to balance the class.
-P <num> The perturbation factor. (default 0.05)
- Version:
- $Revision: 10203 $
- Author:
- Richard Kirkby (rkirkby at cs dot waikato dot ac dot nz), FracPete (fracpete at waikato dot ac dot nz)
-
Field Summary
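Modifier and Type | Field | Description
static final int | FUNCTION_1 | function 1
static final int | FUNCTION_10 | function 10
static final int | FUNCTION_2 | function 2
static final int | FUNCTION_3 | function 3
static final int | FUNCTION_4 | function 4
static final int | FUNCTION_5 | function 5
static final int | FUNCTION_6 | function 6
static final int | FUNCTION_7 | function 7
static final int | FUNCTION_8 | function 8
static final int | FUNCTION_9 | function 9
static final Tag[] | FUNCTION_TAGS | the function tags
-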
Constructor Summary
Constructor | Description
Agrawal() | Initializes the generator with default values.
-
Method Summary
Modifier and Type | Method | Description
String | balanceClassTipText() | Returns the tip text for this property.
Instances | defineDataFormat() | Initializes the format for the dataset produced.
String | functionTipText() | Returns the tip text for this property.
Instance | generateExample() | Generates one example of the dataset.
Instances | generateExamples() | Generates all examples of the dataset.
String | generateFinished() | Generates a comment string that documents the data generator.
String | generateStart() | Generates a comment string that documents the data generator.
boolean | getBalanceClass() | Gets whether the class is balanced.
SelectedTag | getFunction() | Gets the function for generating the data.
String[] | getOptions() | Gets the current settings of the data generator.
double | getPerturbationFraction() | Gets the perturbation fraction.
String | getRevision() | Returns the revision string.
boolean | getSingleModeFlag() | Returns whether single mode is set for the given data generator; the mode depends on the option settings and/or the generator type.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., the paper or book this class is based on.
String | globalInfo() | Returns a string describing this data generator.
Enumeration<Option> | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] args) | Main method for executing this class.
String | perturbationFractionTipText() | Returns the tip text for this property.
void | setBalanceClass(boolean value) | Sets whether the class is balanced.
void | setFunction(SelectedTag value) | Sets the function for generating the data.
void | setOptions(String[] options) | Parses a list of options for this object.
void | setPerturbationFraction(double value) | Sets the perturbation fraction.

Methods inherited from class weka.datagenerators.ClassificationGenerator
getNumExamples, numExamplesTipText, setNumExamples
Methods inherited from class weka.datagenerators.DataGenerator
debugTipText, defaultOutput, enumToVector, formatTipText, getDatasetFormat, getDebug, getEpilogue, getNumExamplesAct, getOutput, getPrologue, getRandom, getRelationName, getSeed, makeData, outputTipText, randomTipText, relationNameTipText, runDataGenerator, seedTipText, setDatasetFormat, setDebug, setOutput, setRandom, setRelationName, setSeed
-
Field Details
-
FUNCTION_1
public static final int FUNCTION_1
function 1
-
FUNCTION_2
public static final int FUNCTION_2
function 2
-
FUNCTION_3
public static final int FUNCTION_3
function 3
-
FUNCTION_4
public static final int FUNCTION_4
function 4
-
FUNCTION_5
public static final int FUNCTION_5
function 5
-
FUNCTION_6
public static final int FUNCTION_6
function 6
-
FUNCTION_7
public static final int FUNCTION_7
function 7
-
FUNCTION_8
public static final int FUNCTION_8
function 8
-
FUNCTION_9
public static final int FUNCTION_9
function 9
-
FUNCTION_10
public static final int FUNCTION_10
function 10
-
FUNCTION_TAGS
the function tags
-
-
Constructor Details
-
Agrawal
public Agrawal()
Initializes the generator with default values.
-
-
Method Details
-
globalInfo
Returns a string describing this data generator.
- Returns:
- a description of the data generator suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., the paper or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class ClassificationGenerator
- Returns:
- an enumeration of all the available options
-
setOptions
Parses a list of options for this object. Valid options are:
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S The seed for the random function (default 1)
-n <num> The number of examples to generate (default 100)
-F <num> The function to use for generating the data. (default 1)
-B Whether to balance the class.
-P <num> The perturbation factor. (default 0.05)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class ClassificationGenerator
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
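
The flags listed above reach setOptions as a flat array of strings, with each flag followed by its value where it takes one. The following is a minimal sketch and not part of Weka: `AgrawalOptions` and its `build` method are hypothetical names introduced here for illustration, and the sketch only assembles the array from the documented flags -n, -F, -B and -P. Passing the result to setOptions on an Agrawal instance is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Weka class) that assembles an option array. */
final class AgrawalOptions {

    private AgrawalOptions() {
    }

    /**
     * Builds an array such as {"-n", "500", "-F", "3", "-B", "-P", "0.05"}.
     * The 1..10 range check mirrors the FUNCTION_1..FUNCTION_10 constants.
     */
    static String[] build(int numExamples, int function,
                          boolean balanceClass, double perturbation) {
        if (function < 1 || function > 10) {
            throw new IllegalArgumentException("function must be in 1..10: " + function);
        }
        List<String> opts = new ArrayList<>();
        opts.add("-n");
        opts.add(Integer.toString(numExamples));
        opts.add("-F");
        opts.add(Integer.toString(function));
        if (balanceClass) {
            opts.add("-B"); // flag without a value
        }
        opts.add("-P");
        opts.add(Double.toString(perturbation));
        return opts.toArray(new String[0]);
    }
}
```

Keeping the array construction in one place makes it easy to log or compare the exact settings a dataset was generated with.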
-
getOptions
Gets the current settings of the data generator.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class ClassificationGenerator
- Returns:
- an array of strings suitable for passing to setOptions
- See Also:
DataGenerator.removeBlacklist(String[])
-
getFunction
Gets the function for generating the data.
- Returns:
- the function.
-
setFunction
Sets the function for generating the data.
- Parameters:
value - the function.
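
This page names the functions only by number. As a hedged illustration of what such a function looks like, the sketch below encodes function 1 as it is usually stated from the cited Agrawal et al. paper: a person belongs to group A when age is below 40 or at least 60. This is an assumption drawn from the paper, not something this page confirms, and `Function1Sketch` is a hypothetical name rather than a Weka class.

```java
/** Hypothetical sketch of classification function 1 (assumed from the cited paper). */
final class Function1Sketch {

    private Function1Sketch() {
    }

    /** Returns true for group A: age below 40, or age 60 and above. */
    static boolean isGroupA(int age) {
        return age < 40 || age >= 60;
    }
}
```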
-
functionTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getBalanceClass
public boolean getBalanceClass()
Gets whether the class is balanced.
- Returns:
- whether the class is balanced.
-
setBalanceClass
public void setBalanceClass(boolean value)
Sets whether the class is balanced.
- Parameters:
value - whether to balance the class.
-
balanceClassTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getPerturbationFraction
public double getPerturbationFraction()
Gets the perturbation fraction.
- Returns:
- the perturbation fraction.
-
setPerturbationFraction
public void setPerturbationFraction(double value)
Sets the perturbation fraction.
- Parameters:
value - the perturbation fraction.
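
This page does not say how the perturbation fraction is applied. One plausible reading, assumed here and not confirmed by this page, is that a numeric attribute value is shifted by uniform noise scaled by the fraction and by the attribute's range, then clamped to that range. The sketch below illustrates that reading only; `PerturbationSketch` and `perturb` are hypothetical names.

```java
import java.util.Random;

/** Hypothetical sketch of range-scaled perturbation (an assumed reading). */
final class PerturbationSketch {

    private PerturbationSketch() {
    }

    /**
     * Shifts value by uniform noise in [-fraction * range, fraction * range)
     * and clamps the result to [min, max].
     */
    static double perturb(double value, double min, double max,
                          double fraction, Random random) {
        double range = max - min;
        // nextDouble() is in [0, 1), so noise is in [-1, 1).
        double noise = 2.0 * (random.nextDouble() - 0.5);
        double shifted = value + range * noise * fraction;
        return Math.max(min, Math.min(max, shifted));
    }
}
```

Under this reading, the default of 0.05 would move a value by at most 5% of its attribute's range, and a fraction of 0 would leave values unchanged.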
-
perturbationFractionTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getSingleModeFlag
Returns whether single mode is set for the given data generator. The mode depends on the option settings and/or the generator type.
- Specified by:
getSingleModeFlag in class DataGenerator
- Returns:
- single mode flag
- Throws:
Exception - if the mode is not set yet
-
defineDataFormat
Initializes the format for the dataset produced. Must be called before the generateExample or generateExamples methods are used. Re-initializes the random number generator with the given seed.
- Overrides:
defineDataFormat in class DataGenerator
- Returns:
- the format for the dataset
- Throws:
Exception - if generating the format failed
-
generateExample
Generates one example of the dataset.
- Specified by:
generateExample in class DataGenerator
- Returns:
- the generated example
- Throws:
Exception - if the format of the dataset is not yet defined
Exception - if the generator only works with generateExamples, that is, in non-single mode
-
generateExamples
Generates all examples of the dataset. Re-initializes the random number generator with the given seed before generating instances.
- Specified by:
generateExamples in class DataGenerator
- Returns:
- the generated dataset
- Throws:
Exception - if the format of the dataset is not yet defined
Exception - if the generator only works with generateExample, that is, in single mode
-
generateStart
Generates a comment string that documents the data generator. By default, this string is added at the beginning of the produced output in ARFF format, directly after the options.
- Specified by:
generateStart in class DataGenerator
- Returns:
- a string containing information about the generated rules
-
generateFinished
Generates a comment string that documents the data generator. By default, this string is added at the end of the produced output in ARFF format.
- Specified by:
generateFinished in class DataGenerator
- Returns:
- a string containing information about the generated rules
- Throws:
Exception - if generating the documentation fails
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
Main method for executing this class.
- Parameters:
args - should contain the arguments for the data producer
-