java.lang.Object

weka.experiment.RandomSplitResultProducer

All Implemented Interfaces: Serializable, AdditionalMeasureProducer, OptionHandler, RevisionHandler, ResultProducer

public class RandomSplitResultProducer extends Object implements ResultProducer, OptionHandler, AdditionalMeasureProducer, RevisionHandler

Generates a single train/test split and calls the appropriate SplitEvaluator to generate some results.

Valid options are:

 -P <percent>
  The percentage of instances to use for training.
  (default 66)

 -D
  Save raw split evaluator output.

 -O <file/directory name/path>
  The filename where raw output will be stored.
  If a directory name is specified then individual
  outputs will be gzipped, otherwise all output will be
  zipped to the named file. Use in conjunction with -D.
  (default splitEvalutorOut.zip)

 -W <class name>
  The full class name of a SplitEvaluator.
  eg: weka.experiment.ClassifierSplitEvaluator

 -R
  Set when data is not to be randomized and the data sets' size
  is not to be determined via probabilistic rounding.

 Options specific to split evaluator weka.experiment.ClassifierSplitEvaluator:

 -W <class name>
  The full class name of the classifier.
  eg: weka.classifiers.bayes.NaiveBayes

 -C <index>
  The index of the class for which IR statistics
  are to be output. (default 1)

 -I <index>
  The index of an attribute to output in the
  results. This attribute should identify an
  instance in order to know which instances are
  in the test set of a cross validation. If 0,
  no output (default 0).

 -P
  Add target and prediction columns to the result
  for each fold.

 Options specific to classifier weka.classifiers.rules.ZeroR:

 -D
  If set, classifier is run in debug mode and
  may output additional info to the console

All options after -- will be passed to the split evaluator.
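The `--` convention above can be illustrated with a small stand-alone sketch. It is not Weka's own parsing code: the class `OptionSplitSketch` and its two methods are hypothetical names introduced here, and the sketch assumes the first `--` in the array is the separator.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Weka: splits an option array at the first "--". */
final class OptionSplitSketch {

    /** Returns the options before the first "--" (all of them if there is none). */
    static String[] producerOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? options.clone() : Arrays.copyOfRange(options, 0, sep);
    }

    /** Returns the options after the first "--" (empty if there is none). */
    static String[] evaluatorOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? new String[0] : Arrays.copyOfRange(options, sep + 1, options.length);
    }

    public static void main(String[] args) {
        String[] options = {
            "-P", "66", "-W", "weka.experiment.ClassifierSplitEvaluator",
            "--", "-W", "weka.classifiers.rules.ZeroR"
        };
        // prints [-P, 66, -W, weka.experiment.ClassifierSplitEvaluator]
        System.out.println(Arrays.toString(producerOptions(options)));
        // prints [-W, weka.classifiers.rules.ZeroR]
        System.out.println(Arrays.toString(evaluatorOptions(options)));
    }
}
```

In this sketch the first group would configure the result producer, and the second group is what would be handed on to the split evaluator.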

Version:

$Revision: 10203 $

Author:

Len Trigg (trigg@cs.waikato.ac.nz)

See Also:

Serialized Form

Field Summary

Fields

Modifier and Type

Field

Description

static String

DATASET_FIELD_NAME

The name of the key field containing the dataset name

static String

RUN_FIELD_NAME

The name of the key field containing the run number

static String

TIMESTAMP_FIELD_NAME

The name of the result field containing the timestamp
Constructor Summary

Constructors

Constructor

Description

RandomSplitResultProducer()
Method Summary

Modifier and Type

Method

Description

void

doRun(int run)

Gets the results for a specified run number.

void

doRunKeys(int run)

Gets the keys for a specified run number.

Enumeration<String>

enumerateMeasures()

Returns an enumeration of any additional measure names that might be in the SplitEvaluator

String

getCompatibilityState()

Gets a description of the internal settings of the result producer, sufficient for distinguishing a ResultProducer instance from another with different settings (ignoring those settings set through this interface).

String[]

getKeyNames()

Gets the names of each of the key columns produced for a single run.

Object[]

getKeyTypes()

Gets the data types of each of the key columns produced for a single run.

double

getMeasure(String additionalMeasureName)

Returns the value of the named measure

String[]

getOptions()

Gets the current settings of the result producer.

File

getOutputFile()

Get the value of OutputFile.

boolean

getRandomizeData()

Get if dataset is to be randomized

boolean

getRawOutput()

Get if raw split evaluator output is to be saved

String[]

getResultNames()

Gets the names of each of the result columns produced for a single run.

Object[]

getResultTypes()

Gets the data types of each of the result columns produced for a single run.

String

getRevision()

Returns the revision string.

SplitEvaluator

getSplitEvaluator()

Get the SplitEvaluator.

static Double

getTimestamp()

Gets a Double representing the current date and time.

double

getTrainPercent()

Get the value of TrainPercent.

String

globalInfo()

Returns a string describing this result producer

Enumeration<Option>

listOptions()

Returns an enumeration describing the available options.

String

outputFileTipText()

Returns the tip text for this property

void

postProcess()

Perform any postprocessing.

void

preProcess()

Prepare to generate results.

String

randomizeDataTipText()

Returns the tip text for this property

String

rawOutputTipText()

Returns the tip text for this property

void

setAdditionalMeasures(String[] additionalMeasures)

Set a list of method names for additional measures to look for in SplitEvaluators.

void

setInstances(Instances instances)

Sets the dataset that results will be obtained for.

void

setOptions(String[] options)

Parses a given list of options.

void

setOutputFile(File newOutputFile)

Set the value of OutputFile.

void

setRandomizeData(boolean d)

Set to true if dataset is to be randomized

void

setRawOutput(boolean d)

Set to true if raw split evaluator output is to be saved

void

setResultListener(ResultListener listener)

Sets the object to send results of each run to.

void

setSplitEvaluator(SplitEvaluator newSplitEvaluator)

Set the SplitEvaluator.

void

setTrainPercent(double newTrainPercent)

Set the value of TrainPercent.

String

splitEvaluatorTipText()

Returns the tip text for this property

String

toString()

Gets a text description of the result producer.

String

trainPercentTipText()

Returns the tip text for this property

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Details
- DATASET_FIELD_NAME
  
  public static String DATASET_FIELD_NAME
  
  The name of the key field containing the dataset name
- RUN_FIELD_NAME
  
  public static String RUN_FIELD_NAME
  
  The name of the key field containing the run number
- TIMESTAMP_FIELD_NAME
  
  public static String TIMESTAMP_FIELD_NAME
  
  The name of the result field containing the timestamp
Constructor Details
- RandomSplitResultProducer
  
  public RandomSplitResultProducer()
Method Details
- globalInfo
  
  public String globalInfo()
  
  Returns a string describing this result producer
  
  Returns:
  
  a description of the result producer suitable for displaying in the explorer/experimenter gui
- setInstances
  
  public void setInstances(Instances instances)
  
  Sets the dataset that results will be obtained for.
  
  Specified by:
  
  setInstances in interface ResultProducer
  
  Parameters:
  
  instances - the dataset that results will be obtained for
- setAdditionalMeasures
  
  public void setAdditionalMeasures(String[] additionalMeasures)
  
  Set a list of method names for additional measures to look for in SplitEvaluators. This could contain many measures (of which only a subset may be produceable by the current SplitEvaluator) if an experiment is the type that iterates over a set of properties.
  
  Specified by:
  
  setAdditionalMeasures in interface ResultProducer
  
  Parameters:
  
  additionalMeasures - an array of measure names, null if none
- enumerateMeasures
  
  public Enumeration<String> enumerateMeasures()
  
  Returns an enumeration of any additional measure names that might be in the SplitEvaluator
  
  Specified by:
  
  enumerateMeasures in interface AdditionalMeasureProducer
  
  Returns:
  
  an enumeration of the measure names
- getMeasure
  
  public double getMeasure(String additionalMeasureName)
  
  Returns the value of the named measure
  
  Specified by:
  
  getMeasure in interface AdditionalMeasureProducer
  
  Parameters:
  
  additionalMeasureName - the name of the measure to query for its value
  
  Returns:
  
  the value of the named measure
  
  Throws:
  
  IllegalArgumentException - if the named measure is not supported
- setResultListener
  
  public void setResultListener(ResultListener listener)
  
  Sets the object to send results of each run to.
  
  Specified by:
  
  setResultListener in interface ResultProducer
  
  Parameters:
  
  listener - the ResultListener to send the results of each run to
- getTimestamp
  
  public static Double getTimestamp()
  
  Gets a Double representing the current date and time. eg: 1:46pm on 20/5/1999 -> 19990520.1346
  
  Returns:
  
  a Double representing the current date and time
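The encoding in the example (1:46pm on 20/5/1999 -> 19990520.1346) can be reproduced with the arithmetic below. This is a sketch under assumptions, not the Weka source: `TimestampSketch` and `encode` are hypothetical names, the date is packed as the integer part (yyyyMMdd) and the 24-hour time as the fraction (HHmm), and the time zone used by `getTimestamp()` is not covered.

```java
import java.time.LocalDateTime;
import java.util.Locale;

/** Hypothetical helper, not part of Weka: packs a date and time as yyyyMMdd.HHmm. */
final class TimestampSketch {

    static double encode(LocalDateTime t) {
        // Integer part: year, month and day as yyyyMMdd.
        double datePart = t.getYear() * 10000 + t.getMonthValue() * 100 + t.getDayOfMonth();
        // Fractional part: 24-hour clock time as .HHmm.
        double timePart = t.getHour() / 100.0 + t.getMinute() / 10000.0;
        return datePart + timePart;
    }

    public static void main(String[] args) {
        // 1:46pm on 20/5/1999, the example from the method description.
        double stamp = encode(LocalDateTime.of(1999, 5, 20, 13, 46));
        // prints 19990520.1346
        System.out.println(String.format(Locale.ROOT, "%.4f", stamp));
    }
}
```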
- preProcess
  
  public void preProcess() throws Exception
  
  Prepare to generate results.
  
  Specified by:
  
  preProcess in interface ResultProducer
  
  Throws:
  
  Exception - if an error occurs during preprocessing.
- postProcess
  
  public void postProcess() throws Exception
  
  Perform any postprocessing. When this method is called, it indicates that no more requests to generate results for the current experiment will be sent.
  
  Specified by:
  
  postProcess in interface ResultProducer
  
  Throws:
  
  Exception - if an error occurs
- doRunKeys
  
  public void doRunKeys(int run) throws Exception
  
  Gets the keys for a specified run number. Different run numbers correspond to different randomizations of the data. Keys produced should be sent to the current ResultListener
  
  Specified by:
  
  doRunKeys in interface ResultProducer
  
  Parameters:
  
  run - the run number to get keys for.
  
  Throws:
  
  Exception - if a problem occurs while getting the keys
- doRun
  
  public void doRun(int run) throws Exception
  
  Gets the results for a specified run number. Different run numbers correspond to different randomizations of the data. Results produced should be sent to the current ResultListener
  
  Specified by:
  
  doRun in interface ResultProducer
  
  Parameters:
  
  run - the run number to get results for.
  
  Throws:
  
  Exception - if a problem occurs while getting the results
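One way to make "different run numbers correspond to different randomizations" reproducible is to seed the random generator with the run number. The sketch below assumes exactly that; the class `RunShuffleSketch` and its method are hypothetical, and it shuffles plain indices rather than a Weka `Instances` object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Weka: a shuffle that depends only on the run number. */
final class RunShuffleSketch {

    /** Returns the indices 0..n-1 shuffled by a generator seeded with the run number. */
    static List<Integer> shuffledIndices(int n, int run) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // Assumption: the run number is the seed, so repeating a run repeats the order.
        Collections.shuffle(indices, new Random(run));
        return indices;
    }

    public static void main(String[] args) {
        System.out.println("run 1:       " + shuffledIndices(10, 1));
        System.out.println("run 1 again: " + shuffledIndices(10, 1));
        System.out.println("run 2:       " + shuffledIndices(10, 2));
    }
}
```

Under this assumption, rerunning a given run number yields the same ordering and therefore the same train/test split.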
- getKeyNames
  
  public String[] getKeyNames()
  
  Gets the names of each of the key columns produced for a single run. This method should really be static.
  
  Specified by:
  
  getKeyNames in interface ResultProducer
  
  Returns:
  
  an array containing the name of each column
- getKeyTypes
  
  public Object[] getKeyTypes()
  
  Gets the data types of each of the key columns produced for a single run. This method should really be static.
  
  Specified by:
  
  getKeyTypes in interface ResultProducer
  
  Returns:
  
  an array containing objects of the type of each column. The objects should be Strings, or Doubles.
- getResultNames
  
  public String[] getResultNames()
  
  Gets the names of each of the result columns produced for a single run. This method should really be static.
  
  Specified by:
  
  getResultNames in interface ResultProducer
  
  Returns:
  
  an array containing the name of each column
- getResultTypes
  
  public Object[] getResultTypes()
  
  Gets the data types of each of the result columns produced for a single run. This method should really be static.
  
  Specified by:
  
  getResultTypes in interface ResultProducer
  
  Returns:
  
  an array containing objects of the type of each column. The objects should be Strings, or Doubles.
- getCompatibilityState
  
  public String getCompatibilityState()
  
  Gets a description of the internal settings of the result producer, sufficient for distinguishing a ResultProducer instance from another with different settings (ignoring those settings set through this interface). For example, a cross-validation ResultProducer may have a setting for the number of folds. For a given state, the results produced should be compatible. Typically if a ResultProducer is an OptionHandler, this string will represent the command line arguments required to set the ResultProducer to that state.
  
  Specified by:
  
  getCompatibilityState in interface ResultProducer
  
  Returns:
  
  the description of the ResultProducer state, or null if no state is defined
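As a rough illustration of a state string that "represents the command line arguments", the sketch below assembles one from a few settings. It is not the string Weka actually returns: `CompatibilityStateSketch` and `describe` are hypothetical, and which settings are included, and in what order, is an assumption.

```java
/** Hypothetical helper, not part of Weka: builds an option-style state description. */
final class CompatibilityStateSketch {

    static String describe(double trainPercent, boolean randomize, String evaluatorClass) {
        StringBuilder state = new StringBuilder("-P ").append(trainPercent);
        if (!randomize) {
            state.append(" -R"); // -R is set when data is NOT randomized
        }
        if (evaluatorClass != null) {
            state.append(" -W ").append(evaluatorClass);
        }
        return state.toString();
    }

    public static void main(String[] args) {
        // prints -P 66.0 -W weka.experiment.ClassifierSplitEvaluator
        System.out.println(describe(66.0, true, "weka.experiment.ClassifierSplitEvaluator"));
        // prints -P 80.0 -R -W weka.experiment.ClassifierSplitEvaluator
        System.out.println(describe(80.0, false, "weka.experiment.ClassifierSplitEvaluator"));
    }
}
```

Two producers with different settings yield different strings here, which is the property the description asks for.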
- outputFileTipText
  
  public String outputFileTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- getOutputFile
  
  public File getOutputFile()
  
  Get the value of OutputFile.
  
  Returns:
  
  Value of OutputFile.
- setOutputFile
  
  public void setOutputFile(File newOutputFile)
  
  Set the value of OutputFile.
  
  Parameters:
  
  newOutputFile - Value to assign to OutputFile.
- randomizeDataTipText
  
  public String randomizeDataTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- getRandomizeData
  
  public boolean getRandomizeData()
  
  Get if dataset is to be randomized
  
  Returns:
  
  true if dataset is to be randomized
- setRandomizeData
  
  public void setRandomizeData(boolean d)
  
  Set to true if dataset is to be randomized
  
  Parameters:
  
  d - true if dataset is to be randomized
- rawOutputTipText
  
  public String rawOutputTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- getRawOutput
  
  public boolean getRawOutput()
  
  Get if raw split evaluator output is to be saved
  
  Returns:
  
  true if raw split evaluator output is to be saved
- setRawOutput
  
  public void setRawOutput(boolean d)
  
  Set to true if raw split evaluator output is to be saved
  
  Parameters:
  
  d - true if output is to be saved
- trainPercentTipText
  
  public String trainPercentTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- getTrainPercent
  
  public double getTrainPercent()
  
  Get the value of TrainPercent.
  
  Returns:
  
  Value of TrainPercent.
- setTrainPercent
  
  public void setTrainPercent(double newTrainPercent)
  
  Set the value of TrainPercent.
  
  Parameters:
  
  newTrainPercent - Value to assign to TrainPercent.
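How the training percentage might translate into a training-set size is sketched below. The option text mentions probabilistic rounding when data is randomized; this sketch assumes that means rounding up with probability equal to the fractional part, and plain rounding otherwise. `TrainSizeSketch`, `probRound` and `trainSize` are hypothetical names used for illustration and the code is not taken from Weka.

```java
import java.util.Random;

/** Hypothetical helper, not part of Weka: derives a training-set size from a percentage. */
final class TrainSizeSketch {

    /** Rounds a non-negative value up with probability equal to its fractional part, else down. */
    static int probRound(double value, Random rand) {
        double lower = Math.floor(value);
        double fraction = value - lower;
        return (int) lower + (rand.nextDouble() < fraction ? 1 : 0);
    }

    /** Number of training instances for the given dataset size and training percentage. */
    static int trainSize(int numInstances, double trainPercent, boolean randomize, Random rand) {
        double exact = numInstances * trainPercent / 100.0;
        return randomize ? probRound(exact, rand) : (int) Math.round(exact);
    }

    public static void main(String[] args) {
        Random rand = new Random(1);
        // 101 instances at 66% gives 66.66: ordinary rounding always yields 67.
        System.out.println(trainSize(101, 66.0, false, rand));
        // With probabilistic rounding the result is 66 or 67, depending on the draw.
        System.out.println(trainSize(101, 66.0, true, rand));
    }
}
```

Averaged over many runs, probabilistic rounding keeps the expected training size equal to the exact fractional value.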
- splitEvaluatorTipText
  
  public String splitEvaluatorTipText()
  
  Returns the tip text for this property
  
  Returns:
  
  tip text for this property suitable for displaying in the explorer/experimenter gui
- getSplitEvaluator
  
  public SplitEvaluator getSplitEvaluator()
  
  Get the SplitEvaluator.
  
  Returns:
  
  the SplitEvaluator.
- setSplitEvaluator
  
  public void setSplitEvaluator(SplitEvaluator newSplitEvaluator)
  
  Set the SplitEvaluator.
  
  Parameters:
  
  newSplitEvaluator - new SplitEvaluator to use.
- listOptions
  
  public Enumeration<Option> listOptions()
  
  Returns an enumeration describing the available options.
  
  Specified by:
  
  listOptions in interface OptionHandler
  
  Returns:
  
  an enumeration of all the available options.
- setOptions
  
  public void setOptions(String[] options) throws Exception
  Parses a given list of options.
  Valid options are:
  
  -P <percent> The percentage of instances to use for training. (default 66)
  
  -D Save raw split evaluator output.
  
  -O <file/directory name/path> The filename where raw output will be stored. If a directory name is specified then individual outputs will be gzipped, otherwise all output will be zipped to the named file. Use in conjunction with -D. (default splitEvalutorOut.zip)
  
  -W <class name> The full class name of a SplitEvaluator. eg: weka.experiment.ClassifierSplitEvaluator
  
  -R Set when data is not to be randomized and the data sets' size is not to be determined via probabilistic rounding.
  
  Options specific to split evaluator weka.experiment.ClassifierSplitEvaluator:
  
  -W <class name> The full class name of the classifier. eg: weka.classifiers.bayes.NaiveBayes
  
  -C <index> The index of the class for which IR statistics are to be output. (default 1)
  
  -I <index> The index of an attribute to output in the results. This attribute should identify an instance in order to know which instances are in the test set of a cross validation. If 0, no output (default 0).
  
  -P Add target and prediction columns to the result for each fold.
  
  Options specific to classifier weka.classifiers.rules.ZeroR:
  
  -D If set, classifier is run in debug mode and may output additional info to the console
  All options after -- will be passed to the split evaluator.
  Specified by:
  
  setOptions in interface OptionHandler
  
  Parameters:
  
  options - the list of options as an array of strings
  
  Throws:
  
  Exception - if an option is not supported
- getOptions
  
  public String[] getOptions()
  
  Gets the current settings of the result producer.
  
  Specified by:
  
  getOptions in interface OptionHandler
  
  Returns:
  
  an array of strings suitable for passing to setOptions
- toString
  
  public String toString()
  
  Gets a text description of the result producer.
  
  Overrides:
  
  toString in class Object
  
  Returns:
  
  a text description of the result producer.
- getRevision
  
  public String getRevision()
  
  Returns the revision string.
  
  Specified by:
  
  getRevision in interface RevisionHandler
  
  Returns:
  
  the revision
