Package weka.knowledgeflow.steps
Class CrossValidationFoldMaker
java.lang.Object
weka.knowledgeflow.steps.BaseStep
weka.knowledgeflow.steps.CrossValidationFoldMaker
- All Implemented Interfaces:
Serializable, BaseStepExtender, Step
@KFStep(name="CrossValidationFoldMaker",
category="Evaluation",
toolTipText="A Step that creates stratified cross-validation folds from incoming data",
iconPath="weka/gui/knowledgeflow/icons/CrossValidationFoldMaker.gif")
public class CrossValidationFoldMaker
extends BaseStep
Step for generating cross-validation splits
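The tool tip above describes the folds as stratified. As a rough illustration of what that means, rows can be grouped by class label and dealt out to the folds round-robin, so that each fold receives a similar mix of classes. This is a plain-Java sketch under that assumption, not Weka's implementation; the class `StratifySketch` and the method `assignFolds` are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Weka.
public class StratifySketch {

    /** Returns, for each row, the 0-based fold it is assigned to. */
    static int[] assignFolds(String[] labels, int numFolds) {
        // Group row indices by class label, keeping first-seen label order.
        Map<String, List<Integer>> byClass = new LinkedHashMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], k -> new ArrayList<>()).add(i);
        }
        int[] fold = new int[labels.length];
        int next = 0; // runs across classes so overall fold sizes stay balanced
        for (List<Integer> rows : byClass.values()) {
            for (int row : rows) {
                fold[row] = next % numFolds;
                next++;
            }
        }
        return fold;
    }

    public static void main(String[] args) {
        String[] labels = {"a", "a", "a", "a", "b", "b"};
        // Each of the two folds ends up with two "a" rows and one "b" row.
        System.out.println(Arrays.toString(assignFolds(labels, 2)));
    }
}
```

With four rows of class "a" and two of class "b" split into two folds, each fold holds two "a" rows and one "b" row, which mirrors the 2:1 class ratio of the full data.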
- Version:
- $Revision: $
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}com)
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
List<String> | getIncomingConnectionTypes() | Get a list of incoming connection types that this step can accept.
String | getNumFolds() | Get the number of folds to create
List<String> | getOutgoingConnectionTypes() | Get a list of outgoing connection types that this step can produce.
boolean | getPreserveOrder() | Get whether to preserve the order of the input instances when creating the folds
String | getSeed() | Get the random seed
Instances | outputStructureForConnectionType(String connectionName) | If possible, get the output structure for the named connection type as a header-only set of instances.
void | processIncoming(Data data) | Process an incoming data payload (if the step accepts incoming connections)
void | setNumFolds(String folds) | Set the number of folds to create
void | setPreserveOrder(boolean preserve) | Set whether to preserve the order of the input instances when creating the folds
void | setSeed(String seed) | Set the random seed to use
void | stepInit() | Initialize the step.
Methods inherited from class weka.knowledgeflow.steps.BaseStep
environmentSubstitute, getCustomEditorForStep, getDefaultSettings, getInteractiveViewers, getInteractiveViewersImpls, getName, getStepManager, globalInfo, isResourceIntensive, isStopRequested, outputStructureForConnectionType, setName, setStepIsResourceIntensive, setStepManager, setStepMustRunSingleThreaded, start, stepMustRunSingleThreaded, stop
-
Constructor Details
-
CrossValidationFoldMaker
public CrossValidationFoldMaker()
-
-
Method Details
-
setNumFolds
@OptionMetadata(displayName="Number of folds", description="THe number of folds to create", displayOrder=0)
public void setNumFolds(String folds)
Set the number of folds to create
- Parameters:
folds - the number of folds to create
-
getNumFolds
Get the number of folds to create
- Returns:
- the number of folds to create
-
setPreserveOrder
@OptionMetadata(displayName="Preserve instances order", description="Preserve the order of instances rather than randomly shuffling", displayOrder=1)
public void setPreserveOrder(boolean preserve)
Set whether to preserve the order of the input instances when creating the folds
- Parameters:
preserve - true to preserve the order
-
getPreserveOrder
public boolean getPreserveOrder()
Get whether to preserve the order of the input instances when creating the folds
- Returns:
- true to preserve the order
-
setSeed
@OptionMetadata(displayName="Random seed", description="The random seed to use for shuffling", displayOrder=3)
public void setSeed(String seed)
Set the random seed to use
- Parameters:
seed - the random seed to use
-
getSeed
Get the random seed
- Returns:
- the random seed
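
The three options above (number of folds, preserve order, random seed) can be pictured with a small plain-Java sketch. It assumes that rows are shuffled with the seed unless order is preserved, and that each test fold is a contiguous block of the resulting order. That is an illustrative assumption, not a statement of how this step is implemented, and the class `FoldSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Weka.
public class FoldSketch {

    /** Row indices 0..n-1, shuffled with the seed unless order is preserved. */
    static List<Integer> orderedIndices(int n, boolean preserveOrder, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            idx.add(i);
        }
        if (!preserveOrder) {
            Collections.shuffle(idx, new Random(seed));
        }
        return idx;
    }

    /** Test rows of the given 0-based fold; the first (n % numFolds) folds get one extra row. */
    static List<Integer> testFold(List<Integer> idx, int numFolds, int fold) {
        int base = idx.size() / numFolds;
        int extra = idx.size() % numFolds;
        int start = fold * base + Math.min(fold, extra);
        int size = base + (fold < extra ? 1 : 0);
        return new ArrayList<>(idx.subList(start, start + size));
    }

    /** Training rows of the given fold: everything that is not in its test block. */
    static List<Integer> trainFold(List<Integer> idx, int numFolds, int fold) {
        List<Integer> train = new ArrayList<>(idx);
        train.removeAll(testFold(idx, numFolds, fold));
        return train;
    }

    public static void main(String[] args) {
        // Order preserved: folds are consecutive slices of the input.
        List<Integer> kept = orderedIndices(10, true, 1L);
        System.out.println(testFold(kept, 3, 0));  // prints [0, 1, 2, 3]
        System.out.println(trainFold(kept, 3, 1)); // prints [0, 1, 2, 3, 7, 8, 9]

        // Order not preserved: the same seed always yields the same shuffle.
        List<Integer> shuffled = orderedIndices(10, false, 1L);
        System.out.println(testFold(shuffled, 3, 0));
    }
}
```

Fixing the seed makes the shuffled folds repeatable between runs, while preserving order skips the shuffle entirely, which matters when the row order itself carries meaning.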
-
stepInit
Initialize the step.
- Throws:
WekaException - if a problem occurs during initialization
-
processIncoming
Process an incoming data payload (if the step accepts incoming connections)
- Specified by:
processIncoming in interface BaseStepExtender
- Specified by:
processIncoming in interface Step
- Overrides:
processIncoming in class BaseStep
- Parameters:
data - the payload to process
- Throws:
WekaException - if a problem occurs
-
getIncomingConnectionTypes
Get a list of incoming connection types that this step can accept. Ideally (and if appropriate), this should take into account the state of the step and any existing incoming connections. E.g. a step might be able to accept one (and only one) incoming batch data connection.
- Returns:
- a list of incoming connections that this step can accept given its current state
-
getOutgoingConnectionTypes
Get a list of outgoing connection types that this step can produce. Ideally (and if appropriate), this should take into account the state of the step and the incoming connections. E.g. depending on what incoming connection is present, a step might be able to produce a trainingSet output, a testSet output or neither, but not both.
- Returns:
- a list of outgoing connections that this step can produce
-
outputStructureForConnectionType
If possible, get the output structure for the named connection type as a header-only set of instances. Can return null if the specified connection type is not representable as Instances or cannot be determined at present.
- Specified by:
outputStructureForConnectionType in interface Step
- Overrides:
outputStructureForConnectionType in class BaseStep
- Parameters:
connectionName - the name of the connection type to get the output structure for
- Returns:
- the output structure as a header-only Instances object
- Throws:
WekaException - if a problem occurs
-