Package weka.gui.beans
Class SubstringLabelerRules
java.lang.Object
weka.gui.beans.SubstringLabelerRules
- All Implemented Interfaces:
Serializable, EnvironmentHandler
Manages a list of match rules for labeling strings. Also has methods for
determining the output structure with respect to a set of rules and for
constructing output instances that have been labeled according to the rules.
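The idea of labeling strings from a list of match rules can be sketched in plain Java. This is an illustrative stand-in, not the Weka implementation: the `SubstringLabelSketch` class, its `Rule` type, and the first-match-wins policy are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a plain-Java stand-in, not the Weka class. */
class SubstringLabelSketch {

  /** Hypothetical rule: if the text contains 'substring', assign 'label'. */
  static final class Rule {
    final String substring;
    final String label;

    Rule(String substring, String label) {
      this.substring = substring;
      this.label = label;
    }
  }

  /**
   * Returns the label of the first rule whose substring occurs in the text,
   * or null when no rule matches. First-match-wins is an assumption here.
   */
  static String label(String text, List<Rule> rules) {
    for (Rule r : rules) {
      if (text.contains(r.substring)) {
        return r.label;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    List<Rule> rules = Arrays.asList(
        new Rule("error", "problem"),
        new Rule("ok", "healthy"));
    System.out.println(label("disk error on sda", rules)); // problem
    System.out.println(label("status ok", rules));         // healthy
    System.out.println(label("rebooting", rules));         // null
  }
}
```

In the real class the rules operate on string attributes of Weka instances rather than bare strings; the sketch only shows the matching-and-labeling step.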
- Version:
- $Revision: 12232 $
- Author:
- Mark Hall (mhall{[at]}pentaho{[dot]}com)
Nested Class Summary
- static class SubstringLabelerRules.SubstringLabelerMatchRule
  Inner class encapsulating the logic for matching
Field Summary
- static final String MATCH_RULE_SEPARATOR
  Separator for match rules in the internal representation
Constructor Summary
- SubstringLabelerRules(String matchDetails, String newAttName, boolean consumeNonMatching, boolean nominalBinary, Instances inputStructure, String statusMessagePrefix, Logger log, Environment env)
  Constructor
- SubstringLabelerRules(String matchDetails, String newAttName, Instances inputStructure)
  Constructor.
Method Summary
- boolean getConsumeNonMatching()
  Get whether to consume non-matching instances.
- Instances getInputStructure()
  Get the input structure
- String getNewAttributeName()
  Get the name to use for the new attribute that is added
- boolean getNominalBinary()
  Get whether to create a nominal binary attribute in the case when the user has not supplied an explicit label to use for each rule.
- Instances getOutputStructure()
  Get the output structure
- Instance makeOutputInstance(Instance inputI, boolean batch)
  Process an input instance and return an output instance
- static List<SubstringLabelerRules.SubstringLabelerMatchRule> matchRulesFromInternal(String matchDetails, Instances inputStructure, String statusMessagePrefix, Logger log, Environment env)
  Get a list of match rules from an internally encoded match specification
- void setConsumeNonMatching(boolean n)
  Set whether to consume non-matching instances.
- void setEnvironment(Environment env)
  Set environment variables to use.
- void setNewAttributeName(String newName)
  Set the name to use for the new attribute that is added
- void setNominalBinary(boolean n)
  Set whether to create a nominal binary attribute in the case when the user has not supplied an explicit label to use for each rule.
-
Field Details
-
MATCH_RULE_SEPARATOR
Separator for match rules in the internal representation
-
Constructor Details
-
SubstringLabelerRules
public SubstringLabelerRules(String matchDetails, String newAttName, boolean consumeNonMatching, boolean nominalBinary, Instances inputStructure, String statusMessagePrefix, Logger log, Environment env) throws Exception
Constructor
- Parameters:
matchDetails - the internally encoded match details string
newAttName - the name of the new attribute that will be the label
consumeNonMatching - true if non-matching instances should be consumed
nominalBinary - true if, in the case where no user labels have been supplied, the new attribute should be a nominal binary one rather than numeric
inputStructure - the incoming instances structure
statusMessagePrefix - an optional status message prefix string for logging
log - the log to use (may be null)
env - environment variables
- Throws:
Exception
-
SubstringLabelerRules
public SubstringLabelerRules(String matchDetails, String newAttName, Instances inputStructure) throws Exception
Constructor. Sets consume non-matching to false and nominal binary to false. Initializes with system-wide environment variables, no status message prefix, and no log.
- Parameters:
matchDetails - the internally encoded match details string
newAttName - the name of the new attribute that will be the label
inputStructure - the incoming instances structure
- Throws:
Exception
-
-
Method Details
-
setConsumeNonMatching
public void setConsumeNonMatching(boolean n)
Set whether to consume non-matching instances. If false, they are passed through unaltered.
- Parameters:
n - true if non-matching instances should be consumed (so that only matching, and thus labelled, instances are output)
-
getConsumeNonMatching
public boolean getConsumeNonMatching()
Get whether to consume non-matching instances. If false, they are passed through unaltered.
- Returns:
true if non-matching instances will be consumed (so that only matching, and thus labelled, instances are output)
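The effect of the consume-non-matching flag can be illustrated with a small plain-Java sketch. It works on strings rather than Weka instances, and the `ConsumeNonMatchingSketch` class, its `process` method, and the comma-appended label are hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: models the flag on plain strings, not Weka instances. */
class ConsumeNonMatchingSketch {

  /**
   * Labels every input that contains 'substring'. Inputs that do not match
   * are dropped when consumeNonMatching is true and are otherwise passed
   * through unaltered.
   */
  static List<String> process(List<String> inputs, String substring,
      String label, boolean consumeNonMatching) {
    List<String> out = new ArrayList<>();
    for (String s : inputs) {
      if (s.contains(substring)) {
        out.add(s + "," + label);   // matching input, labelled
      } else if (!consumeNonMatching) {
        out.add(s);                 // non-matching input, passed through
      }
      // otherwise the non-matching input is consumed (not output)
    }
    return out;
  }
}
```

With inputs "disk error" and "all fine" and the substring "error", the consuming variant outputs only the labelled first input, while the pass-through variant outputs both.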
-
setNominalBinary
public void setNominalBinary(boolean n)
Set whether to create a nominal binary attribute in the case when the user has not supplied an explicit label to use for each rule. If no labels are provided, then the output attribute is a binary indicator (i.e. a rule matched or it didn't). This option allows that binary indicator to be coded as nominal rather than numeric.
- Parameters:
n - true if a binary indicator attribute should be nominal rather than numeric
-
getNominalBinary
public boolean getNominalBinary()
Get whether to create a nominal binary attribute in the case when the user has not supplied an explicit label to use for each rule. If no labels are provided, then the output attribute is a binary indicator (i.e. a rule matched or it didn't). This option allows that binary indicator to be coded as nominal rather than numeric.
- Returns:
true if a binary indicator attribute should be nominal rather than numeric
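The difference between a numeric and a nominal binary indicator can be sketched as follows. The `BinaryIndicatorSketch` class is hypothetical, and the nominal category names "0" and "1" are an assumption for illustration; the names the real class uses are not stated here.

```java
/** Illustrative only: shows the two codings of a match indicator. */
class BinaryIndicatorSketch {

  /**
   * Returns a match indicator for 'text'. With nominalBinary false the
   * indicator is numeric (a Double, 1.0 or 0.0). With nominalBinary true it
   * is a category name (a String, "1" or "0"; the names are assumed).
   */
  static Object indicator(String text, String substring, boolean nominalBinary) {
    boolean matched = text.contains(substring);
    if (nominalBinary) {
      return matched ? "1" : "0";
    }
    return Double.valueOf(matched ? 1.0 : 0.0);
  }
}
```

The information carried is the same in both codings. The choice matters downstream, since many learning schemes treat numeric and nominal attributes differently.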
-
getOutputStructure
Get the output structure
- Returns:
the structure of the output instances
-
getInputStructure
Get the input structure
- Returns:
the structure of the input instances
-
setNewAttributeName
Set the name to use for the new attribute that is added
- Parameters:
newName - the name to use
-
getNewAttributeName
Get the name to use for the new attribute that is added
- Returns:
the name to use
-
setEnvironment
Description copied from interface: EnvironmentHandler
Set environment variables to use.
- Specified by:
setEnvironment in interface EnvironmentHandler
- Parameters:
env - the environment variables to use
-
matchRulesFromInternal
public static List<SubstringLabelerRules.SubstringLabelerMatchRule> matchRulesFromInternal(String matchDetails, Instances inputStructure, String statusMessagePrefix, Logger log, Environment env)
Get a list of match rules from an internally encoded match specification
- Parameters:
matchDetails - the internally encoded specification of the match rules
inputStructure - the input instances structure
statusMessagePrefix - an optional status message prefix for logging
log - the log to use
env - environment variables
- Returns:
a list of match rules
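Decoding a separator-delimited rule specification into a list can be sketched in plain Java. Everything concrete below is hypothetical: the `RuleDecodingSketch` class, the separator value "@@rule@@", and the "substring=label" layout of each rule are placeholders, not the actual internal encoding or the value of MATCH_RULE_SEPARATOR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: the encoding used here is a made-up placeholder. */
class RuleDecodingSketch {

  /** Hypothetical separator between encoded rules. */
  static final String RULE_SEPARATOR = "@@rule@@";

  /**
   * Splits an encoded specification into rules. Each rule is assumed to be
   * "substring=label" and is returned as a two-element array.
   */
  static List<String[]> decode(String spec) {
    List<String[]> rules = new ArrayList<>();
    if (spec == null || spec.trim().isEmpty()) {
      return rules; // nothing encoded, so no rules
    }
    // Pattern.quote makes split treat the separator literally, not as a regex
    for (String part : spec.split(Pattern.quote(RULE_SEPARATOR))) {
      String[] fields = part.split("=", 2);
      if (fields.length != 2) {
        throw new IllegalArgumentException("Malformed rule: " + part);
      }
      rules.add(new String[] { fields[0].trim(), fields[1].trim() });
    }
    return rules;
  }
}
```

Quoting the separator before splitting is the main design point: a separator containing regex metacharacters would otherwise be misinterpreted by `String.split`.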
-
makeOutputInstance
Process an input instance and return an output instance
- Parameters:
inputI - the incoming instance
batch - whether this is being processed as part of a batch of instances
- Returns:
the output instance
- Throws:
Exception - if the output structure has not yet been determined
-