Class AlphabeticTokenizer

java.lang.Object
weka.core.tokenizers.Tokenizer
weka.core.tokenizers.AlphabeticTokenizer
All Implemented Interfaces:
Serializable, Enumeration<String>, OptionHandler, RevisionHandler

public class AlphabeticTokenizer extends Tokenizer
Alphabetic string tokenizer. Tokens are formed only from contiguous alphabetic sequences.
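
As a rough illustration of what "contiguous alphabetic sequences" means, the sketch below collects every maximal run of letters from a string. It is not the Weka implementation: the class `AlphabeticSplitSketch` and the method `alphabeticTokens` are hypothetical names, and treating only the ASCII letters a-z and A-Z as alphabetic is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, not part of Weka: illustrates "contiguous alphabetic sequences".
class AlphabeticSplitSketch {

    // Assumption: "alphabetic" means the ASCII letters a-z and A-Z.
    private static final Pattern LETTER_RUN = Pattern.compile("[A-Za-z]+");

    // Collects every maximal run of letters, in order of appearance.
    static List<String> alphabeticTokens(String s) {
        List<String> tokens = new ArrayList<>();
        Matcher m = LETTER_RUN.matcher(s);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Digits, punctuation, and whitespace all act as separators.
        System.out.println(alphabeticTokens("Weka 3.8: text-mining, 2nd ed."));
        // prints [Weka, text, mining, nd, ed]
    }
}
```

Under this assumption, "2nd" yields the token "nd" and "text-mining" yields two tokens, because any non-letter ends the current run.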

Version:
$Revision: 10203 $
Author:
Ashraf M. Kibriya (amk14@cs.waikato.ac.nz), FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • AlphabeticTokenizer

      public AlphabeticTokenizer()
  • Method Details

    • globalInfo

      public String globalInfo()
      Returns a string describing the tokenizer.
      Specified by:
      globalInfo in class Tokenizer
      Returns:
      a description suitable for displaying in the Explorer/Experimenter GUI
    • hasMoreElements

      public boolean hasMoreElements()
      Returns whether more elements remain.
      Specified by:
      hasMoreElements in interface Enumeration<String>
      Specified by:
      hasMoreElements in class Tokenizer
      Returns:
      true if there are still more elements
    • nextElement

      public String nextElement()
      Returns the next element.
      Specified by:
      nextElement in interface Enumeration<String>
      Specified by:
      nextElement in class Tokenizer
      Returns:
      the next element
    • tokenize

      public void tokenize(String s)
      Sets the string to tokenize. Tokenization happens immediately.
      Specified by:
      tokenize in class Tokenizer
      Parameters:
      s - the string to tokenize
    • getRevision

      public String getRevision()
      Returns the revision string.
      Returns:
      the revision
    • main

      public static void main(String[] args)
      Runs the tokenizer with the given options and strings to tokenize. The tokens are printed to stdout.
      Parameters:
      args - the command-line options and strings to tokenize
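
The calling pattern implied by tokenize, hasMoreElements, and nextElement can be sketched with a small stand-in class. This is a minimal sketch under stated assumptions, not the Weka source: `EnumerationTokenizerSketch` and its `drain` helper are hypothetical, and only the ASCII letters a-z and A-Z are assumed to count as alphabetic.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for illustration only; it is not the Weka class.
class EnumerationTokenizerSketch implements Enumeration<String> {

    private char[] chars = new char[0];
    private int pos = 0;

    // Assumption: only ASCII letters count as alphabetic.
    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Sets the string to tokenize and rewinds to its start.
    void tokenize(String s) {
        chars = s.toCharArray();
        pos = 0;
    }

    // Skips separators; true if a letter is left to start a token.
    @Override
    public boolean hasMoreElements() {
        while (pos < chars.length && !isLetter(chars[pos])) {
            pos++;
        }
        return pos < chars.length;
    }

    // Returns the run of letters that begins at the next letter.
    @Override
    public String nextElement() {
        if (!hasMoreElements()) {
            throw new NoSuchElementException("no more tokens");
        }
        int start = pos;
        while (pos < chars.length && isLetter(chars[pos])) {
            pos++;
        }
        return new String(chars, start, pos - start);
    }

    // Hypothetical convenience helper: drains all tokens of s into a list.
    static List<String> drain(String s) {
        EnumerationTokenizerSketch t = new EnumerationTokenizerSketch();
        t.tokenize(s);
        List<String> out = new ArrayList<>();
        while (t.hasMoreElements()) {
            out.add(t.nextElement());
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints one token per line: Hello, world, times.
        for (String token : drain("Hello, world! 42 times")) {
            System.out.println(token);
        }
    }
}
```

The same loop shape (call tokenize once, then alternate hasMoreElements and nextElement until exhausted) should carry over to any Enumeration-based tokenizer, although the real class may differ in how it defines a letter.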