org.writersforge.catalan.text.extractors
Class TokenSplitter

java.lang.Object
  extended by org.writersforge.catalan.text.extractors.TokenSplitter
All Implemented Interfaces:
ITextExtractor

public class TokenSplitter
extends java.lang.Object
implements ITextExtractor

Text extractor that splits a document each time it encounters one of a fixed set of literal tokens.

Author:
jsheets

Constructor Summary
TokenSplitter(java.lang.String[] tokens, boolean keepDelimiters)
          Creates a new instance of TokenSplitter.
 
Method Summary
 java.lang.String[] extractText(java.lang.String text)
          Extracts fragments of text from the input text document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TokenSplitter

public TokenSplitter(java.lang.String[] tokens,
                     boolean keepDelimiters)
Creates a new instance of TokenSplitter.

Parameters:
tokens - literal tokens at which the document is split
keepDelimiters - true to include each delimiter in the output as a separate text node
Method Detail

extractText

public java.lang.String[] extractText(java.lang.String text)
Extracts fragments of text from the input text document.

This implementation splits the document at every occurrence of one of the configured tokens.

Specified by:
extractText in interface ITextExtractor
Parameters:
text - input text document
Returns:
extracted text fragments
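
The splitting behavior described above can be illustrated with a minimal, self-contained sketch. This is not the library's source: TokenSplitterSketch is a hypothetical stand-in that mirrors only the documented constructor and extractText signatures. The handling of empty fragments (dropped), overlapping tokens (longest match wins), and empty tokens (ignored) is assumed here and may differ from the real class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not the actual TokenSplitter source.
final class TokenSplitterSketch {
    private final String[] tokens;
    private final boolean keepDelimiters;

    TokenSplitterSketch(String[] tokens, boolean keepDelimiters) {
        this.tokens = tokens.clone();
        this.keepDelimiters = keepDelimiters;
    }

    String[] extractText(String text) {
        List<String> fragments = new ArrayList<>();
        int start = 0; // start of the fragment currently being collected
        int pos = 0;   // current scan position
        while (pos < text.length()) {
            String hit = matchAt(text, pos);
            if (hit == null) {
                pos++;
                continue;
            }
            // Assumption: empty fragments between adjacent tokens are dropped.
            if (pos > start) {
                fragments.add(text.substring(start, pos));
            }
            if (keepDelimiters) {
                fragments.add(hit); // delimiter becomes its own fragment
            }
            pos += hit.length();
            start = pos;
        }
        if (start < text.length()) {
            fragments.add(text.substring(start)); // trailing text after the last token
        }
        return fragments.toArray(new String[0]);
    }

    // Returns the longest non-empty token starting at pos, or null if none matches.
    // Assumption: longest match wins when several tokens match at the same position.
    private String matchAt(String text, int pos) {
        String best = null;
        for (String token : tokens) {
            if (!token.isEmpty() && text.startsWith(token, pos)
                    && (best == null || token.length() > best.length())) {
                best = token;
            }
        }
        return best;
    }
}
```

Under these assumptions, new TokenSplitterSketch(new String[] {"--"}, false).extractText("a--b--c") yields the fragments a, b, c. With keepDelimiters set to true, the same call yields a, --, b, --, c.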