org.writersforge.catalan.text.extractors
Class BraceMatchSplitter

java.lang.Object
  extended by org.writersforge.catalan.text.extractors.BraceMatchSplitter
All Implemented Interfaces:
ITextExtractor

public class BraceMatchSplitter
extends java.lang.Object
implements ITextExtractor

Text splitter that divides a text document according to matching pairs of open/close brace tokens. The splitter tracks nested braces so that each opening brace is paired with its proper closing brace. For example, given the text document "one (two (three) four) five)" and the braces "(" and ")", the splitter returns the fragments "one ", "(two (three) four)", and " five)". It does not stop at the too-early closing brace after "three", and it leaves the unmatchable closing brace after "five" inside the trailing fragment.
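
The nesting behavior described above can be illustrated with a minimal sketch. This is not the source of BraceMatchSplitter; the class name BraceSplitSketch and its static split method are hypothetical, and the handling of an opening brace that is never closed (left in the trailing fragment) is an assumption, since the description only covers an unmatched closing brace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the BraceMatchSplitter implementation.
public class BraceSplitSketch {

    /** Splits text into outermost brace groups and the text between them. */
    public static String[] split(String text, String open, String close) {
        if (open.isEmpty() || close.isEmpty()) {
            throw new IllegalArgumentException("brace tokens must be non-empty");
        }
        List<String> fragments = new ArrayList<>();
        int depth = 0;          // current nesting level
        int fragmentStart = 0;  // start of the text not yet emitted
        int groupStart = -1;    // index of the outermost open brace, if any
        int i = 0;
        while (i < text.length()) {
            if (depth > 0 && text.startsWith(close, i)) {
                depth--;
                i += close.length();
                if (depth == 0) {
                    // Outermost group closed: emit preceding text, then the group.
                    if (groupStart > fragmentStart) {
                        fragments.add(text.substring(fragmentStart, groupStart));
                    }
                    fragments.add(text.substring(groupStart, i));
                    fragmentStart = i;
                }
            } else if (text.startsWith(open, i)) {
                if (depth == 0) {
                    groupStart = i;
                }
                depth++;
                i += open.length();
            } else {
                // Plain character, or a close brace with nothing to match.
                i++;
            }
        }
        // Trailing text; by assumption this includes any group never closed.
        if (fragmentStart < text.length()) {
            fragments.add(text.substring(fragmentStart));
        }
        return fragments.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String fragment : split("one (two (three) four) five)", "(", ")")) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

The sketch keeps a single depth counter instead of a stack because only one kind of brace pair is in play; a fragment is emitted only when the counter returns to zero, which is why the inner "(three)" never ends the outer group.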

Author:
jsheets

Constructor Summary
BraceMatchSplitter()
          Creates a new instance of BraceMatchSplitter.
BraceMatchSplitter(java.lang.String openBrace, java.lang.String closeBrace)
          Creates a new instance of BraceMatchSplitter with custom braces.
 
Method Summary
 java.lang.String[] extractText(java.lang.String text)
          Extracts fragments of text from the input text document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BraceMatchSplitter

public BraceMatchSplitter()
Creates a new instance of BraceMatchSplitter.


BraceMatchSplitter

public BraceMatchSplitter(java.lang.String openBrace,
                          java.lang.String closeBrace)
Creates a new instance of BraceMatchSplitter with custom braces. Braces are typically parentheses, curly braces, or similar characters, but each token may be any string, including a multicharacter one.

Parameters:
openBrace - the starting brace token
closeBrace - the ending brace token

Method Detail

extractText

public java.lang.String[] extractText(java.lang.String text)
Extracts fragments of text from the input text document.

This implementation extracts fragments by brace matching, as set out in the class description: one fragment for each outermost matched brace group and one for each stretch of text outside such a group, in document order.

Specified by:
extractText in interface ITextExtractor
Parameters:
text - input text document
Returns:
extracted text fragments