java.lang.Object
  |
  +--org.writersforge.catalan.transform.Transformer
An XML-driven data transformer for converting data from one format to another. Performs a series of operations against a List of input data objects. The input data may change in content and size as it passes through the chain of transforms. Currently supports the following processors:
<transform>
Simple text search and replace. Replaces all occurrences of the 'oldtext' attribute in the input data with the value of the 'newtext' attribute. Corresponds to the
TextReplacer processor. The
example below will replace all hyphen characters with "X" characters. So
input data with three text nodes [ "a--ple", "------", "e-ample" ] will
become [ "aXXple", "XXXXXX", "eXample" ].
<replace oldtext='-' newtext='X'/>
Simple text search and replace with limited replacements. Only replaces the first
count occurrences in each node. The count
resets for each new node. The example below would convert the input data
[ "a--ple", "------", "e-ample" ] to [ "aXXple", "XX----", "eXample"
].
<replace oldtext='-' newtext='X' count='2'/>
Variable replacement. Replaces UNIX-style variables with static text. Variables take the form of "${variable}". The extra markup is part of the variable, and is removed when the variable is replaced. For example, the text node "The ${version} version of ${product}" would become "The 0.1.3 version of Catalan" when run through the transform below.
<lookup>
<var name='product' text='Catalan'/>
<var name='version' text='0.1.3'/>
</lookup>
Variable replacement with alternate markup. Replaces
custom-style variables. The start-token and
end-token attributes define the alternate markup for the
variables. For example, the text node "The @version@ version of
@product@" would become "The 0.1.3 version of Catalan" when run
through the transform below.
<lookup start-token='@' end-token='@'>
<var name='product' text='Catalan'/>
<var name='version' text='0.1.3'/>
</lookup>
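The two lookup variants above differ only in their start and end tokens, which a few lines of plain Java can illustrate. This is a minimal sketch, not Catalan's implementation: the class LookupSketch and its var() and apply() methods are hypothetical names, and it assumes that replacement text is never re-scanned for further variables.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the <lookup> processor; not part of Catalan. */
final class LookupSketch {
    private final String startToken;
    private final String endToken;
    private final Map<String, String> vars = new LinkedHashMap<String, String>();

    LookupSketch(String startToken, String endToken) {
        this.startToken = startToken;
        this.endToken = endToken;
    }

    /** Mirrors one <var name='...' text='...'/> element. */
    LookupSketch var(String name, String text) {
        vars.put(name, text);
        return this;
    }

    /** Replaces every start-token + name + end-token span with the variable's text. */
    String apply(String node) {
        String result = node;
        for (Map.Entry<String, String> entry : vars.entrySet()) {
            // String.replace is a literal replacement, so "${" needs no regex escaping.
            String marker = startToken + entry.getKey() + endToken;
            result = result.replace(marker, entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        LookupSketch unix = new LookupSketch("${", "}")
                .var("product", "Catalan").var("version", "0.1.3");
        System.out.println(unix.apply("The ${version} version of ${product}"));
    }
}
```

Passing "@" for both tokens reproduces the alternate-markup example, since the markers are matched as literal strings.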
Token splitter. Chops up input data by the given tokens. The
example below would convert the input data [ "one,two::three;", ":four:" ]
into [ "one", "two", "three", "four" ].
<tokenize>
<token>,</token>
<token>;</token>
<token>:</token>
</tokenize>
Token splitter with tokens. Chops up input data by the given
tokens and keeps the tokens in the output. The example below would
convert the input data [ "one,two::three;", ":four:" ] into [ "one", ",",
"two", ":", ":", "three", ";", ":", "four", ":" ].
<tokenize include-delimiters='yes'>
<token>,</token>
<token>;</token>
<token>:</token>
</tokenize>
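The splitting behaviour described for both <tokenize> variants can be sketched in plain Java. The class TokenSplitSketch and its split() method are hypothetical and only mirror the documented examples; how the real processor treats overlapping tokens or non-String nodes is not covered here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the <tokenize> processor; not part of Catalan. */
final class TokenSplitSketch {
    /** Splits each node on any of the tokens, optionally keeping the tokens in the output. */
    static List<String> split(List<String> nodes, List<String> tokens, boolean includeDelimiters) {
        List<String> out = new ArrayList<String>();
        for (String node : nodes) {
            StringBuilder current = new StringBuilder();
            int i = 0;
            while (i < node.length()) {
                String matched = null;
                for (String token : tokens) {
                    // Empty tokens are skipped so the scan always advances.
                    if (token.length() > 0 && node.startsWith(token, i)) {
                        matched = token;
                        break;
                    }
                }
                if (matched == null) {
                    current.append(node.charAt(i));
                    i++;
                } else {
                    flush(current, out);
                    if (includeDelimiters) {
                        out.add(matched);
                    }
                    i += matched.length();
                }
            }
            flush(current, out);
        }
        return out;
    }

    /** Emits the pending text as a node; empty pieces between adjacent tokens are dropped. */
    private static void flush(StringBuilder current, List<String> out) {
        if (current.length() > 0) {
            out.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("one,two::three;", ":four:");
        List<String> tokens = Arrays.asList(",", ";", ":");
        System.out.println(split(input, tokens, false));
        System.out.println(split(input, tokens, true));
    }
}
```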
Input data concatenator. Converts all input data nodes into
String form and concatenates them all together into a single String.
Converts Datum trees to XML with DatumWriter, and calls
String.valueOf() on everything else. Output from a simple
<concat> transform will always be a single node with String data. For
example, given input data of [ "one", new Integer(2), <three/>,
<four><five/></four> ], where the XML data is actually a Datum
tree, the transform below would result in literal String output of [
"one2<three/>\n<four>\n <five/>\n</four>\n" ]. The extra
whitespace is a by-product of DatumWriter.
<concat/>
Input data concatenator with limited node count. Converts input data nodes into String form until it reaches the specified node count or runs out of input data. Any unprocessed nodes are passed to the output untouched. For example, given the input data in the example above, the following example would create output of [ "one2<three/>\n", <four><five/></four> ]. The first three input data nodes are concatenated and the fourth node, the second Datum tree, is passed through as a non-Stringified Datum tree.
<concat count='3'/>
Whitespace normalizer. Converts all consecutive spans of whitespace into single space characters. By default, the space (" "), tab ("\t"), and return ("\n" and "\r") characters are considered to be whitespace. The string " \t\t white space \r\n " becomes " white space ".
<normalize/>
Normalizer with custom whitespace. Resolves all spans of custom tokens into single custom output characters. If any <token> elements are defined, the default whitespace tokens no longer apply. You can change the token the whitespace resolves to, with the
resolver attribute. In the example below, all consecutive
spans of space and tab characters will resolve to the '#' character. Thus,
the string " \t white space \t\n " would
become "#white#space#\n#".
<normalize resolver='#'>
<token> </token>
<token>\t</token>
</normalize>
Whitespace normalizer with custom exclusion areas. The exclusion
areas can be delimited by a single token, with the delim
attribute, or with different start and end tokens, using the
start-delim and end-delim attributes. The
transform below would convert the text
"--one--|---two--|--three----[--four-five---]--six" into
"XoneX|---two--|XthreeX[--four-five---]Xsix", normalizing anything not in
an exclusion area delimited by '|...|' or '[...]'.
<normalize resolver='X'>
<token>-</token>
<exclude delim='|'/>
<exclude start-delim='[' end-delim=']'/>
</normalize>
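The interaction between tokens, the resolver, and exclusion areas can be sketched as a single left-to-right scan. NormalizeSketch is a hypothetical class, restricted to single-character tokens and delimiters for brevity. It also assumes that an unterminated exclusion area runs to the end of the text, which the documentation above does not specify.

```java
/** Hypothetical stand-in for the <normalize> processor; not part of Catalan. */
final class NormalizeSketch {
    /**
     * @param tokens     characters treated as whitespace
     * @param resolver   character that each run of tokens collapses to
     * @param exclusions {start, end} delimiter pairs; text between them is copied verbatim
     */
    static String normalize(String text, String tokens, char resolver, char[][] exclusions) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int end = exclusionEnd(text, i, exclusions);
            if (end > i) {
                // Copy the whole exclusion area, delimiters included, untouched.
                out.append(text, i, end);
                i = end;
            } else if (tokens.indexOf(text.charAt(i)) >= 0) {
                // Collapse the whole run of token characters into one resolver character.
                out.append(resolver);
                while (i < text.length() && tokens.indexOf(text.charAt(i)) >= 0) {
                    i++;
                }
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** Returns the index just past the exclusion area opening at 'start', or -1 if none opens there. */
    private static int exclusionEnd(String text, int start, char[][] exclusions) {
        for (char[] pair : exclusions) {
            if (text.charAt(start) == pair[0]) {
                int close = text.indexOf(pair[1], start + 1);
                // Assumption: an unterminated area extends to the end of the text.
                return close < 0 ? text.length() : close + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[][] exclusions = { { '|', '|' }, { '[', ']' } };
        System.out.println(normalize(
                "--one--|---two--|--three----[--four-five---]--six", "-", 'X', exclusions));
    }
}
```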
Object to ASCII converter. This processor packs simple Java
objects into a packed ASCII data string according to the field
specification described in AsciiFieldManager. It does its best to
convert the input data objects into the field types in the spec. Any
input data that doesn't fit in the spec is passed through, untouched.
Input data that does not fit into its field will be clipped, which will
unfortunately result in data loss. Input data nodes are not consumed on
padding fields. For example, given the input data [ 12345, "one", "two",
"three", "four" ], the processor and field spec below would result in
output data of [ "1234  onetwothr   ", "four" ]. The default
padding for 'x' fields is the space character.
<to-ascii spec='4i 2x 3s[3] 3x'/>
Object to ASCII converter with custom padding. The padding attribute lets you change the default padding. The padding string is repeated across all padding fields. Given the input data from the previous example, the output data would be [ "1234ABonetwothrCDA", "four" ].
<to-ascii spec='4i 2x 3s[3] 3x' padding='ABCD'/>
ASCII to Object converter. This processor performs the same transformation as <to-ascii> except in reverse. The input data is one or more blocks of packed ASCII data, and the output is the set of exploded Java objects and arrays from all the input data. For example, given the input data [ "1234--onetwothrxxx" ], the output data would be [ 1234, new String[] { "one", "two", "thr" } ]. The padding "--" and "xxx" are ignored. The input data [ "1234..onetwothrABC" ] would result in exactly the same output data.
<from-ascii spec='4i 2x 3s[3] 3x'/>
Object to Datum converter. This processor channels input data nodes into an XML structure (actually a Bellows Datum tree), based on a push/pop stack of formatting directives. The <start-element/> element pulls the next input data node, converts it to a string, and uses that as the element name. The <end-element/> directive closes the current element. It's possible to nest elements to arbitrary depths. The <attribute/> element pulls the next two input nodes, using the first for the attribute name and the second for the attribute value. Finally, the <pcdata/> element appends the current input node to the PCDATA content of the current element. For example, the input data [ "one", "two", "three", "four", "five", "six" ] processed by the below transform would create a Datum tree corresponding to the XML: "<one three='four' five='six'>two</one>". If the input data contains more nodes than the <to-xml> transform uses, the extra nodes will be copied directly to the output, after the Datum tree.
<to-xml>
<start-element/>
<pcdata/>
<attribute/>
<attribute/>
<end-element/>
</to-xml>
Object to Datum converter with static content. By default, the
<to-xml> transform pulls all of its non-markup content from the input
data nodes. However, it is possible to override that content with static
text inside the transform. The element name can be set with the 'name'
attribute; the attribute content can be set with the 'name' and 'value'
attributes; and PCDATA content can be set by simply including it as PCDATA
in the <pcdata> element. Statically set values do not consume input
data. Thus, input data of [ "one", "two", "three", "four", "five", "six"
] processed with the below transform would result in output data of [
"<staticroot attr1='two' attr2='staticval'>one--three</staticroot>",
"four", "five", "six" ]. The unused input nodes are passed through to the
output.
<to-xml>
<start-element name='staticroot'/>
<pcdata/>
<attribute name='attr1'/>
<attribute name='attr2' value='staticval'/>
<pcdata>--</pcdata>
<pcdata/>
<end-element/>
</to-xml>
</transform>
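The way <to-xml> directives consume input nodes, and how static values skip that consumption, can be traced with a small stack-based sketch. ToXmlSketch and its Step class are hypothetical. The sketch renders a String rather than a Bellows Datum tree and does no XML escaping. A null field in a Step means "pull the value from the input", which is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the <to-xml> processor; not part of Catalan. */
final class ToXmlSketch {
    /** One directive. A null name or value is pulled from the input data instead. */
    static final class Step {
        final String kind;
        final String name;
        final String value;

        Step(String kind, String name, String value) {
            this.kind = kind;
            this.name = name;
            this.value = value;
        }
    }

    /** An element under construction: attributes and body are collected separately. */
    private static final class Element {
        final String name;
        final StringBuilder attrs = new StringBuilder();
        final StringBuilder body = new StringBuilder();

        Element(String name) {
            this.name = name;
        }

        String render() {
            return "<" + name + attrs + ">" + body + "</" + name + ">";
        }
    }

    static List<String> run(List<Step> steps, List<String> input) {
        Iterator<String> in = input.iterator();
        Deque<Element> stack = new ArrayDeque<Element>();
        List<String> out = new ArrayList<String>();
        for (Step s : steps) {
            if (s.kind.equals("start-element")) {
                stack.push(new Element(s.name != null ? s.name : in.next()));
            } else if (s.kind.equals("attribute")) {
                String name = s.name != null ? s.name : in.next();
                String value = s.value != null ? s.value : in.next();
                stack.peek().attrs.append(' ').append(name)
                        .append("='").append(value).append('\'');
            } else if (s.kind.equals("pcdata")) {
                stack.peek().body.append(s.value != null ? s.value : in.next());
            } else if (s.kind.equals("end-element")) {
                String xml = stack.pop().render();
                if (stack.isEmpty()) {
                    out.add(xml);                  // finished a root element
                } else {
                    stack.peek().body.append(xml); // nested element goes into its parent
                }
            }
        }
        // Unused input nodes are passed through after the generated XML.
        while (in.hasNext()) {
            out.add(in.next());
        }
        return out;
    }
}
```

Because static values never call in.next(), the second example above leaves "four", "five", and "six" unconsumed, and they appear after the generated element.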
Datum to Object converter. The <from-xml> processor
decomposes Datum XML trees into component Java objects. The
<query> elements select which parts of the XML document to operate on;
each <from-xml> processor can contain more than one query, and queries
can be nested inside of each other. Nested queries act upon the set of
Datum objects selected by the parent query, with a relative path. Inside
the query, commands select the content to place in the output. The
<property> command looks up the named XML attribute in all selected
elements. The <type> command places the element name in the output.
The <datum> command copies the Datum object itself into the output.
The <int>, <string>, and <float> commands place static nodes
into the output. Those commands in the example below would result in
new Integer(1), "two", and new
Double(3.3) output nodes.
<from-xml>
<query path='root/child'>
<property name='prop1'/>
<property name='prop2'/>
<int value='1'/>
<string value='two'/>
<float value='3.3'/>
<query path='child/*[@use=yes]'>
<type/>
<property name='id'/>
</query>
</query>
<query path='root/child[2]'>
<datum/>
</query>
</from-xml>
XML to JavaBean converter. Maps an XML document into a JavaBean
instance. Requires two input nodes: the JavaBean class, as either a String
or a Class instance; and a Datum tree. The converter will do its best to
recursively load the XML data into the JavaBean, matching element and
attribute names to JavaBean properties. Supports many different naming
styles, e.g., "my-bean", "my_bean", "MY-BEAN", "MY_BEAN", "MyBean", and
"myBean". Also searches for primitive properties as both child elements
and attributes.
<xml-to-bean/>
JavaBean to XML converter. Generates an XML Datum tree from a JavaBean instance, converting JavaBean properties into XML-style element names, e.g., "my-bean", "my-bean-property". By default, creates all properties as nested child elements.
<bean-to-xml/>
JavaBean to XML converter with collapsed attributes. Generates an XML Datum tree from a JavaBean instance, storing all primitive JavaBean properties as attributes instead of elements. Still creates child elements for complex properties like nested JavaBeans.
<bean-to-xml collapse='true'/>
JavaBean to XML converter with alternate naming styles. Generates an XML Datum tree from a JavaBean instance, using different element and/or attribute (if collapse='true') naming styles. The examples below represent the styles "my-bean", "my-bean", "my_bean", "MY-BEAN", "MY_BEAN", "MyBean", and "myBean", respectively.
<bean-to-xml naming-style='default'/>
<bean-to-xml naming-style='lower-hyphen'/>
<bean-to-xml naming-style='lower-underscore'/>
<bean-to-xml naming-style='upper-hyphen'/>
<bean-to-xml naming-style='upper-underscore'/>
<bean-to-xml naming-style='case-delim'/>
<bean-to-xml naming-style='javabean'/>
Composite processors. This is an organizational wrapper which lets you group multiple processors into a single unit. Groups can be nested to arbitrary depths. Plain groups work on all types of data content.
<group>
<concat/>
<normalize/>
</group>
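A group simply feeds the output list of each member processor into the next one. The sketch below shows that chaining with a hypothetical Processor interface and simplified stand-ins for <concat/> and <normalize/>. None of these names are Catalan's, and the stand-ins handle only the String case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of <group> chaining; not part of Catalan. */
final class GroupSketch {
    /** Assumed processor contract: a list of nodes in, a list of nodes out. */
    interface Processor {
        List<Object> process(List<Object> nodes);
    }

    /** Simplified <concat/>: String.valueOf() of every node, joined into one node. */
    static Processor concat() {
        return nodes -> {
            StringBuilder joined = new StringBuilder();
            for (Object node : nodes) {
                joined.append(String.valueOf(node));
            }
            List<Object> out = new ArrayList<Object>();
            out.add(joined.toString());
            return out;
        };
    }

    /** Simplified <normalize/>: collapses runs of space, tab, and return characters. */
    static Processor normalize() {
        return nodes -> {
            List<Object> out = new ArrayList<Object>();
            for (Object node : nodes) {
                out.add(node instanceof String
                        ? ((String) node).replaceAll("[ \\t\\n\\r]+", " ")
                        : node);
            }
            return out;
        };
    }

    /** Runs the members in order; a group is itself a Processor, so groups can nest. */
    static Processor group(final Processor... members) {
        return nodes -> {
            List<Object> current = nodes;
            for (Processor member : members) {
                current = member.process(current);
            }
            return current;
        };
    }

    public static void main(String[] args) {
        List<Object> input = Arrays.<Object>asList("one \t", 2, "\n three");
        System.out.println(group(concat(), normalize()).process(input));
    }
}
```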
Composite processors with XML filter. This is a composite
processor which lets you select a subset of an XML tree to operate on.
<group select='root/child'>
<xform-rename new-name='new-root'/>
<group select='child/*'>
<xform-rename new-name='grandchild'/>
</group>
</group>
XML processor to insert static element content. Evaluates the
select query and creates a copy of the nested content in each of the
matched query nodes. The example below would place a copy of the full
<newContent> element, including any attributes, into all <child>
elements immediately inside the base <root> element of the input tree.
As with all xform processors, all non-Datum content is passed
through to the output untouched.
<xform-insert select='root/child'>
<newContent>
<subContent1/>
<subContent2 prop='value'/>
</newContent>
</xform-insert>
XML processor to insert static attribute content. Creates static
attributes in all matched query nodes. This example would create a
'newProp' attribute with the value of 'newValue' on all selected <child>
elements.
<xform-insert select='root/child'
attribute="newProp" value="newValue"/>
XML processor to delete element content. Permanently deletes
all selected nodes. This example removes all <child> elements inside the
<root> element.
<xform-delete select='root/child'/>
XML processor to delete attribute content. Permanently deletes the named attribute from all selected elements. The transform below removes the 'origProp' attribute from all selected <child> elements.
<xform-delete select='root/child'
attribute='origProp'/>
XML processor to duplicate element content. Creates a new copy of
the selected elements at each element that the 'dest' query selects.
This example would make copies of all selected <child> elements
and put them into each selected <to> element. If more than
one destination node is selected, the processor will make multiple copies
of the same source elements. The transform does not alter the original
content, e.g., the 'root/child' nodes below.
<xform-copy select='root/child' dest='root/to'/>
XML processor to duplicate attribute content. Collects all matched attributes and places all of them in each destination node, similar to the element copy processor. If more than one attribute is copied into the same destination node, the second and later attributes are mangled to keep the attribute names unique, by appending numbers to the duplicated attributes. Thus, if the example below matches three 'origProp' attributes in the selected <child> elements, the processor will create the attributes 'origProp', 'origProp2', and 'origProp3' in each destination element.
<xform-copy select='root/from/child' dest='root/to'
attribute='origProp'/>
XML processor to move element content. Moves element content
to other parts of the XML tree. Behaves exactly like the copy processor,
except it deletes all the source nodes. If the destination selects more
than one node, the source nodes will be copied separately to each destination
node.
<xform-move select='root/from/child' dest='root/to'/>
XML processor to move attribute content. Moves attributes to other elements in the XML tree. Behaves exactly like the copy processor, except it deletes all the source attributes. If the destination selects more than one node, the attributes will be copied separately to each destination node, with any necessary attribute name mangling.
<xform-move select='root/from/child' dest='root/to'
attribute='origProp'/>
XML processor to rename elements. Renames all selected elements
to the new name. The example below would rename all selected <child>
elements to <newChild>.
<xform-rename select='root/child' new-name='newChild'/>
XML processor to rename attributes. Renames the specified attribute in all selected elements. The example below would rename all 'oldProp' attributes in the selected <child> elements to 'newProp'.
<xform-rename select='root/child'
attribute='oldProp' new-name='newProp'/>
XML processor to increase element nesting. Wraps the selected
elements with a newly created wrapper element. In the example below,
the processor would place all selected <child> elements
into <child-wrap> elements, without losing their place in the <root>
element. Thus, after the transform, the same <child> elements could
be selected with a query of 'root/child-wrap/child'.
<xform-wrap select='root/child' wrapper='child-wrap'/>
XML processor to decrease element nesting. Removes all selected elements without deleting the child content of those elements. Essentially a non-recursive delete. The inline transform is the opposite of the wrap transform. All inlined content is inserted in place; if an inlined element has more than one child, all children will be inserted into the parent where the former inlined element was. This may offset the index counts of later elements. All attributes in the inlined elements are lost. Thus, the transform below would convert the sample input data into the sample output data below:
<xform-inline select='root/child'/>
INPUT:
<root>
<child>
<grandchild1/>
<grandchild2/>
</child>
<other/>
<child>
<grandchild3/>
</child>
</root>
OUTPUT:
<root>
<grandchild1/>
<grandchild2/>
<other/>
<grandchild3/>
</root>
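The in-place behaviour of the inline transform can be reproduced with the JDK's own org.w3c.dom API. This is only a sketch: Catalan operates on Bellows Datum trees, not DOM nodes, and the class InlineSketch and its methods are hypothetical. The hard-coded "direct children named childName" match stands in for the 'root/child' select query.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical DOM-based illustration of <xform-inline>; not part of Catalan. */
final class InlineSketch {
    /** Parses the XML text and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Replaces each direct child of root named childName with that child's own children. */
    static void inline(Element root, String childName) {
        // Collect the matches first so the tree is not edited while it is being walked.
        List<Element> matches = new ArrayList<Element>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && childName.equals(n.getNodeName())) {
                matches.add((Element) n);
            }
        }
        for (Element match : matches) {
            while (match.getFirstChild() != null) {
                // insertBefore moves the node out of 'match' and into 'root', keeping order.
                root.insertBefore(match.getFirstChild(), match);
            }
            // The emptied element is dropped, and its attributes are lost with it.
            root.removeChild(match);
        }
    }

    /** Lists the element children of root, in document order. */
    static List<String> childNames(Element root) {
        List<String> names = new ArrayList<String>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element root = parse("<root><child><grandchild1/><grandchild2/></child>"
                + "<other/><child><grandchild3/></child></root>");
        inline(root, "child");
        System.out.println(childNames(root));
    }
}
```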
XML processor to convert attributes into elements. Converts
attributes into PCDATA elements. For each of the selected nodes, the
processor will move the requested attribute into a child element, placing
the content into PCDATA inside the element. In the example below, an
element "<child prop='value'/>" would become
"<child><prop>value</prop></child>". Elements without
the attribute will not be altered.
<xform-to-element select='root/child' attribute='prop'/>
XML processor to convert element PCDATA content into attributes. The reverse of <xform-to-element>, this processor converts PCDATA elements into attributes on the parent element. First it extracts all PCDATA from all selected elements and appends it together into a single string, then assigns it to the named attribute. The entire content of all nodes becomes one attribute, and any attributes in the selected nodes are lost. The transform below would convert the input data into the output data shown below.
<xform-to-attribute select='root/child' attribute='prop'/>
INPUT:
<root>
<child child-prop='child-prop-value'>CHILD1</child>
<notChild>NOT-CHILD</notChild>
<child>CHILD2</child>
</root>
OUTPUT:
<root prop='CHILD1CHILD2'>
<notChild>NOT-CHILD</notChild>
</root>
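The same conversion can be sketched against the JDK's org.w3c.dom API to show how the PCDATA of all selected elements ends up in one attribute. ToAttributeSketch is a hypothetical class. Catalan works on Datum trees, and the fixed "direct children named childName" match stands in for the select query.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical DOM-based illustration of <xform-to-attribute>; not part of Catalan. */
final class ToAttributeSketch {
    /** Parses the XML text and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Folds the text of every direct child named childName into one attribute on root. */
    static Element toAttribute(String xml, String childName, String attribute) {
        Element root = parse(xml);
        List<Element> matches = new ArrayList<Element>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && childName.equals(n.getNodeName())) {
                matches.add((Element) n);
            }
        }
        StringBuilder text = new StringBuilder();
        for (Element match : matches) {
            text.append(match.getTextContent());
            // Removing the element also discards any attributes it carried.
            root.removeChild(match);
        }
        if (!matches.isEmpty()) {
            root.setAttribute(attribute, text.toString());
        }
        return root;
    }

    public static void main(String[] args) {
        Element root = toAttribute(
                "<root><child child-prop='child-prop-value'>CHILD1</child>"
                + "<notChild>NOT-CHILD</notChild><child>CHILD2</child></root>",
                "child", "prop");
        System.out.println(root.getAttribute("prop"));
    }
}
```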
XML processor to change naming styles.
Recursively converts the selected elements and their attributes
into the requested naming style. Uses the same styles as the
<bean-to-xml> transform above. An optional select parameter
specifies which branches to convert; if the select query is omitted, the
processor will convert the entire tree.
<xform-style new-style='javabean' select='root/child'/>
XML processor to collapse simple PCDATA elements into attributes. Recursively converts all elements which contain only PCDATA into attributes of the same name in the parent element. An optional select parameter specifies which branches to convert; if the select query is omitted, the processor will convert the entire tree. Elements which contain other element content will not be converted. Given the transform below, the XML "<root><child>CHILD-DATA</child></root>" would become "<root child='CHILD-DATA'/>".
<xform-style new-style='collapsed' select='root/child'/>
XML processor to expand attributes into PCDATA elements. The reverse of the collapse transform. Given the transform below, the XML "<root child='CHILD-DATA'/>" would become "<root><child>CHILD-DATA</child></root>".
<xform-style new-style='expanded'
select='root/child'/>
User-defined processors. If the above processors aren't enough,
or if a custom processor would greatly simplify the transformation
process, you can implement a processor of your own and invoke it anywhere
in the transform. The processor must derive from CustomNodeProcessor. The <custom> processor XML itself is passed to
the custom processor, which can use any attributes or sub-elements inside
<custom> to initialize itself.
<custom class='org.mypackage.MyCustomProcessor' param='value1'>
<param value='value2'/>
<param value='value3'/>
</custom>
</transform>
Constructor Summary
Transformer(org.writersforge.bellows.Datum transform)
Creates a new instance of Transformer, loading a collection of NodeProcessor implementations based on the specification in the supplied Datum tree.
Method Summary
java.lang.String[] getProcessorIds()
Retrieves an array of unique identifiers for all NodeProcessors in this Transformer.
java.util.List process(java.lang.String[] ids, java.util.List nodes)
Runs the List of input nodes through a set of processors, by id.
java.util.List processAll(java.util.List nodes)
Runs the List of input nodes through every processor in the Transformer's XML specification, in order.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public Transformer(org.writersforge.bellows.Datum transform)
transform - the XML specification
Method Detail
public java.util.List processAll(java.util.List nodes)
nodes - input data nodes
public java.lang.String[] getProcessorIds()
<transform>
<replace old='oldtext1' new='newtext1'/>
<replace id='textreplace' old='oldtext2' new='newtext2'/>
<replace old='oldtext3' new='newtext3'/>
</transform>
The returned ids would be:
If any processors share the same explicit id, the second and later duplicates will all be treated as if they had no explicit id. Thus, this XML specification:
<transform>
<replace id='textreplace' old='oldtext1' new='newtext1'/>
<replace id='textreplace' old='oldtext2' new='newtext2'/>
<replace id='textreplace' old='oldtext3' new='newtext3'/>
</transform>
would result in the following ids:
public java.util.List process(java.lang.String[] ids,
java.util.List nodes)
Runs the processors named in the ids array, even if that means the same id
is run more than once.
ids - the ids of the transforms to run
nodes - input data nodes
java.lang.IllegalArgumentException - if an id in ids does not
exist in the XML specification
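The contract of process(ids, nodes) can be illustrated with a small id-to-processor registry: repeated ids run repeatedly, and an unknown id is rejected. ProcessByIdSketch, its Processor interface, and the register() and suffix() helpers are hypothetical, not Catalan's classes. Running the ids in array order and validating every id before any processor runs are both assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of id-based dispatch; not part of Catalan. */
final class ProcessByIdSketch {
    /** Assumed processor contract: a list of nodes in, a list of nodes out. */
    interface Processor {
        List<Object> process(List<Object> nodes);
    }

    // LinkedHashMap keeps the ids in registration order.
    private final Map<String, Processor> byId = new LinkedHashMap<String, Processor>();

    ProcessByIdSketch register(String id, Processor processor) {
        byId.put(id, processor);
        return this;
    }

    String[] getProcessorIds() {
        return byId.keySet().toArray(new String[0]);
    }

    List<Object> process(String[] ids, List<Object> nodes) {
        // Assumption: every id is checked up front, so a bad id fails before any work is done.
        for (String id : ids) {
            if (!byId.containsKey(id)) {
                throw new IllegalArgumentException("Unknown processor id: " + id);
            }
        }
        List<Object> current = nodes;
        for (String id : ids) {
            // The same id may appear more than once; it simply runs again.
            current = byId.get(id).process(current);
        }
        return current;
    }

    /** Example processor that appends a suffix to every String node. */
    static Processor suffix(final String suffix) {
        return nodes -> {
            List<Object> out = new ArrayList<Object>();
            for (Object node : nodes) {
                out.add(node instanceof String ? node + suffix : node);
            }
            return out;
        };
    }

    public static void main(String[] args) {
        ProcessByIdSketch sketch = new ProcessByIdSketch().register("exclaim", suffix("!"));
        List<Object> input = Arrays.<Object>asList("hi");
        System.out.println(sketch.process(new String[] { "exclaim", "exclaim" }, input));
    }
}
```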