Wednesday, December 5, 2012

Oracle Endeca Information Discovery – Partition Transformation


I have been working with the Oracle Endeca Information Discovery tool to join structured and unstructured data together in interesting business solutions. I wanted to share my experience with the Partition Transformation and show how it can be used to conditionally split a pipeline in an Integrator (CloverETL) graph.


The transformation has two properties called Ranges and Partition key. Setting the Partition key is easy: just pick the fields from a list. Setting the Ranges is not so easy to understand.


See the table below for an example of setting the Ranges property. Setting ranges requires an awkward syntax: < and > mark a bound that is included in the range, while ( and ) mark a bound that is excluded. A comma separates the low end and high end of a range, and a semicolon separates one range from the next. Sounds confusing? It is. Fortunately, there is a much easier way using CTL2.
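To make the syntax concrete, here is a small sketch of what a Ranges value could look like, assuming a single numeric partition key (the field name PRICE and the boundary values are hypothetical, not taken from my graph):

```
<0,100);<100,500>
```

Read left to right, the first range covers values from 0 (included) up to 100 (excluded), and the second covers 100 through 500 (both included). As I understand it, each range maps to an output port in order, so the first range would feed port 0 and the second port 1. Check the component documentation for your Integrator version before relying on this.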


For the easy way, click the Partition attribute, which opens the CTL2 editor. You can refer to your input fields with a $ prefix. For example, $DOC_TYPE is an input field that was defined on the input port. Then write a conditional statement. In my example below, all rows where $DOC_TYPE = “Operation Instructions” will be sent to port 0. All others will be sent to port 1.
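The conditional described above can be sketched in CTL2 roughly as follows. This is a minimal sketch, assuming the Partition component's standard getOutputPort() entry point and the short $FIELD form of field access. Newer CloverETL versions may expect $in.0.DOC_TYPE instead.

```
//#CTL2

// Called once per input record; the returned integer is the
// number of the output port the record is sent to.
function integer getOutputPort() {
    // Rows whose DOC_TYPE is "Operation Instructions" go to port 0.
    if ($DOC_TYPE == "Operation Instructions") {
        return 0;
    }
    // Every other row goes to port 1.
    return 1;
}
```

Note that the comparison inside the function uses ==, and that the component needs at least two connected output ports (0 and 1) for this split to work.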