Sunday, December 16, 2012

Caching in Teiid

Teiid has several ways to improve performance by caching the data. The simplest approach is to provide a caching hint right in the SQL statement. The query result below will be cached for 1 minute, and stored preferably in memory. The subsequent selects using the same query return the cached results for 1 minute. It is important that query should be "string-equal" to retrieve the cached result. Teiid does not parse the query for performance considerations, and even a single extra space added to the query can force Teiid to get for a fresh result instead of cache. Also the same behavior happens if functions like now() used - the results are not cached.

 /*+ cache(pref_mem ttl:60000) */select id, value from test.table1;  
  id value  
  --- -----  
  1 80  
  2 20  
 update test.table1 set value = 10 where id = 2;  
 /*+ cache(pref_mem ttl:60000) */select id, value from test.table1;  
  id value  
  --- -----  
  1 80  
  2 20  
 select id, value from test.table1;  
  id value  
  --- -----  
  1 80  
  2 10  

Another approach is to use materialized views, or (simpler approach) a "lookup" function. This function useful to create a system-wide cache on some dictionary and perform a search. The results are cached until Teiid restarted, so it might not be convenient to keep a changing data.

 select lookup('test.table1', 'value', 'id', 1);  
 update test.table1 set value = 50 where id = 1;  
 select lookup('test.table1', 'value', 'id', 1);  

And the most flexible way to operate a cache is to use a custom translator with support of Caching API.
This is the overridden method from CacheTestExecutionFactory which defines the caching approach for translator:

   public CacheDirective getCacheDirective(Command command,  
       ExecutionContext executionContext, RuntimeMetadata metadata)  
       throws TranslatorException {  
     CacheDirective cd = new CacheDirective();  
     return cd;  

For complete source code of CacheTestExecutionFactory teiid translator see caching-api project on GitHub ( ). Also the Teiid user forum has a related discussion thread.

Sunday, September 30, 2012

Free Private Git Hosting

Today I have spent some time looking for Git hosting for my new pet project. Last time I did such kind of research about three years ago. I was looking for SVN hosting, and I ended up with Unfuddle.
Currently they do not offer a free option any more, and my interests are shifted to Git. My current requirements are very simple:
  • functional free starter option, with no-hassle switch to payed services when needed;
  • both ssh and https access to repository;
  • tracking/wiki option (low priority);
  • private hosting - no requirement to open-source your code like on github;
This is my shortlist:
I had to drop both ProjectLocker and  Springloops because of no https support for Git (only ssh is available). I open accounts on both Assembla and CloudForge. Both of them do not have Trac or any other bugtracking/wiki option for free, but it is not critical on this stage. Registration on Assembla was smooth, in a couple of minutes I committed and pushed a sample project and was able to see it from web ui. Another system (CloudForge) had some glitch during registration: the confirmation email came into my mailbox with about 15-hour delay. During this period the admin module has been annoying me about unfinished registration every time I logged in, asking to provide a confirmation number in a special input field (otherwise my account will be disabled in 24 hours). When email has finally arrived, there was no number, only a registration link to click.
I decided to stay with Assembla for now, but keep an eye on CloudForge in the same time. I like their Publisher option - ability to configure your deployment for Joyent, Amazon WS, Salesforce, Google App Engine, CloudFoundry of custom FTP or SSH option. I had not tried it yet, but from description in sounds useful and time-saving.

Update (next day): I've tried BitBucket this morning, and I think it fits pretty well into all my requirements: free, private, https, wiki/issues tracking, web access, integration with other systems. It allows to get more than one user account to the repository, which also might be useful. I'll stay with BitBucket for now.

Wednesday, September 12, 2012

Teiid and HornetQ Integration

In my previous article "Simpe JMS Provider on JBoss Teiid" you can learn about how to create a Teiid translator which will "convert" your SQL commands into messages and put them into queue. This article talks about opposite approach: how to retrieve messages from queue using SQL SELECT statements, so our integration pattern will be complete. The schema below explains the integration logic.
Our goal is to pass a specific command (CALL... or INSERT...) to Teiid with some parameters. This parameters will be converted in Teiid by our custom translator (JMS Producer) into the message, which will be immediately placed into queue. New player is another custom translator (JMS Consumer). Client can issue a Continuous Execution to Teiid, which will automatically repeat itself with some time interval. Each round of execution will retrieve messages from queue and pass back to client as execution results. Client gets notified by Teiid when result is available - client obtains a callback object from Teiid when the execution initiated. The StatementCallback.onRow() method gets executed when data is passed back to client.
The process can be further improved, if instead of retrieving messages periodically we will listen queue and retrieve message back to client right after it will appear. So instead of polling we will implement pushing approach.

Implementation Details

The custom translator code for JMS Consumer should implement MessageListener interface. The provided onMessage(Message msg) method will be called each time the message will appear on subscribed queue. Translator also implements ResultSetExecution (so it will appear as a view in Teiid Virtual Database), and ReusableExecution interfaces. The DataNotAvailableException.NO_POLLING will be thrown from the translator next() method, to put translator "into sleep". This is a non-blocking process and part of Continuous Execution functionality. To bring translator back to life, we will call ExecutionContext.dataAvailable() from onMessage() method.
The complete code of JMS Consumer Translator ( is attached, and below are the most important methods:
      public void execute() throws TranslatorException {  
           this.callNext = DataNotAvailableException.NO_POLLING;  
      public List<?> next() throws TranslatorException, DataNotAvailableException {  
           if (callNext != null) {  
                DataNotAvailableException e = callNext;  
                callNext = null;  
                throw e;  
           if (msgRow != null) {  
                List<Object> results = msgRow;  
                msgRow = null;  
                return results;  
           return null;  
      public void dispose() {  
           try {  
           } catch (JMSException e) {  
                // TODO Auto-generated catch block  
      public void onMessage(Message msg) {  
           try {  
                String messageContent;  
                if (msg instanceof TextMessage) {  
                     messageContent = ((TextMessage) msg).getText();  
                } else {  
                     messageContent = msg.toString();  
                System.out.println("!!!!!! MSG:" + messageContent);  
           } catch (JMSException jmse) {  

To make sure it works I used a JBoss with HornetQ configuration from the previous article, and made a sample java servlet which connects to Teiid and executes a SELECT statement continuously. Also I have another SQL client application (SQuirreL or Eclipse Database Explorer) connected to Teiid, which I use to produce messages (CALL JMS Producer, like in my previous article).
Testing approach:
  • start JBoss/HornetQ/Teiid with deployed translator jar module and war file with servlet;
  • open browser, call servlet to issue a continuous execution;
  • run call statement from SQL client to send a message;
  • check servlet logs that message successfully retrieved by translator and passed back to servlet;
 MessageReceiveExecution (Google Docs).

Monday, September 3, 2012

Simple JMS Producer on JBoss Teiid

With it's Continuous Execution feature, new Teiid 8.1 came closer to functionality which reminds me SQL Server Service Broker. In particular, it might be possible to write Teiid extensions which can serve as integration points between relational data from VDB and asynchronous data available in message-oriented middleware.
For example, Teiid translator can be a JMS client which can send messages into some queue. Another Teiid translator running in Continuous Execution mode can retrieve messages from queue.
Below is results of my experiments with first approach: to have a stored procedure which will act as JMS producer.

Configure JMS in JBoss

The  JBoss 7.1.1 has built-in JMS support with HornetQ, so it is only a matter of properly editing config files. The HornetQ stuff included in standalone-full.xml profile, and Teiid configuration is in standalone-teiid.xml, so I took simplest approach and merged all JMS-related parts from -full.xml into *-teiid.xml one. Also I defined a test queue "MyQueue":
      <address-setting match="jms.queue.MyQueue">  
    <jms-queue name="MyQueue">  
      <entry name="java:jboss/exported/MyQueue"/>  

It is worth to write a small producer and consumer java classes to test the queue and make sure all configured correctly. The JBoss Management console and this post might be helpful.

Write Teiid Translator

I made a simple translator which is a stored procedure, implements the ProcedureExecution and acts as a JMS producer. Queue name and message passed as stored procedure IN parameters.

 package com.rokhmanov.test.teiid.translator.async;  
 import java.util.ArrayList;  
 import java.util.List;  
 import javax.jms.JMSException;  
 import javax.jms.Queue;  
 import javax.jms.QueueConnection;  
 import javax.jms.QueueConnectionFactory;  
 import javax.jms.QueueSender;  
 import javax.jms.QueueSession;  
 import javax.jms.Session;  
 import javax.jms.TextMessage;  
 import javax.naming.Context;  
 import javax.naming.InitialContext;  
 import javax.naming.NamingException;  
 import org.teiid.language.Argument;  
 import org.teiid.language.Call;  
 import org.teiid.translator.DataNotAvailableException;  
 import org.teiid.translator.ExecutionContext;  
 import org.teiid.translator.ProcedureExecution;  
 import org.teiid.translator.TranslatorException;  
 public class MessageSendExecution implements ProcedureExecution {  
      public final static String CNN_FACTORY="/ConnectionFactory";  
      private String queueName;  
      private String messageContent;  
      private Call cmd;  
      private QueueConnectionFactory qconFactory;  
      private QueueConnection qcon;  
      private QueueSession qsession;  
      private QueueSender qsender;  
      private Queue queue;  
      private TextMessage msg;  
      public MessageSendExecution(Call command,  
                ExecutionContext executionContext, Object connection) {  
           this.cmd = command;  
     final List<Argument> procArguments = new ArrayList<Argument>(this.cmd.getArguments());  
     this.queueName = procArguments.get(0).getArgumentValue().getValue().toString();  
     this.messageContent = procArguments.get(1).getArgumentValue().getValue().toString();   
      public List<?> next() throws TranslatorException, DataNotAvailableException {  
           return null;  
      public void close() {  
           // TODO Auto-generated method stub  
      public void cancel() throws TranslatorException {  
           try {  
           } catch (JMSException e) {  
                throw new TranslatorException(e);  
      public void execute() throws TranslatorException {            
           InitialContext ic = getInitialContext();  
           initJMS(ic, queueName);  
      public List<?> getOutputParameterValues() throws TranslatorException {  
           return null;  
      private InitialContext getInitialContext() throws TranslatorException  
           try {  
                return new InitialContext();  
           } catch (NamingException e) {  
                throw new TranslatorException(e);  
      public void initJMS(Context ctx, String queueName) throws TranslatorException  
           try {  
                qconFactory = (QueueConnectionFactory) ctx.lookup(CNN_FACTORY);  
                qcon = qconFactory.createQueueConnection();  
                qsession = qcon.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);  
                queue = (Queue) ctx.lookup(queueName);  
                qsender = qsession.createSender(queue);  
                msg = qsession.createTextMessage();  
           } catch (NamingException e) {  
                throw new TranslatorException(e);  
           } catch (JMSException e) {  
                throw new TranslatorException(e);  
      private void sendMsg(String userMessage) throws TranslatorException  
           try {  
           } catch (JMSException e) {  
                throw new TranslatorException(e);  

This is the important parts from the Execution Factory:

      public ProcedureExecution createProcedureExecution(Call command,  
                ExecutionContext executionContext, RuntimeMetadata metadata,  
                Object connection) throws TranslatorException   
           return new MessageSendExecution(command, executionContext, connection);  
      public void getMetadata(MetadataFactory metadataFactory, Object conn)  
                throws TranslatorException   
     final Procedure messageSender = metadataFactory.addProcedure("messageSender");  
     metadataFactory.addProcedureParameter("queue", TypeFacility.RUNTIME_NAMES.STRING, Type.In, messageSender);  
     metadataFactory.addProcedureParameter("message", TypeFacility.RUNTIME_NAMES.STRING, Type.In, messageSender);  

To post the message, use this SQL statement:

 call Times.messageSender('java:jboss/exported/MyQueue', 'My Message');  

Then monitor JBoss Management console or use your sample java JMS client to retrieve a messsage from queue:

 Sep 03, 2012 10:34:45 PM org.xnio.Xnio <clinit>  
 INFO: XNIO Version 3.0.3.GA  
 Sep 03, 2012 10:34:45 PM org.xnio.nio.NioXnio <clinit>  
 INFO: XNIO NIO Implementation Version 3.0.3.GA  
 Sep 03, 2012 10:34:45 PM org.jboss.remoting3.EndpointImpl <clinit>  
 INFO: JBoss Remoting version 3.2.3.GA  
 JMS Ready To Receive Messages (To quit, send a "quit" message from QueueSender.class).  
       My Message  

Note: this translator code is not optimized for performance or thread-safety, I wrote it just for illustration purposes. Areas of potential improvement:
  • implement UpdateExecution to make a translator handle INSERT SQL statement (more intuitive approach);
  • reuse connection to Queue;

Friday, August 17, 2012

Embedded Teiid

When developing a new Teiid translators, I found very convenient to use a new 8.1 feature: Embedded Teiid. Basically this is a working Teiid server, which does not require JBoss AS as a host platform. Of course some functionality provided by Application Server has to be added manually (like transactions, clustering or data connection pooling), but in many cases it is unnecessary, especially when you need just a simple federated database or have to quickly test your new translator.

Basic functionality can be unit-tested with frameworks like Mockito. But only integration test can give you a full picture. For integration tests I have to run a JBoss and Teiid instance as Cargo container with translator jars deployed - basically a mini-scaled copy of our application. This is a fragile and slow way - you have to emulate the environment on which your application is running, provide a proper configs and data sources, make sure the integration tests executed as part of your local build. Better if you have a flexibility of unit tests and completeness of integration tests combined together.

Below is a sample code how to instantiate an Embedded server:
 EmbeddedConfiguration ec = new EmbeddedConfiguration();  
 server = new EmbeddedServer();  
 final JdbcDataSource h2ds = new JdbcDataSource();  
 EmbeddedServer.ConnectionFactoryProvider<DataSource> jdbcProvider =   
 new EmbeddedServer.ConnectionFactoryProvider<DataSource>()   
      public DataSource getConnectionFactory() throws TranslatorException   
           return h2ds;  
 server.addConnectionFactoryProvider("source-jdbc", jdbcProvider);  
 server.addTranslator(new H2ExecutionFactory());  
 ModelMetaData jdbcModel = new ModelMetaData();  
 jdbcModel.addSourceMapping("h2-connector", "h2", "source-jdbc");       
 teiidServer.deployVDB("example", jdbcModel);  

Create server, define model and data sources, deploy VDB - in a less then twenty lines of code. Now you can connect and run your queries:
 TeiidDriver td = server.getDriver();  
 Connection c = td.connect("jdbc:teiid:example", null);  
 final String sql = "select * from Product";            
 List<String> results = execute(c, sql, true);  

Really, cannot be simpler. The complete listing of unit-test class attached for reference. Also take a look on "embedded-portfolio" example from Teiid 8.1 sources, my post was greatly inspired by it.

Tuesday, August 7, 2012

My Experience with OrientDB JDBC

I did some graph DB research and POC about six month ago for a "configuration management" task. The short description: the system might have a different set of modules, each consist from a bigger set of different parts. Almost all parts has different types of relations inside one module and across them. The task is to easily store this data, and have some complex search and retrieval logic. A typical graph database task. My weapon of choice at that point was OrientDB.
My first results were quite good - I was able to quickly import data from our Teiid relational federated database and build a graph with 322591 vertices and 322636 edges. The complete load process took about 40 minutes, mostly because slowness on our side. Our federated database consist from many different parts: fast and small H2 tables, custom data translators to a legacy data, and large and slow DB2 tables with hundreds of thousand records on mainframe.
I used both Native and Tinkerpop API and found it useful and easy to use. In my POC I took Jung for graph visualization, and for a collection of graph algorithms.
Here are just some results for the graph above:
  • "Find Shortest Path" (Dijkstra): 4781.0 ms
  • "Check Nodes Connected" using Dijkstra algo: 7392.0 ms
  • "Check Nodes Connected" using SQL TRAVERSE extension from OrientDB: 23829.0
Note that results were obtained from in-memory graph, with no optimization (raw quality java code, no indexes defined in OrientDB, unlimited walk depth in traversal). I am confident the results can be improved very easily.
My next step was the integration of OrientDB into Teiid. Since we have a huge federated database, the way to use SQL to send and retrieve data from graphDB considered as convenient but somewhat unnatural approach. I can explain benefits and drawbacks of using this approach separately, it deserves a whole new post. Unfortunately that time (6 months ago) I had to step back: the JDBC implementation of OrientDB was very raw and did not provide enough metadata for Teiid 7.6 to be included in federated database.
I got back to this task recently, trying a new Teiid 8.1 (it has a useful notion of defining metadata in DDL along with NATIVE support like the old Teiid) and with latest OrientDB 1.2.0-SNAPSHOT. I took the latest orientdb JDBC driver sources from git and compiled jar file by myself. Taking the latest JDBC sources was probably my biggest mistake. I've tried to connect to running OrientDB instance using Eclipse Database Explorer and run some queries. Unfortunately even simple and innocent queries against my imported graph or even against built-in "tinkerpop" database provided inconsistent results and even caused Eclipse to freeze sometime:

select * from V where o_id like 'demodata.s%';
select from 6:6 where any() traverse(0,10) (o_id = 'demodata.stats');

Something definitely wrong in the latest JDBC driver code. Obviously, the JDBC driver is not a top priority feature for a graph database. I probably can implement the needed functionality by myself - write a custom Teiid translator (might be a considerably big task, depends of the level of SQL support we want to achieve), or try another graphDB implementations. I will be looking also on neo4j - they also have a jdbc driver. The query language they support in JDBC is not SQL, but (I think) it can be worked out.
This task is on hold for now anyway, I'll make an update in two-three months.

Thursday, August 2, 2012

Poll vs Push in Database using JBoss Teiid

Sometimes you have to monitor a specific table or view in your database, and do some action when INSERT or UPDATE happens. There are two approaches - you can periodically poll the table using Quartz or any other scheduling tool. Or you can be notified by database itself when your table modified - database will push you the new data.
The second approach even more beneficial when your table get updates rarely (so you not waste CPU by returning empty results most of the time), or when you want to be notified immediately when the new data available.

The good solution to the task above on my mind is database trigger. The commercial database vendors (like Oracle, if I am not mistaken) provide a triggers which (when fired up) are capable to run an arbitrary java code. But what if you don't have Oracle, and the data source of your choice is mySQL, Apache Hive, H2, or even arbitrary CSV file, laying down somewhere on filesystem?

Here is my approach: I made an UPDATABLE VIEW for my table in H2 database under JBoss Teiid, then created a VIRTUAL FUNCTION (Teiid virtual functions are capable to run specified java class and method). Last step - define two triggers: INSTEAD OF UPDATE and INSTEAD OF INSERT. When my table in H2 got a new record, or existing one updated - my java class gets immediately executed by Teiid.

Implementation Details:
  • Below is the Teiid 8.1 model definition with VIEW, VIRTUAL FUNCTION and two TRIGGERs defined. Note a property "lib" - it is a name of JBoss 7 Module with your java class in it (yes, you have to create a JBoss module. It is a simple task and well documented).
 <vdb name="DynamicPortfolio" version="1">  
   <property name="lib" value="com.test.teiid.translator.async" />  
   <model name="TrigView" type="VIRTUAL">  
      <metadata type="DDL"><![CDATA[  
             CREATE VIRTUAL FUNCTION RepHealth(  
                     healthTime varchar,  
                     policyKey varchar,  
                     objKey varchar,  
                     healthState varchar)   
                RETURNS integer   
                OPTIONS (JAVA_CLASS 'com.test.teiid.udf.HealthStatesTeiidFunction',   
                     JAVA_METHOD 'reportHealthState');  
                as select * from Accounts.HEALTHSTATE;  
                AS FOR EACH ROW  
                BEGIN ATOMIC  
                     SELECT RepHealth(  
                          New.HEALTHTIME, New.POLICYKEY, New.OBJKEY, New.HEALTHSTATE)   
                     from HS_VIEW;       
                AS FOR EACH ROW  
                BEGIN ATOMIC  
                     SELECT RepHealth(  
                          New.HEALTHTIME, New.POLICYKEY, New.OBJKEY, New.HEALTHSTATE)   
                     from HS_VIEW;       

  • The VDB changes above should work with Teiid's "dynamicvdb-portfolio" example, I installed this example on my Teiid in advance and created schema for additional "Accounts.HEALTHSTATE" table in H2. The DDL is below. You can add a new model right into portfolio-vdb.xml from the example.
   HS_ID integer,  
   HEALTHTIME varchar(8),  
   POLICYKEY varchar(8),  
   OBJKEY varchar(8),  
   HEALTHSTATE varchar(8),  
  • Write your java class and package it as JBoss Module. The called java method should be static.
 package com.test.teiid.udf;  
 public class HealthStatesTeiidFunction {  
      public static int reportHealthState(String healthTime, String policyKey, String objKey, String healthState)  
           String ret = "At:" + healthTime + " for Policy:" + policyKey + " and Object:" + objKey + " the State is:" + healthState;  
           return 1;  

  • Start JBoss, make sure your module and VDB are deployed and active. Connect to your database and issue INSERT or UPDATE statement against your defined view.
 values(4, '10:28 AM', 'policy1', 'obj1', 'RED');   

  • See the java output in console:
 13:51:18,336 INFO [stdout] (Worker0_QueryProcessorQueue0) At:10:28 AM for Policy:policy1 and Object:obj1 the State is:RED  

Note that since triggers defined as "INSTEAD OF..", the actual insert or update does not happen. Also, in order to be notified by the database changes, your underlying system should issue inserts or updates not to the actual H2 table, but to Teiid UPDATEABLE VIEW we defined (HS_VIEW in example above). Teiid itself is a federated database, available by JDBC, but connecting to it might be an added complexity to your solution.

Wednesday, August 1, 2012

JBoss 7 and Maven Plugin

This days I play a lot with the latest version of JBoss Teiid (8.1.Beta2). As part of my development activity I have JBoss 7.1.1.Final running on my workstation, and I want to shorten my development cycle: modify code -> build war -> deploy. Yes, I know about JRebel, but today I want not to use it.
My project build tool is Maven, so solution is straight: use jboss-as-maven-plugin. I want to generate war file from maven and deploy it to standalone JBoss without restarting it.

Steps to follow

  • Define plugin in pom.xml. The definition below attaches to "clean" and "package" maven life cycles.
                                <configuration>                                           <ignoreMissingDeployment>


  • Optionally - create Eclipse launch configuration to execute "mvn package" or "mvn clean package".
  • Start JBoss, run launch configuration in Eclispe, check messages in your server log that new context is regustered and war file is deployed / replaced.


 What happens behind the scenes?

Usually deployed war files placed in <jboss7>/standalone/deployments folder (when standalone configuration used). After JBoss restart the war file gets deployed (new "xxxx.deployed" flag file gets created in the same folder).
In case of "jboss-as-maven-plugin" deployment, the <jboss7>/standalone/configuration/standalone.xml file gets modified: a new "<deployment>...</deployment>" added to it.
The actual deployed war file can be found now under one of the <jboss7>/standalone/data/content/ subfolders.