Monday, December 23, 2013

Automate JBoss Teiid Development Environment with Vagrant and VirtualBox

Why Automate?

What are the benefits of automating the development environment?
  • Ability to run several development versions in parallel. Each running environment can host a separate branch of the same product, or keep its own unique options, such as a different JVM (Pentaho Mondrian is a good example);
  • Support for complex projects that require simultaneous development in several places (this happens rarely, so the benefit is arguable);
  • Freedom to experiment with your environment. You can always install something, roll back to a stable state, or start from scratch in minutes;
  • Your development environment configuration is now code itself. You can debug it, branch it, release it, or write tests for it;
  • You can migrate your environment to a new machine or upload it to a cloud in a few mouse clicks. You just need to find a way to connect to it over VNC or RDP. Finally you can do Eclipse development from your Android phone! (yeah, right...)
  • If you work for a company, spawning a new environment for new developers takes hours, not days or weeks;
  • Environment upgrades and updates for everyone on your team are centralized, tested and predictable. No more "it works on my side".
Along with the benefits above, here are some things to keep in mind:
  • Your host machine should have enough RAM and CPU. Maven can build in multiple threads, and GWT compiles permutations across all available cores. So the more resources you have, and the more of them you can allocate to your environments, the better they work;
  • Commercial vs free. If you develop on Windows or Mac, each generated environment has to be registered, which means extra $$ and extra complexity;
  • You need a good internet connection to generate the environment (there can be several GB to download). On the other hand, most of the time you only need to do it once, because the large downloads are cached locally;
  • Writing environment automation scripts can be slow work. Depending on your goals, the environment generation process can take from several minutes to several hours, and a silly typo made carelessly in the last step can force you to start over;
  • You still have to install and get familiar with a few (not very complex) tools that help with the automation.
These pros and cons weigh differently for everyone. If you think it is worth the attention in your case, my recipe is below.

Instant Eclipse Development Environment

My goal is to have a development environment for working on data federation projects with JBoss Teiid. Below is the list of software I put into it, but your needs might be different. Thanks to the modular setup, you can quickly exclude some items or add new ones.
  • Linux OS. I took Ubuntu 12.04.3 LTS: it is widely known, reasonably stable and supported by a large community. No hassle or extra money for licenses;
  • Oracle Java 7 SDK. Installation requires accepting the license agreement, but this can be automated;
  • Apache Maven 3.0.4. I need this specific version, but the script can easily be changed to use the most recent one;
  • Git for source control. Just use the version from the Ubuntu repository;
  • Eclipse Kepler SR1 for JEE. I need this specific version; change a few lines in the script to install another one;
  • JBoss Application Server. I put in EAP-6.1.0.Alpha, the one that works best with the latest Teiid;
  • Teiid 8.6.0.Final distribution, downloaded and installed on JBoss;
  • Latest Teiid 8.7.0.Alpha1 sources.
As a result, you get a pretty decent development environment (8 GB RAM, 2 CPUs, 32 GB storage), which you can fire up immediately to start working on your own project or to extend Teiid itself.
Follow these steps to generate your own environment from the template above:
  1. Get the latest version of Oracle VM VirtualBox. I used 4.3.6. It is a free and easy-to-use virtualization tool;
  2. Install the latest Vagrant. I have version 1.3.5. This tool manages your collection of virtual environments, takes care of network access and describes how you want to alter the original OS image for your purposes;
  3. Get the latest Packer. The version I used is 0.4.1. It is a nice functional addition to Vagrant that lets you apply customizations and easily install additional software on the generated box;
  4. Get the packer-teiid template I made for the environment described above. It is hosted on GitHub, so clone it or simply download it to some folder;
  5. Go to the packer-teiid folder and run "packer build ubuntu-12.04.3-desktop-amd64.json". This starts the generation process. After some time (~40 minutes or so), you should see a shiny new "ubuntu-12.04.3-desktop-amd64.box" file in the same folder, around 2 GB in size;
  6. Now it is time to add your new box to the Vagrant collection. Type "vagrant box add ubuntu-12.04.3-desktop-amd64 ubuntu-12.04.3-desktop-amd64.box" in the packer-teiid folder, or specify the full path to your *.box file if you do it from elsewhere;
  7. You can bring your environment up now. If you want to do it in the same folder, run "vagrant up". If you want to do it from another place, run "vagrant init" in that new folder first. The generated "Vagrantfile" is also included in my project; compare it with the one that appears in your new folder and note the differences ("vb.gui = true" is the most important one, otherwise your environment will start in the background);
  8. If everything went well, you should see something like the image below. Feel free to log in with the vagrant/vagrant credentials, sudo and change the root password, run Eclipse or open the JBoss bin folder using the shortcuts on the desktop;
  9. You can reach your running environment from the host using ssh (port 2222 on localhost). To transfer files between host and guest, use the "/vagrant" mapping inside your guest. It is mapped to the folder on the host machine from which you started your VM box ("packer-teiid" in my case);
  10. To stop your environment, simply run "vagrant halt" or "vagrant suspend". To remove it completely, run "vagrant destroy";
  11. Happy Packing!
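
Steps 5 to 7 above can be strung together in a small script. The sketch below is only an illustration, not part of the packer-teiid project: the function names are made up, and it assumes that "packer" and "vagrant" are on the PATH and that the script is started from the template folder. By default it only prints the commands; pass dry_run=False to actually execute them.

```python
import shlex
import subprocess

# Box name taken from the file names used in the steps above.
BOX_NAME = "ubuntu-12.04.3-desktop-amd64"


def build_commands(box_name=BOX_NAME):
    """Return the commands from steps 5-7, in order (hypothetical helper)."""
    return [
        ["packer", "build", box_name + ".json"],                 # step 5: build the box
        ["vagrant", "box", "add", box_name, box_name + ".box"],  # step 6: register it
        ["vagrant", "up"],                                       # step 7: start the VM
    ]


def run(commands, dry_run=True):
    """Print every command; execute it as well when dry_run is False."""
    for cmd in commands:
        print("$ " + " ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(build_commands())
```

Stopping at the first failure matters here, because a broken "packer build" leaves no *.box file for the following "vagrant box add" to pick up.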

Screenshot of Ubuntu Linux Development Environment running under Windows 7 host machine:

Friday, December 6, 2013

Teiid Translator for EBay

I wrote a very basic translator for the EBay Finding API. It implements only the findByKeyword API call, but this seems enough to issue a query against the EBay database of active auctions and retrieve the results.

Link to Teiid EBay translator on GitHub.

Prerequisites

You should have an active EBay Developers License, which you can get for free at http://developer.ebay.com. You also have to download the current EBay FindingKitJava archive from the EBay Developer website and extract finding.jar from its lib folder. I used version 1.12.0 of the Finding API jar (Built-Date: 2011-04-28 14:10:35); let me know if you have problems with other versions. For convenience you can put this jar into your local Maven repository; see the "version.ebay-finding-java-driver" parameter in the parent pom.xml for reference.
Put your API key into EbayTest.java (see the DEVELOPER_KEY static field) and run this class as a unit test to make sure your API key works.

If everything is fine, you should get a result similar to this:
Returned results:[
|271336881138|Harry Potter and the Order of the Phoenix  (Xbox 360, 2007)|EBAY-US|null|Video Games||http://thumbs3.ebaystatic.com/m/m49aytufjFn4dN70RkI4QMA/140.jpg|http://www.ebay.com/itm/Harry-Potter-and-Order-Phoenix-Xbox-360-2007-/271336881138?pt=Video_Games_Games|null|56274325|PayPal|false|91911|Chula Vista,CA,USA|US||||Active|Auction|false||null||Good|null, 
|171183853715|Harry Potter and the Order of the Phoenix Figures Harry, Hermione, Ron and Map|EBAY-US|null|TV, Movie & Video Games||http://thumbs4.ebaystatic.com/m/maptT7UNML82z5uv64Abg2A/140.jpg|http://www.ebay.com/itm/Harry-Potter-and-Order-Phoenix-Figures-Harry-Hermione-Ron-and-Map-/171183853715?pt=US_Action_Figures|null|152040891|PayPal|false|34119|Naples,FL,USA|US|||9.8 USD|Active|Auction|false||null||New|null
]

Implementation

  • translator-ebay - the actual Teiid translator code;
  • ebay-api - a wrapper for the EBayConnection interface;
  • connector-ebay - a JBoss resource adapter. It is supposed to receive the EBay Developer key as a configuration parameter when the translator runs in a real (non-embedded) Teiid instance.
The EBay Teiid translator is a simple stored procedure that accepts space-separated keywords as a single input parameter. The name of the stored procedure's input parameter is, unsurprisingly, "keywords".

Example:
call ebaydata.findByKeyword('harry potter phoenix');

The output result set has the following fields provided by the EBay Finding API:
  • "itemId"
  • "title"
  • "globalId"
  • "subtitle"
  • "primaryCategory"
  • "secondaryCategory"
  • "galleryURL"
  • "viewItemURL"
  • "charityId"
  • "productId"
  • "paymentMethod"
  • "autoPay"
  • "postalCode"
  • "location"
  • "country"
  • "storeInfo"
  • "sellerInfo"
  • "shippingInfo"
  • "sellingStatus"
  • "listingInfo"
  • "returnsAccepted"
  • "galleryPlusPictureURL"
  • "compatibility"
  • "distance"
  • "condition"
  • "delimiter"
The underlying EBay Finding API is a SOAP web service; see FindingService.wsdl in the downloaded FindingKitJava.zip for a description of each returned field.
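
To see how the pipe-separated rows printed by the unit test line up with these field names, here is a minimal sketch. It assumes that each printed row starts with "|" and lists its values in exactly the field order above, which is what the sample output suggests but is not confirmed by the translator sources. The parse_row helper is made up for this illustration, and the row is the second one from the sample output.

```python
FIELDS = [
    "itemId", "title", "globalId", "subtitle", "primaryCategory",
    "secondaryCategory", "galleryURL", "viewItemURL", "charityId",
    "productId", "paymentMethod", "autoPay", "postalCode", "location",
    "country", "storeInfo", "sellerInfo", "shippingInfo", "sellingStatus",
    "listingInfo", "returnsAccepted", "galleryPlusPictureURL",
    "compatibility", "distance", "condition", "delimiter",
]


def parse_row(line):
    """Map one printed row to a dict keyed by field name (hypothetical helper)."""
    # Drop surrounding whitespace and the comma that separates printed rows
    values = line.strip().rstrip(",").split("|")
    # Every row starts with "|", so the first piece is an empty string
    if values and values[0] == "":
        values = values[1:]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d values, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))


SAMPLE_ROW = (
    "|171183853715|Harry Potter and the Order of the Phoenix Figures Harry, "
    "Hermione, Ron and Map|EBAY-US|null|TV, Movie & Video Games||"
    "http://thumbs4.ebaystatic.com/m/maptT7UNML82z5uv64Abg2A/140.jpg|"
    "http://www.ebay.com/itm/Harry-Potter-and-Order-Phoenix-Figures-Harry-"
    "Hermione-Ron-and-Map-/171183853715?pt=US_Action_Figures|null|152040891|"
    "PayPal|false|34119|Naples,FL,USA|US|||9.8 USD|Active|Auction|false||null||New|null"
)

row = parse_row(SAMPLE_ROW)
print(row["itemId"], row["shippingInfo"], row["condition"])
```

A title that itself contains "|" would break this naive split, so treat it as a reading aid for the test output rather than a parser for real data.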
 

Friday, October 4, 2013

Bring HTML Pages Into Relational World Using Web Scraping Teiid Translator

You have probably heard many times that we live in the era of the Semantic Web. Unfortunately, not all the HTML pages you see were made using RDF. We have to parse them using web browsers, HTTP clients and a variety of custom tools. Many HTML pages are old and unstructured, or were built using outdated standards or poorly designed tools.
It would be really nice to retrieve all the data kept in HTML with minimal effort and be able to access it in a relational way. I had a sleepless night last week, and this is what I came up with.

In short, this is a humble attempt to wrap the great Jsoup Java HTML parser in Teiid translator logic. A single example is better than a hundred words. This SQL statement:

SELECT text, attributes
FROM (call scrapedata.scrap('http://www.bing.com/search?q=jboss+teiid','a[href]')) as S
WHERE upper(text) like '%TEIID%'

Returns this:
Teiid - JBoss Community - Community driven open source … , href="http://www.jboss.org/teiid" h="ID=SERP,5095.1"    
Teiid - Downloads - JBoss Community , href="https://www.jboss.org/teiid/downloads" h="ID=SERP,5108.1"
Teiid Download , href="/search?q=Teiid+Download&FORM=QSRE4" h="ID=SERP,5240.1"
Teiid Designer , href="/search?q=Teiid+Designer&FORM=QSRE5" h="ID=SERP,5241.1"
Teiid Forum , href="/search?q=Teiid+Forum&FORM=QSRE6" h="ID=SERP,5242.1"
Teiid - Tools - JBoss Community , href="https://www.jboss.org/teiid/tools" h="ID=SERP,5121.1"
Teiid Installation , Community - JBoss, href="https://community.jboss.org/wiki/TeiidInstallation" h="ID=SERP,5134.1"
Teiid - JBoss Issue Tracker , href="https://issues.jboss.org/browse/TEIID" h="ID=SERP,5147.1" 
Teiid 7.0 Installation Guide , href="https://community.jboss.org/wiki/Teiid70InstallationGuide" h="ID=SERP,5160.1", 
TEIID on tomcat - Community - JBoss, href="https://community.jboss.org/thread/205308?start=0&tstart=0" h="ID=SERP,5172.1"

Feel free to clone the translator-scrape repository from GitHub, check the sources, and play with ScrapeTest.java. It is a unit test built with Embedded Teiid and should give you an idea of how to use this translator.
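
For readers who want to see what the 'a[href]' selector in the query above boils down to, here is a rough sketch of the same idea. It is not the translator's code: it swaps Jsoup for Python's standard html.parser, the class and function names are made up, and the sample markup is invented for the illustration. It collects the text and attributes of every <a> element that has an href, then applies the same case-insensitive filter as the WHERE clause.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (text, attributes) for every <a> element that carries an href."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._attrs = None   # attributes of the <a href> currently open, if any
        self._chunks = []    # text pieces seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self._attrs = attrs
            self._chunks = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._attrs is not None:
            text = "".join(self._chunks).strip()
            attributes = " ".join(
                '{}="{}"'.format(name, value if value is not None else "")
                for name, value in self._attrs
            )
            self.rows.append((text, attributes))
            self._attrs = None


def scrape_links(html):
    """Return (text, attributes) pairs, roughly what 'a[href]' selects."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.rows


# Invented sample markup: two links and one anchor without href
SAMPLE = (
    '<p><a href="http://www.jboss.org/teiid">Teiid - JBoss Community</a> '
    '<a name="top">not a link</a> <a href="/about">About</a></p>'
)

# Same filter as: WHERE upper(text) like '%TEIID%'
rows = [r for r in scrape_links(SAMPLE) if "TEIID" in r[0].upper()]
print(rows)
```

The anchor without an href is skipped by the selector, and the "About" link is dropped by the filter, so only the Teiid link survives.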

Friday, April 19, 2013

Running HornetQ Bridge Under JBoss 7

Here is the explanation of a bridge from the HornetQ documentation: "Some messaging systems allow isolated clusters or single nodes to be bridged together, typically over unreliable connections like a wide area network (WAN), or the internet. A bridge normally consumes from a queue on one server and forwards messages to another queue on a different server. Bridges cope with unreliable connections, automatically reconnecting when the connections become available again".

My goal is to set up a bridge that forwards messages between two queues residing on different physical servers over a TCP connection. I plan to use two HornetQ services running as part of the latest JBoss (EAP 6.1.0). The "jms-bridge" example from the standalone HornetQ distribution includes a sample configuration (standalone-example.xml) for setting up a bridge on a single JBoss instance with the HornetQ service. We will have to modify it to support two servers over the network.

In my configuration I have two systems set up; let's call them "laptop1" and "remoteHost". Each of them has JBoss with the HornetQ service installed. Below are the changes to the configuration from the "jms-bridge" example:
  1. Define a new HornetQ connector and acceptor. Note that HornetQ has a "netty-connector" defined for the JBoss configuration, but it seems that it does not support the "host" and "port" parameters and relies on the mandatory "socket-binding" attribute (see jboss-as-messaging_1_3.xsd in the JBoss distribution for more details). That is why I had to redefine both the connector and the acceptor. For simplicity the acceptor uses "0.0.0.0" as its host to allow connections from everywhere. Also note that on the second server the ports should be swapped: 5457 for the acceptor and 5456 for the connector.
      <connector name="netty-bridge">
          <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
          <param key="host" value="remoteHost" />
          <param key="port" value="5457" />
      </connector>
      <acceptor name="netty-bridge">
          <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
          <param key="host" value="0.0.0.0" />
          <param key="port" value="5456" />
      </acceptor>
  2. Disable HornetQ security, also for simplicity. The proper approach would be to specify "user" and "password" as parameters for the connector above.
      <hornetq-server>
          <security-enabled>false</security-enabled>
          ...
  3. Define two queues, "sourceQueue" and "targetQueue". For the purposes of this example it would be enough to have only the "source" queue on the local server and only the "target" queue on the remote one.
      <jms-queue name="sourceQueue">
          <entry name="queue/sourceQueue"/>
          <entry name="java:jboss/exported/jms/queues/sourceQueue"/>
      </jms-queue>
      <jms-queue name="targetQueue">
          <entry name="java:/queue/targetQueue"/>
          <entry name="java:jboss/exported/jms/queues/targetQueue"/>
      </jms-queue>
  4. Define a new connection factory to use in the bridge.
      <connection-factory name="RemoteConnectionFactoryBridge">
          <connectors>
             <connector-ref connector-name="netty-bridge"/>
          </connectors>
          <entries>
             <entry name="RemoteConnectionFactoryBridge"/>
             <entry name="java:jboss/exported/jms/RemoteConnectionFactoryBridge"/>
          </entries>
      </connection-factory>
  5. Modify the bridge definition, which is added after the "hornetq-system" tag in settings.xml.
      <jms-bridge name="myBridge">
          <source>
              <connection-factory name="ConnectionFactory" />
              <destination name="queue/sourceQueue" />
          </source>
          <target>
              <connection-factory name="RemoteConnectionFactoryBridge" />
              <destination name="queue/targetQueue" />
          </target>
          <quality-of-service>AT_MOST_ONCE</quality-of-service>
          <failure-retry-interval>1000</failure-retry-interval>
          <max-retries>7890</max-retries>
          <max-batch-size>1</max-batch-size>
          <max-batch-time>1000</max-batch-time>
      </jms-bridge>
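
The swapped port numbers from step 1 are easy to get wrong, so here is a small sketch that renders the connector and acceptor definitions for both machines from one template. The bridge_endpoints helper is made up for this illustration, and it assumes that "laptop1" is a hostname the remote server can resolve; the XML elements and factory classes are the ones shown in step 1.

```python
CONNECTOR_FACTORY = "org.hornetq.core.remoting.impl.netty.NettyConnectorFactory"
ACCEPTOR_FACTORY = "org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory"

TEMPLATE = """<connector name="netty-bridge">
    <factory-class>{connector_factory}</factory-class>
    <param key="host" value="{peer_host}" />
    <param key="port" value="{peer_port}" />
</connector>
<acceptor name="netty-bridge">
    <factory-class>{acceptor_factory}</factory-class>
    <param key="host" value="0.0.0.0" />
    <param key="port" value="{local_port}" />
</acceptor>"""


def bridge_endpoints(peer_host, local_port, peer_port):
    """Render the connector/acceptor pair for one side (hypothetical helper).

    The connector points at the peer's acceptor port, while the acceptor
    listens on this machine's own port.
    """
    return TEMPLATE.format(
        connector_factory=CONNECTOR_FACTORY,
        acceptor_factory=ACCEPTOR_FACTORY,
        peer_host=peer_host,
        peer_port=peer_port,
        local_port=local_port,
    )


# laptop1 listens on 5456 and connects to remoteHost:5457 (as in step 1)
laptop1_xml = bridge_endpoints("remoteHost", local_port=5456, peer_port=5457)
# remoteHost mirrors it: listens on 5457 and connects back to laptop1:5456
remote_xml = bridge_endpoints("laptop1", local_port=5457, peer_port=5456)

print(remote_xml)
```

Each side's connector port must equal the other side's acceptor port; generating both from the same two numbers keeps them in sync.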
After starting both JBoss servers, watch for log messages indicating that the bridge has started and the connection between the two HornetQ instances has been established:
18:54:55,621 INFO [org.hornetq.core.server] (MSC service thread 1-1) HQ221024: Started Netty Acceptor version 3.6.2.Final-c0d783c 0.0.0.0:5456 for CORE protocol
...
18:54:57,953 DEBUG [org.hornetq.core.client] (ServerService Thread Pool -- 61) Remote destination: rokan01-VM3762.ca.com/10.130.248.122:5457
...
18:54:58,723 INFO [org.jboss.messaging] (ServerService Thread Pool -- 61) JBAS011610: Started JMS Bridge myBridge
To test the configuration, send a sample message to "sourceQueue", for example using the Teiid messageSender from one of my previous articles:
call Times.messageSender('queue/sourceQueue', 'My Message1'); 
Now log in to the JBoss Management Console on your remoteHost (http://remoteHost:9990/console/App.html#jms-metrics) and make sure the messages are delivered over the bridge to "targetQueue" on the remote host:


There might be a moment when only the "connector" side of the bridge is up and running. The bridge will automatically retry the connection every second (configurable via the "failure-retry-interval" setting in the bridge definition above) until the "acceptor" side of the bridge becomes available.
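
The retries do not go on forever, though. A quick back-of-the-envelope calculation shows how long the bridge keeps trying with the values from the bridge definition. It assumes that "max-retries" is the number of attempts before the bridge gives up (my understanding of the HornetQ bridge settings), that the wait between attempts is fixed, and that the attempts themselves take no time; the retry_window function is made up for the illustration.

```python
from datetime import timedelta


def retry_window(failure_retry_interval_ms, max_retries):
    """Approximate time the bridge keeps retrying before it gives up."""
    return timedelta(milliseconds=failure_retry_interval_ms * max_retries)


# Values from the bridge definition: 1000 ms between attempts, 7890 attempts
window = retry_window(1000, 7890)
print(window)  # 2:11:30
```

So under these assumptions the "acceptor" side has a bit over two hours to come up before the bridge stops trying.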