Data Integration Suite – Process XML Using MySQL

A partner company XYZ sends you book orders using their proprietary format:

    <order>
        <book bookid="1" quantity="5">
            <isbn>0596002696</isbn>
        </book>
    </order>

Our company's canonical XML format relies on the following structure, and it exposes the following information:

    <order>
        <book>
            <title>Java Web Services</title>
            <publisher>O'Reilly Media, Inc.</publisher>
            <publishing-date>06 March, 2002</publishing-date>
            <quantity>5</quantity>
            <ISBN>0596002696</ISBN>
        </book>
    </order>

While the basic format of the order XML and canonical XML is similar, there are a few differences that we'll need to address before the order can be processed. While the structural hierarchy of the XML is the same (both have order, book, and ISBN elements) the order XML uses lowercase for the ISBN element name, and it includes quantity as an attribute of the book element. Further, the canonical XML includes several other elements not present in the order XML, like those for title, publisher, and publishing date.

The following XQuery takes care of the simple format changes we need to implement. Note that we have decided to reference the incoming order as an external variable so that we can bind the right content dynamically as part of our application:

    declare variable $order as document-node(element(*, xs:untyped))
    external;
    <order> {
         for $book in $order/order/book
         return
              <book>
                   <quantity>{$book/@quantity}</quantity>
                   <ISBN>{$book/isbn/text()}</ISBN>
              </book>
    } </order>

Now that the formatting discrepancies have been addressed, how do we add the missing information about title, publisher and publishing date?

Our company relies on MySQL database to keep detailed information about all the books available in inventory. In particular, the "books_dbo" schemata (in MySQL terms) contains a "booksxml" table that contains all the details about the books in inventory that we need. Let's look at this table in the MySQL Query Browser:

Click to Enlarge

Data Integration Suite is able to access MySQL schemata and tables. And it’s able to do that in a performant and scalable way, as described in more details in a DataDirect white paper – The Data Integration Suite Performance: Generating SQL. For example, you can access data from the “booksxml” table for a specific ISBN number this way:

    for $booksxml in collection("booksxml")/booksxml
    where $booksxml/isbn = "0596002696"
    return $booksxml

In this case, the result of this XQuery is:

    <booksxml>
            <isbn>0596002696</isbn>
            <title>Example Scenario</title>
            <manufacturer>O'Reilly Media, Inc</manufacturer>
            <releaseDate>06 March, 2002</releaseDate>
    </booksxml>

This result is the XML representation of one specific row in the MySQL "booksxml" database table. If you are curious about how Data Integration Suite is retrieving that result, take a look at the query plan that Data Integration Suite is actually executing:

As you can see, the selection of the matching ISBN number is entirely pushed as SQL query to the database; the Data Integration Suite engine “only” does the job of rendering the returned information as XML.

In our case, however, we want to do something a bit more complicated: We want to join data available in the incoming XML order with data available in the MySQL database, and we want to return an XML document consistent with our canonical format. Merging together the two XQueries described above, we can accomplish this quite easily:

    declare variable $order as document-node(element(*, xs:untyped))
    external;
    <order> {
         for $book in $order/order/book,
             $details in collection("booksxml")/booksxml
         where $book/isbn = $details/isbn
         return
        <book>
            <title>{$details/title/text()}</title>
            <quantity>{$book/@quantity}</quantity>
            <ISBN>{$book/isbn/text()}</ISBN>
            <publisher>{$details/manufacturer/text()}</publisher>
            <publishing-date>{$details/releaseDate/text()}</publishing-date>
        </book>
    } </order>

A single XQuery allows us to process an incoming order which uses a proprietary XML format, to augment the information available in that format with data stored in our MySQL database, and to return a result consistent with our desired XML canonical format.

Executing XQuery Programmatically

Data Integration Suite implements the XQJ (XQuery for Java API) interface, which provides an easy to use, flexible way to execute XQueries and to consume results. XQJ makes it easy for XQuery execution to become part of your Java application, or part of services exposed by your application server. If you are familiar with JDBC, you will notice many similarities between XQJ and JDBC. Take a look at the following code snip from a Java application:

...
     String xqueryUrl           = "C:/canonicalizeOrder.xquery";
     InputStream xqueryStream   = null;
     InputStream xmlOrderStream = null;
     XQConnection xqconnection  = null;
     XQExpression xqExpr        = null;
     try {
        DDXQDataSource dataSource = new DDXQDataSource();

        //Setup the connection to the MySQL database
        dataSource.setProperty(
          com.ddtek.xquery3.xqj.DDXQDataSource.JDBCURL,
          "jdbc:mysql://localhost:3306/books_dbo?user=user&password=pass");

        //Setup options and the XQuery source
        dataSource.setOptions("serialize=indent=yes");
        xqueryStream = new FileInputStream(xqueryUrl);
        dataSource.setBaseUri(xqueryUrl);
        xqconnection = dataSource.getConnection();

        //Bind the input order to the external variable
        //In this case we read the input from a file, but the
        //variable can be bound to any generic stream
        XQItemType type = xqconnection.createElementType(null,XQItemType.XQBASETYPE_UNTYPED);
        XQItemType documentType = xqconnection.createDocumentElementType(type);
        XQStaticContext context = xqconnection.getStaticContext();
        context.setBindingMode(XQConstants.BINDING_MODE_DEFERRED);
        context.setContextItemStaticType(documentType);
        xqExpr = xqconnection.createExpression(context);
        xmlOrderStream = new FileInputStream("c:/order.xml");
        xqExpr.bindDocument(new QName("","order"), xmlOrderStream);

        //Execute the XQuery; in this example just output the result
        //to System.Out
        xqExpr.executeQuery(xqueryStream).writeSequenceToResult(new StreamResult(System.out));
        System.out.println();
      } finally {
       ...
      }

In a "real life" scenario, the result of the XQuery would probably be consumed differently, either saving it to a file/stream, consuming it as SAX or StAX events, or maybe materializing it as an in-memory XML representation (DOM, the least scalable of the options).

Now we know how to merge data available in XML documents and MySQL, and how to format a result consistent with a specific XML Schema. But what if a partner company is sending us order information in something other than XML?


Next MySQL ExampleProcessing EDI as XML using MySQL