Andrey Adamovich - Groovy 2 Cookbook - 2013.pdf

Working with XML in Groovy

Step 2 shows how we can easily access the attributes and values of a movie, while iterating on the results.

Groovy cannot possibly know anything in advance about the elements and attributes that are available in the XML document. It happily compiles anyway. That's one capability that distinguishes a dynamic language.

Step 4 introduces the spread-dot operator: a shortcut to the collect method of a collection. The result is a new collection containing the outcome of the operation applied to each member of the original collection.

There's more...

To iterate over collections and transform each element of the collection in Groovy, the collect method can be used. The transformation is defined as a closure passed to the method:

assert [0,2,4,6] == (0..3).collect { it * 2 }

By using a spread-dot operator, we can rewrite the previous snippet as follows:

assert [0,2,4,6] == (0..3)*.multiply(2)

In the context of a GPath query for XML, the spread-dot operator accesses a property of each node returned by the findAll method; in this recipe, those nodes are the movies produced after the year 1990.
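The following is a minimal sketch of the same findAll-then-spread-dot pattern. It uses a list of plain maps as a stand-in for parsed XML nodes, and the movie titles and years are invented for illustration only:

```groovy
// Hypothetical stand-in data: plain maps instead of parsed XML nodes
def movies = [
    [title: 'Movie A', year: 1985],
    [title: 'Movie B', year: 1994],
    [title: 'Movie C', year: 2001]
]

// findAll keeps the matching entries; *.title reads 'title' from each of them
def recent = movies.findAll { it.year > 1990 }
def titles = recent*.title

// The spread-dot form is equivalent to an explicit collect
assert titles == recent.collect { it.title }
assert titles == ['Movie B', 'Movie C']
```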

See also

- Reading XML using XmlParser
- Reading XML using XmlSlurper
- http://groovy.codehaus.org/GPath
- http://groovy.codehaus.org/api/groovy/util/slurpersupport/GPathResult.html

Searching in XML with XPath

XPath is a W3C-standard query language for selecting nodes from an XML document. That is somewhat equivalent to SQL for databases or regular expressions for text. XPath is a very powerful query language, and it's beyond the scope of this book to delve into the extended XPath capabilities. This recipe will show some basic queries to select nodes and groups of nodes.


Chapter 5

Getting ready

Let's start as usual by defining an XML document that we can use for selecting nodes:

// The trailing backslash keeps the XML declaration at the very start
// of the string; a leading newline would make the DOM parser fail
def todos = '''\
<?xml version="1.0" ?>
<todos>
  <task created="2012-09-24" owner="max">
    <title>Buy Milk</title>
    <priority>3</priority>
    <location>WalMart</location>
    <due>2012-09-25</due>
    <alarm-type>sms</alarm-type>
    <alert-before>1H</alert-before>
  </task>
  <task created="2012-09-27" owner="lana">
    <title>Pay the rent</title>
    <priority>1</priority>
    <location>Computer</location>
    <due>2012-09-30</due>
    <alarm-type>email</alarm-type>
    <alert-before>1D</alert-before>
  </task>
  <task created="2012-09-21" owner="rick">
    <title>Take out the trash</title>
    <priority>3</priority>
    <location>Home</location>
    <due>2012-09-22</due>
    <alarm-type>none</alarm-type>
    <alert-before/>
  </task>
</todos>
'''

The previous snippet represents the data for a personal task management application; there aren't enough of those these days! Surely no self-respecting to-do application comes without a powerful filtering feature, such as finding all due tasks, or showing only the tasks that can be done in a specific place.


How to do it...

Let's go into the details of this recipe.

1. Before we can fire our XPath queries, the document has to be parsed using the Java DOM API:

import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.*

def inputStream = new ByteArrayInputStream(todos.bytes)
def myTodos = DocumentBuilderFactory.
  newInstance().
  newDocumentBuilder().
  parse(inputStream).
  documentElement

2. Once the XML document is parsed, we can create an instance of the XPath engine:

def xpath = XPathFactory.
  newInstance().
  newXPath()

3. Now we are ready to run some queries on our task list. The simplest thing to do is to print all the task titles:

def nodes = xpath.evaluate(
  '//task',
  myTodos,
  XPathConstants.NODESET
)
nodes.each {
  println xpath.evaluate('title/text()', it)
}

4. The output is as follows:

Buy Milk
Pay the rent
Take out the trash

5. OK, now that we've got the API basics out of the way, it's time for more complex queries. The next example shows how to print the titles in a more Groovy way:

xpath.evaluate(
  '//task/title/text()',
  myTodos,
  XPathConstants.NODESET
).each { println it.nodeValue }


Note that we are using the DOM API's getNodeValue() method, accessed here as the nodeValue property, to extract the content of the node.

6. What about fetching only tasks with a low priority (that is 2, 3, 4, and so on)?

xpath.evaluate(
  '//task[priority>1]/title/text()',
  myTodos,
  XPathConstants.NODESET
).each { println it.nodeValue }

7. The output is as follows:

Buy Milk
Take out the trash

8. Naturally, it is also possible to filter by node attribute. For instance, to fetch all tasks assigned to lana:

xpath.evaluate(
  "//task[@owner='lana']/title/text()",
  myTodos,
  XPathConstants.NODESET
).each { println it.nodeValue }

9. The output is as follows:

Pay the rent

10. Finally, we are going to retrieve nodes based on the actual content. Let's build a slightly more complex query that retrieves tasks based on the content of the location and alarm-type tags:

xpath.evaluate(
  '//task[location="Computer" and ' +
    'contains(alarm-type, "email")]/' +
    'title/text()',
  myTodos,
  XPathConstants.NODESET
).each { println it.nodeValue }

11. The output is as follows:

Pay the rent


12. The previous snippet uses XPath's contains function to probe for a string match in a specific tag. We can also use the same function to search across all tags:

xpath.evaluate(
  "//*[contains(.,'WalMart')]/title/text()",
  myTodos,
  XPathConstants.NODESET
).each { println it.nodeValue }

13. The output is as follows:

Buy Milk
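Steps 5 to 12 repeat the same evaluate call with a different query each time. As a sketch only, the repetition can be folded into a small closure. The name values is a hypothetical helper, not part of the recipe, and the XML is a trimmed-down copy of the task list. The helper walks the NodeList through item() so that it relies only on the DOM interface:

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Trimmed-down copy of the recipe's task list
def xml = '''\
<todos>
  <task created="2012-09-24" owner="max">
    <title>Buy Milk</title><priority>3</priority>
    <location>WalMart</location>
  </task>
  <task created="2012-09-27" owner="lana">
    <title>Pay the rent</title><priority>1</priority>
    <location>Computer</location>
  </task>
</todos>
'''

def root = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder()
    .parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))
    .documentElement
def xpath = XPathFactory.newInstance().newXPath()

// Hypothetical helper: runs a query as a NODESET and returns the node values
def values = { String query ->
    def nodes = xpath.evaluate(query, root, XPathConstants.NODESET)
    (0..<nodes.length).collect { nodes.item(it).nodeValue }
}

assert values('//task[priority>1]/title/text()') == ['Buy Milk']
assert values("//task[@owner='lana']/title/text()") == ['Pay the rent']
assert values('//task/@created') == ['2012-09-24', '2012-09-27']
```

Returning a list instead of printing makes each query easy to check with an assert.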

How it works...

In order to use XPath, we need to build a Java DOM parser. Neither XmlParser nor XmlSlurper offers XPath querying capabilities, so we have to resort to building a parser using the not-so-elegant Java API. In step 1, we create a new instance of DocumentBuilderFactory, from which we create a DocumentBuilder. DocumentBuilderFactory locates its implementation through a pluggable lookup mechanism; the default implementation bundled with the JDK is com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl. The factory is used to produce a builder, which parses the document.
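The factory, builder, and document chain can be seen in isolation in the following sketch. It parses a one-task document, invented here for brevity, from a StringReader. The concrete factory class depends on the JDK and the classpath, so it is printed rather than asserted:

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

def factory = DocumentBuilderFactory.newInstance()
// The implementation class varies with the JDK and classpath
println factory.getClass().name

def builder = factory.newDocumentBuilder()
def document = builder.parse(new InputSource(new StringReader(
    '<todos><task owner="max"><title>Buy Milk</title></task></todos>')))

// documentElement is the root element, the same object the recipe queries
def rootElement = document.documentElement
def tasks = rootElement.getElementsByTagName('task')

assert rootElement.nodeName == 'todos'
assert tasks.length == 1
assert tasks.item(0).getAttribute('owner') == 'max'
```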

The evaluate function used to run the XPath queries takes the following three parameters:

- The actual XPath query string (in step 3, we use //task to select all the nodes named task)
- The document
- The desired return type

In step 3, the requested return type is XPathConstants.NODESET, which yields a NodeList, a list type over which Groovy can easily iterate. When the return type is not specified, the result is returned as a String.
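As an illustration of how the return type changes the result, the sketch below runs queries against a two-task sample (a shortened version of the recipe's data) with no return type, and with the NUMBER, BOOLEAN, and NODE constants from XPathConstants:

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory
import org.xml.sax.InputSource

def xml = '<todos>' +
    '<task owner="max"><title>Buy Milk</title><priority>3</priority></task>' +
    '<task owner="lana"><title>Pay the rent</title><priority>1</priority></task>' +
    '</todos>'
def root = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder()
    .parse(new InputSource(new StringReader(xml)))
    .documentElement
def xpath = XPathFactory.newInstance().newXPath()

// No return type: the result is the string value of the first matching node
def firstTitle = xpath.evaluate('//task[1]/title', root)
// NUMBER: the result is a Double
def taskCount = xpath.evaluate('count(//task)', root, XPathConstants.NUMBER)
// BOOLEAN: true when the node set is not empty
def hasTopPriority = xpath.evaluate('//task[priority=1]', root, XPathConstants.BOOLEAN)
// NODE: a single DOM node that can serve as the context of another query
def lanaTask = xpath.evaluate("//task[@owner='lana']", root, XPathConstants.NODE)
def lanaTitle = xpath.evaluate('title', lanaTask)

assert firstTitle == 'Buy Milk'
assert taskCount == 2
assert hasTopPriority
assert lanaTitle == 'Pay the rent'
```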

There's more...

What if you need to find all the tasks due today in your task list?

XPath 1.0 (the default implementation bundled with the JDK 6 and 7, dating back to 1999) doesn't support date functions, and you are left to rather ugly string comparison tricks.
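One possible form of such a trick, shown here as an assumption about what is meant rather than as the book's own code: ISO-8601 dates order like numbers once the hyphens are removed, so translate and number can stand in for a date comparison. The closure titlesDueBy is a hypothetical helper, and the sample keeps only the title and due elements of the recipe's data:

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory
import org.xml.sax.InputSource

def xml = '<todos>' +
    '<task><title>Buy Milk</title><due>2012-09-25</due></task>' +
    '<task><title>Pay the rent</title><due>2012-09-30</due></task>' +
    '<task><title>Take out the trash</title><due>2012-09-22</due></task>' +
    '</todos>'
def root = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder()
    .parse(new InputSource(new StringReader(xml)))
    .documentElement
def xpath = XPathFactory.newInstance().newXPath()

// Hypothetical helper: titles of the tasks due on or before an ISO date
def titlesDueBy = { String isoDate ->
    def limit = isoDate.replace('-', '')   // '2012-09-25' becomes '20120925'
    def query = "//task[number(translate(due, '-', '')) <= " + limit +
        ']/title/text()'
    def nodes = xpath.evaluate(query, root, XPathConstants.NODESET)
    (0..<nodes.length).collect { nodes.item(it).nodeValue }
}

// An exact match needs no trick: plain string equality is enough
def dueOn22nd = xpath.evaluate("//task[due = '2012-09-22']/title", root)

assert titlesDueBy('2012-09-25') == ['Buy Milk', 'Take out the trash']
assert dueOn22nd == 'Take out the trash'
```

The numeric comparison only works because every date uses the same zero-padded yyyy-MM-dd layout.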

The more recent specification of XPath, v2.0, supports date functions and a plethora of new extremely powerful features. Luckily third-party libraries supporting XPath 2.0 are available and can be readily used with Groovy. One of these libraries is Saxon 9, which supports XSLT 2.0, XQuery 1.0, and XPath 2.0 at the basic level of conformance defined by W3C.


In this example, we are going to build a query that selects the tasks due on a given date. The XML data defined at the beginning of this recipe is also used in this snippet:

@Grab('net.sf.saxon:Saxon-HE:9.4')
@GrabExclude('xml-apis:xml-apis')
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.*
import net.sf.saxon.lib.NamespaceConstant

def today = '2012-09-21'

// The trailing backslash keeps the XML declaration at the very start
def todos = '''\
<?xml version="1.0" ?>
<todos>
...
</todos>
'''

def inputStream = new ByteArrayInputStream(todos.bytes)
def myTodos = DocumentBuilderFactory.
  newInstance().
  newDocumentBuilder().
  parse(inputStream).
  documentElement

// Set the SAXON XPath implementation
// by setting a System property
System.setProperty(
  'javax.xml.xpath.XPathFactory:' +
    NamespaceConstant.OBJECT_MODEL_SAXON,
  'net.sf.saxon.xpath.XPathFactoryImpl'
)

// Create the XPath 2.0 engine
def xpathSaxon = XPathFactory.
  newInstance(
    XPathConstants.DOM_OBJECT_MODEL
  ).newXPath()

// Print out all task names
// expiring on 22 September 2012
xpathSaxon.evaluate(
  '//task[xs:date(due) = ' +
    'xs:date("2012-09-22")]/title/text()',
