
Beginning Visual Basic 2005 Express Edition - From Novice To Professional (2006)


CHAPTER 15 WORKING WITH XML

Try It Out: Using XPath with an XmlDocument

For this example, I’ll keep things uncomplicated and have you work with a simple console application. Go ahead and start up a new console-based project now.

When the code editor appears, add in some Imports statements. Obviously you’ll need to reference System.Xml again to work with XmlDocuments. You’ll also need System.Net once again to let you download the RSS feed you’ll work with:

Imports System.Xml
Imports System.Net

Module Module1

    Sub Main()

    End Sub

End Module

Just as before, the first thing you’ll need to do is actually grab the XML document (in our case this is the Apress blogs RSS feed) from a website. Add a couple of lines of code to Main() to do just that:

Sub Main()
    Dim client As New WebClient()
    Dim rssFeed As String = client.DownloadString( _
        "http://blogs.apress.com/wp-rss2.php")
End Sub

Next, let’s load the downloaded feed into an XML document, just as before:

Sub Main()
    Dim client As New WebClient()
    Dim rssFeed As String = client.DownloadString( _
        "http://blogs.apress.com/wp-rss2.php")

    Dim doc As New XmlDocument()
    doc.LoadXml(rssFeed)
End Sub


Okay, no great surprises so far. The next thing you want to do is grab all the titles of articles from this feed. Now, looking at the actual XML in a browser, you can see that articles are called <item> in an RSS feed. The <item> tags live inside <channel> tags, and <channel> tags live inside the main <rss> tag, so the XPath query for the articles themselves is simply "rss/channel/item". Each <item> holds its title in a <title> tag, so the query for the titles is "rss/channel/item/title".

The XmlDocument class provides a handy method called SelectNodes() that takes an XPath query as a parameter and returns an XmlNodeList. So, let’s add some code now to grab matching nodes:

Sub Main()
    Dim client As New WebClient()
    Dim rssFeed As String = client.DownloadString( _
        "http://blogs.apress.com/wp-rss2.php")

    Dim doc As New XmlDocument()
    doc.LoadXml(rssFeed)

    Dim nodes As XmlNodeList = _
        doc.SelectNodes("rss/channel/item/title")
End Sub

That’s it. That’s all there is to it. All that remains is the grunt work of actually iterating through the nodes now and printing out some information:

Sub Main()
    Dim client As New WebClient()
    Dim rssFeed As String = client.DownloadString( _
        "http://blogs.apress.com/wp-rss2.php")

    Dim doc As New XmlDocument()
    doc.LoadXml(rssFeed)

    Dim nodes As XmlNodeList = _
        doc.SelectNodes("rss/channel/item/title")

    For Each node As XmlNode In nodes
        Console.WriteLine(node.InnerText)
    Next

    Console.WriteLine()
    Console.ReadLine()
End Sub


Notice how inside the loop we print out the node’s InnerText property, not its Value property. This threw me for a loop when I first started working with XML. Surely if I have a title node I want to get at its value to find out the title of the article in question, right? Wrong. The text inside a node, like this

<title>My First Blog Post</title>

is itself a node. You can see this if you explore the RSS feed with our lister application from earlier. The text inside an element is a special text node, and yes, it has a value you can work with. If, however, you have iterated through a list of element nodes (like title), the Value property of each element is Nothing, and the simplest way to get at the text inside the element is to look at the element’s InnerText.
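If you’d like to see the difference for yourself, here is a minimal sketch. The module name, the helper functions, and the one-element sample document are my own inventions for illustration; only the XmlDocument members are real.

```vb
Imports System
Imports System.Xml

Module InnerTextDemo

    ' Returns the title text by way of the element's InnerText property.
    Function TitleViaInnerText(ByVal doc As XmlDocument) As String
        Dim title As XmlNode = doc.SelectSingleNode("item/title")
        Return title.InnerText
    End Function

    ' Returns the same text by way of the Value of the child text node.
    Function TitleViaTextNode(ByVal doc As XmlDocument) As String
        Dim title As XmlNode = doc.SelectSingleNode("item/title")
        Return title.FirstChild.Value
    End Function

    Sub Main()
        ' Hypothetical sample document, not part of the real feed.
        Dim doc As New XmlDocument()
        doc.LoadXml("<item><title>My First Blog Post</title></item>")

        Dim title As XmlNode = doc.SelectSingleNode("item/title")

        ' An element node has no value of its own, so Value is Nothing.
        Console.WriteLine("Value is Nothing: {0}", title.Value Is Nothing)

        ' The text lives in a child text node, which does have a Value.
        Console.WriteLine("Child node type: {0}", title.FirstChild.NodeType)
        Console.WriteLine("Child value: {0}", TitleViaTextNode(doc))

        ' InnerText collects the text of the child nodes in one step.
        Console.WriteLine("InnerText: {0}", TitleViaInnerText(doc))

        Console.ReadLine()
    End Sub

End Module
```

Both helpers hand back the same string here. InnerText is simply the shorter route when all you hold is the element.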

Run the application now and you’ll see the output in Figure 15-4.

Figure 15-4. Using XPath to iterate through a bunch of specific nodes is easy.
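If you are offline, or the feed has moved, you can still try the same idea against a string. The following is only a sketch: the feed content, the module name, and the GetItemTitles() helper are made up for illustration. It selects each <item> first and then asks every item for its own <title> with SelectSingleNode(), which is one way to keep the channel’s title out of the results.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module OfflineFeedDemo

    ' A tiny hand-written stand-in for the real feed (hypothetical content).
    Const SampleFeed As String = _
        "<rss version=""2.0""><channel>" & _
        "<title>Sample feed</title>" & _
        "<item><title>First post</title><link>http://example.com/1</link></item>" & _
        "<item><title>Second post</title><link>http://example.com/2</link></item>" & _
        "</channel></rss>"

    ' Selects every <item>, then reads the <title> child relative to each item.
    Function GetItemTitles(ByVal rssXml As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(rssXml)

        Dim titles As New List(Of String)
        For Each item As XmlNode In doc.SelectNodes("rss/channel/item")
            Dim title As XmlNode = item.SelectSingleNode("title")
            ' Skip items that have no <title> child at all.
            If title IsNot Nothing Then
                titles.Add(title.InnerText)
            End If
        Next
        Return titles
    End Function

    Sub Main()
        For Each title As String In GetItemTitles(SampleFeed)
            Console.WriteLine(title)
        Next
        Console.ReadLine()
    End Sub

End Module
```

Working item by item costs a little more code than the single "rss/channel/item/title" query, but it leaves you holding the whole item if you later want its link as well.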

Try It Out: Using XPathNavigator

Again, to focus on the code, we’ll stick with a console application. Start a new console application project in Visual Basic 2005 Express. Because we’re still working with XML, don’t forget to add an Imports System.Xml statement to the top of Module1.vb when the code editor appears.


The first thing you are going to need is an XML document. Because the XmlDocument class can load XML data from a string, you’ll build a document right inside a string. Go ahead and add this code to the top of the Main() procedure in Module1.vb:

Sub Main()
    Dim xml As String = _
        "<Order>" + _
        "<Item>" + _
        "<Description>Some widget part</Description>" + _
        "<Price>12.99</Price>" + _
        "</Item>" + _
        "<Item>" + _
        "<Description>Another widget</Description>" + _
        "<Price>50.12</Price>" + _
        "</Item>" + _
        "</Order>"
End Sub

So, this XML document represents the canonical Order-Items example. It contains a single order that is composed of a couple of items, each with a price and a description. What we’d like to do is sum the prices with an XPath query to get a total order price. The first step of course is going to be creating an XmlDocument object and loading the XML into it:

Sub Main()
    Dim xml As String = _
        "<Order>" + _
        "<Item>" + _
        "<Description>Some widget part</Description>" + _
        "<Price>12.99</Price>" + _
        "</Item>" + _
        "<Item>" + _
        "<Description>Another widget</Description>" + _
        "<Price>50.12</Price>" + _
        "</Item>" + _
        "</Order>"

    Dim doc As New XmlDocument()
    doc.LoadXml(xml)
End Sub


Now, to run complex XPath statements (that is, statements that do more than just grab a node by name), you’ll need an XPathNavigator object. XPathNavigator is a special class that does nothing more than allow you to move around the results of an XPath query, and also evaluate numeric queries. It’s created from an XmlNode object. For example, if you wanted to run a query on a specific part of a document, you could select the node to start from and then create a navigator on it to run the query on that node and its children. It’s also okay to just get a navigator from the document itself, which is what you’ll do here. XPathNavigator, though, is part of the System.Xml.XPath namespace, so you’ll also need to add an Imports statement for that to the head of the Module1.vb file:

Imports System.Xml
Imports System.Xml.XPath

Module Module1

    Sub Main()
        Dim xml As String = _
            "<Order>" + _
            "<Item>" + _
            "<Description>Some widget part</Description>" + _
            "<Price>12.99</Price>" + _
            "</Item>" + _
            "<Item>" + _
            "<Description>Another widget</Description>" + _
            "<Price>50.12</Price>" + _
            "</Item>" + _
            "</Order>"

        Dim doc As New XmlDocument()
        doc.LoadXml(xml)

        Dim nav As XPathNavigator = doc.CreateNavigator()
    End Sub

End Module
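To make the idea of a navigator created on a specific node a little more concrete, here is a rough sketch. The two-order document, the id attribute, and the SumForOrder() helper are all hypothetical; I’ve added them only so that there is more than one place to start from. The navigator is created on a single <Order> node, so the relative query sum(Item/Price) sees only that order’s items.

```vb
Imports System
Imports System.Xml
Imports System.Xml.XPath

Module ScopedNavigatorDemo

    ' Hypothetical document holding two orders, so there is something to scope to.
    Const OrdersXml As String = _
        "<Orders>" & _
        "<Order id=""A"">" & _
        "<Item><Price>12.99</Price></Item>" & _
        "<Item><Price>50.12</Price></Item>" & _
        "</Order>" & _
        "<Order id=""B"">" & _
        "<Item><Price>5.00</Price></Item>" & _
        "</Order>" & _
        "</Orders>"

    ' Sums the prices of one order by creating a navigator on that order's node.
    Function SumForOrder(ByVal doc As XmlDocument, ByVal orderId As String) As Double
        Dim order As XmlNode = _
            doc.SelectSingleNode("Orders/Order[@id='" & orderId & "']")
        ' This navigator starts at the selected <Order>, not at the document.
        Dim nav As XPathNavigator = order.CreateNavigator()
        Return CDbl(nav.Evaluate("sum(Item/Price)"))
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml(OrdersXml)

        Console.WriteLine("Order A: {0:F2}", SumForOrder(doc, "A"))
        Console.WriteLine("Order B: {0:F2}", SumForOrder(doc, "B"))

        ' A navigator on the document itself sees every order.
        Dim all As Double = _
            CDbl(doc.CreateNavigator().Evaluate("sum(Orders/Order/Item/Price)"))
        Console.WriteLine("All orders: {0:F2}", all)

        Console.ReadLine()
    End Sub

End Module
```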


All that you need now is to evaluate an XPath query. What you want to do is sum the values in the Order/Item/Price nodes. That’s easy. You can simply use the XPath statement sum(Order/Item/Price) to do just that. The XPathNavigator class has a special method called Evaluate() that you can call to evaluate queries like this. Let’s go ahead and call it, outputting the result to the console:

Sub Main()
    Dim xml As String = _
        "<Order>" + _
        "<Item>" + _
        "<Description>Some widget part</Description>" + _
        "<Price>12.99</Price>" + _
        "</Item>" + _
        "<Item>" + _
        "<Description>Another widget</Description>" + _
        "<Price>50.12</Price>" + _
        "</Item>" + _
        "</Order>"

    Dim doc As New XmlDocument()
    doc.LoadXml(xml)

    Dim nav As XPathNavigator = doc.CreateNavigator()
    Console.WriteLine("Total price for this order is ${0}", _
        nav.Evaluate("sum(Order/Item/Price)"))

    Console.ReadLine()
End Sub

Run the program now and you’ll see the output shown in Figure 15-5.


Figure 15-5. XPath queries let us easily perform aggregations on data inside an XML document.
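One detail worth knowing: Evaluate() returns an Object, and for a numeric expression such as sum() or count() that object holds a Double. If you want to do arithmetic with the result, or control how many decimal places are printed, you can convert it first. The sketch below assumes the same order document; the module and the OrderTotal(), ItemCount(), and FormatTotal() helpers are names I’ve made up for the example.

```vb
Imports System
Imports System.Globalization
Imports System.Xml
Imports System.Xml.XPath

Module TypedEvaluateDemo

    Const OrderXml As String = _
        "<Order>" & _
        "<Item><Description>Some widget part</Description><Price>12.99</Price></Item>" & _
        "<Item><Description>Another widget</Description><Price>50.12</Price></Item>" & _
        "</Order>"

    ' Builds a navigator over the supplied XML text.
    Function NavigatorFor(ByVal xmlText As String) As XPathNavigator
        Dim doc As New XmlDocument()
        doc.LoadXml(xmlText)
        Return doc.CreateNavigator()
    End Function

    ' Evaluate() returns Object; a numeric XPath result arrives as a Double.
    Function OrderTotal(ByVal xmlText As String) As Double
        Return CDbl(NavigatorFor(xmlText).Evaluate("sum(Order/Item/Price)"))
    End Function

    ' count() is numeric too, so convert it on the way to an Integer.
    Function ItemCount(ByVal xmlText As String) As Integer
        Return CInt(CDbl(NavigatorFor(xmlText).Evaluate("count(Order/Item)")))
    End Function

    ' Formats with two decimal places, independent of the machine's regional settings.
    Function FormatTotal(ByVal total As Double) As String
        Return total.ToString("F2", CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        Console.WriteLine("{0} items, total price ${1}", _
            ItemCount(OrderXml), FormatTotal(OrderTotal(OrderXml)))
        Console.ReadLine()
    End Sub

End Module
```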

The XPathNavigator object can do more than just evaluate aggregate XPath expressions. We can also pass fairly complex conditions to the navigator to grab nodes. For example, let’s modify the code to print out the price of every item in the order that costs more than 10.00.

Change the code after the creation of the XPathNavigator to look like the following highlighted code:

Sub Main()
    Dim xml As String = _
        "<Order>" + _
        "<Item>" + _
        "<Description>Some widget part</Description>" + _
        "<Price>12.99</Price>" + _
        "</Item>" + _
        "<Item>" + _
        "<Description>Another widget</Description>" + _
        "<Price>50.12</Price>" + _
        "</Item>" + _
        "</Order>"

    Dim doc As New XmlDocument()
    doc.LoadXml(xml)

    Dim nav As XPathNavigator = doc.CreateNavigator()
    Dim nodes As XPathNodeIterator = _
        nav.Select("/Order/Item[Price>10]/Price")

    While nodes.MoveNext()
        Console.WriteLine("Price is {0}", _
            nodes.Current.Value)
    End While

    Console.ReadLine()
End Sub

When you run the program this time, it will print out any price greater than 10 in value. In this example that is actually both of them, so feel free to change the number 10 in the XPath query to prove that the code really does work. You can see the output in Figure 15-6.

Figure 15-6. We can use complex XPath statements to grab only a subset of all the nodes in the document.

I’ll talk you through the code. Just like XmlDocument, XPathNavigator has a Select() method that you can call to get at a bunch of nodes. It returns a type called XPathNodeIterator that lets you move through the resulting nodes one by one. It has two things we are interested in: a MoveNext() method to move from result node to result node, and a Current property to get at the current node.

After grabbing an iterator then, the next thing the code does is set up a while loop to repeatedly call MoveNext() to move through the nodes.

Within the loop you just grab the XPathNodeIterator’s Current.Value property to get at the prices you want.
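The While loop is not the only way to walk the results. XPathNodeIterator can also be enumerated, so a For Each loop works as well, with each pass handing you an XPathNavigator positioned on one result. Here is a small sketch of that alternative; the module name and the SelectValues() helper are my own and not part of the framework.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml
Imports System.Xml.XPath

Module IteratorForEachDemo

    Const OrderXml As String = _
        "<Order>" & _
        "<Item><Description>Some widget part</Description><Price>12.99</Price></Item>" & _
        "<Item><Description>Another widget</Description><Price>50.12</Price></Item>" & _
        "</Order>"

    ' Runs an XPath query and collects the value of every matching node.
    Function SelectValues(ByVal xmlText As String, ByVal query As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(xmlText)
        Dim nav As XPathNavigator = doc.CreateNavigator()

        Dim values As New List(Of String)
        ' Each pass yields a navigator positioned on one result node.
        For Each current As XPathNavigator In nav.Select(query)
            values.Add(current.Value)
        Next
        Return values
    End Function

    Sub Main()
        For Each price As String In SelectValues(OrderXml, "/Order/Item[Price>10]/Price")
            Console.WriteLine("Price is {0}", price)
        Next
        Console.ReadLine()
    End Sub

End Module
```

Which form you use is largely a matter of taste. The explicit MoveNext() loop makes the iterator’s behavior more visible, which is why I’ve stuck with it in the walkthrough.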


The XPath query itself is interesting. As I mentioned, there’s a lot to XPath, and it’s worth consulting the online reference for it, but the kind of query being done here is very common so I’ll explain what it does. The query string itself is /Order/Item[Price>10]/Price. Working from right to left, this says that we want the Price element of every Item that has a child element called Price with a value greater than 10, of every Order.
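One way to convince yourself of how the query is put together is to run it a piece at a time and count what each piece matches. The sketch below does that with the Count property of the iterator. Note that I’ve raised the threshold to 20, which is my own choice rather than the value used in the walkthrough, so that the condition actually filters something out. CountMatches() is a hypothetical helper.

```vb
Imports System
Imports System.Xml
Imports System.Xml.XPath

Module QueryStepsDemo

    Const OrderXml As String = _
        "<Order>" & _
        "<Item><Description>Some widget part</Description><Price>12.99</Price></Item>" & _
        "<Item><Description>Another widget</Description><Price>50.12</Price></Item>" & _
        "</Order>"

    ' Returns how many nodes the given XPath query selects in the document.
    Function CountMatches(ByVal xmlText As String, ByVal query As String) As Integer
        Dim doc As New XmlDocument()
        doc.LoadXml(xmlText)
        Dim nodes As XPathNodeIterator = doc.CreateNavigator().Select(query)
        Return nodes.Count
    End Function

    Sub Main()
        ' The same query, built up one step at a time (threshold of 20 is assumed).
        Dim queries() As String = { _
            "/Order", _
            "/Order/Item", _
            "/Order/Item[Price>20]", _
            "/Order/Item[Price>20]/Price"}

        For Each query As String In queries
            Console.WriteLine("{0} matches {1} node(s)", _
                query, CountMatches(OrderXml, query))
        Next
        Console.ReadLine()
    End Sub

End Module
```

The bracketed condition narrows the set of Item elements, and the final step simply picks a child out of whichever items survived.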

Let’s change it. Grabbing the prices is great, but really we are going to be more inclined to want to know the description of the item itself. All you need to do is change the query to return the Description element instead of the price, like this:

Sub Main()
    Dim xml As String = _
        "<Order>" + _
        "<Item>" + _
        "<Description>Some widget part</Description>" + _
        "<Price>12.99</Price>" + _
        "</Item>" + _
        "<Item>" + _
        "<Description>Another widget</Description>" + _
        "<Price>50.12</Price>" + _
        "</Item>" + _
        "</Order>"

    Dim doc As New XmlDocument()
    doc.LoadXml(xml)

    Dim nav As XPathNavigator = doc.CreateNavigator()
    Dim nodes As XPathNodeIterator = _
        nav.Select("/Order/Item[Price>10]/Description")

    While nodes.MoveNext()
        Console.WriteLine("Item {0} has a price greater than 10", _
            nodes.Current.Value)
    End While

    Console.ReadLine()
End Sub


Now when you run the application, you’ll see the output changes to list the description of all items with a price greater than 10 (or whatever you’ve subsequently changed the code to report). You can see this in Figure 15-7.

Figure 15-7. It’s easy to select specific nodes based on conditions against other ones.

Aside from selecting nodes and evaluating expressions, XPathNavigator also lets you move around a document (hence its name) with a handy set of MoveTo methods, such as MoveToNext(), MoveToFirstChild(), MoveToPrevious(), and so on. In fact, you can even use the Insert methods on the navigator to add nodes to a document. In all fairness, though, very few of you will ever do this, for a couple of reasons. First, XmlReader and XmlWriter (which we’ll look at in a moment) are generally easier to work with, although they do add some code overhead. Second, most people don’t need to do that much complex stuff with XML beyond reading and understanding a document in its entirety, or writing a new document based on an object in code. We’ll cover all those things in the sections that follow.

If, however, you are a hard-core XML head, I strongly encourage you to read up on the methods in XPathNavigator and of course the XML and XPath references online. .NET’s support for XML is really complete enough to fill an entire book all its own.
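To give you a taste of that navigation before we move on, here is a brief sketch that walks the order document by hand instead of with a query. It is only an illustration: the module and the ListFirstChildren() helper are hypothetical names, and the walk assumes the exact document shape used in this chapter (an Order holding Items, each with Description as its first child).

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml
Imports System.Xml.XPath

Module MoveDemo

    Const OrderXml As String = _
        "<Order>" & _
        "<Item><Description>Some widget part</Description><Price>12.99</Price></Item>" & _
        "<Item><Description>Another widget</Description><Price>50.12</Price></Item>" & _
        "</Order>"

    ' Visits every <Item> and records the name and text of its first child.
    Function ListFirstChildren(ByVal xmlText As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(xmlText)
        Dim nav As XPathNavigator = doc.CreateNavigator()
        Dim result As New List(Of String)

        ' The navigator starts at the document root: step down to <Order>,
        ' then down again to the first <Item>.
        If nav.MoveToFirstChild() AndAlso nav.MoveToFirstChild() Then
            Do
                ' Work on a copy so the main navigator stays on the <Item>.
                Dim child As XPathNavigator = nav.Clone()
                If child.MoveToFirstChild() Then
                    result.Add(child.Name & ": " & child.Value)
                End If
                ' MoveToNext() moves to the next sibling, or returns False at the end.
            Loop While nav.MoveToNext()
        End If

        Return result
    End Function

    Sub Main()
        For Each line As String In ListFirstChildren(OrderXml)
            Console.WriteLine(line)
        Next
        Console.ReadLine()
    End Sub

End Module
```

Compare this with the one-line Select() call earlier and you can see why most people reach for a query first.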

Reading XML Files

If you wanted to read an RSS feed and extract information from it, perhaps a listing of all the titles of articles in the feed, you could iterate through all the nodes in the XML document itself searching for one called item. When you found that node, you could then iterate through all its child nodes searching for one called title and then write even more code to extract the value of that element and print it out. Alternatively, you could craft an XPath expression and extract the nodes that way, as you just saw.

The XmlReader class provides an alternative, and more “.NET,” way of doing things. With the XML reader, you get intuitively named methods for navigating around a document, the ability to extract a document’s contents as native .NET data types (string, int, bool, and so on), and a very lightweight tool for accessing huge amounts of data.
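As a quick preview of what that looks like, here is a minimal sketch that totals the prices from the order document used earlier in the chapter. It reads each Price straight into a Double with ReadElementContentAsDouble() rather than going through a string. The module name and the SumPrices() helper are my own.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module ReaderDemo

    Const OrderXml As String = _
        "<Order>" & _
        "<Item><Description>Some widget part</Description><Price>12.99</Price></Item>" & _
        "<Item><Description>Another widget</Description><Price>50.12</Price></Item>" & _
        "</Order>"

    ' Streams through the XML once and adds up every <Price> element.
    Function SumPrices(ByVal xmlText As String) As Double
        Dim total As Double = 0
        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            ' ReadToFollowing() skips forward to the next element with this name.
            While reader.ReadToFollowing("Price")
                ' Reads the element's text as a Double and moves past the element.
                total += reader.ReadElementContentAsDouble()
            End While
        End Using
        Return total
    End Function

    Sub Main()
        Console.WriteLine("Total price for this order is ${0:F2}", SumPrices(OrderXml))
        Console.ReadLine()
    End Sub

End Module
```

Unlike XmlDocument, the reader never holds the whole document in memory. It only ever looks at the node it is currently positioned on, which is what makes it suitable for very large files.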