Beginning Regular Expressions 2005.pdf

Chapter 24

Escaping Metacharacters

When you want to match a character that is used as a metacharacter in regular expression patterns, you must escape it by preceding it with a backslash character.

The following table summarizes the escaped character combinations in W3C XML Schema and the character that is matched when the escaped character combination is used.

Escaped Character Combination    Character Matched
\n                               Newline
\r                               Carriage return
\\                               \ (backslash)
\|                               | (pipe)
\.                               . (period)
\-                               - (hyphen)
\^                               ^ (caret)
\?                               ?
\*                               *
\+                               +
\(                               (
\)                               )
\[                               [
\]                               ]
\{                               {
\}                               }
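The effect of escaping can be tried out quickly in Java. The following is a minimal sketch that uses the java.util.regex package rather than a W3C XML Schema processor, on the assumption that these particular escapes behave the same way in both. The class name EscapeDemo and the helper matchesWhole are made up for illustration. Note that each backslash in the pattern must be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: true only when the whole input matches the pattern,
    // which mirrors how a pattern facet is applied to a whole value.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Unescaped period is a metacharacter and matches any character.
        System.out.println(matchesWhole("a.c", "abc"));          // true
        // Escaped period matches only a literal period.
        System.out.println(matchesWhole("a\\.c", "abc"));        // false
        System.out.println(matchesWhole("a\\.c", "a.c"));        // true
        // Escaped parentheses match literal parentheses instead of grouping.
        System.out.println(matchesWhole("\\(\\d+\\)", "(42)"));  // true
        // The regex \\ matches one literal backslash.
        System.out.println(matchesWhole("C:\\\\temp", "C:\\temp")); // true
    }
}
```

The pattern the regular expression engine sees is a\.c, not a\\.c. The extra backslash exists only to satisfy the Java compiler.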

 

 

Exercises

1. Modify Name.xsd so that the file Name2.xml, shown here, can be validated against it. Notice that the last two Name elements have content that does not match the existing pattern, \w+\s+\w+. A solution is provided in the file Name2.xsd, as indicated by the value of the Names element’s xsi:noNamespaceSchemaLocation attribute:

<?xml version="1.0" encoding="UTF-8"?>
<Names xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="C:\BRegExp\Ch24\Name2.xsd">
  <Name>John Smith</Name>
  <Name>Alicia Manton</Name>
  <Name>Pierre Laval</Name>
  <Name>Maria Von Trapp</Name>
  <Name>John James Manton</Name>
</Names>
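Before changing the schema, it can help to confirm which values the existing pattern rejects. The following sketch does that in Java rather than with a schema validator. It assumes that \w and \s behave the same way for these plain ASCII names in both environments, and the class name NamePatternCheck and the method nonMatching are invented for this illustration. It does not supply the solution to the exercise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class NamePatternCheck {
    // The existing pattern from Name.xsd, with backslashes doubled for Java.
    static final Pattern EXISTING = Pattern.compile("\\w+\\s+\\w+");

    // Returns the names that the existing pattern does not match in full.
    static List<String> nonMatching(List<String> names) {
        List<String> failures = new ArrayList<>();
        for (String name : names) {
            if (!EXISTING.matcher(name).matches()) {
                failures.add(name);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "John Smith", "Alicia Manton", "Pierre Laval",
            "Maria Von Trapp", "John James Manton");
        // Prints [Maria Von Trapp, John James Manton]
        System.out.println(nonMatching(names));
    }
}
```

The two names that fail each contain three words, so the pattern runs out after the second word and the whole-value match fails.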

2. Specify a pattern using Unicode character classes that will match the following part numbers:

A99

BC9933

DEF88125

Z1

A sample document, PartNumbers.xml, is shown here for convenience:

<?xml version="1.0" encoding="UTF-8"?>
<PartNumbers xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="C:\BRegExp\Ch24\PartNumbers.xsd">
  <PartNumber>A99</PartNumber>
  <PartNumber>BC9933</PartNumber>
  <PartNumber>DEF88125</PartNumber>
  <PartNumber>Z1</PartNumber>
</PartNumbers>


Chapter 25

Regular Expressions in Java

Java is a widely used programming language that runs on a variety of platforms in addition to Windows. Several packages written in or for Java support regular expression functionality. However, because the java.util.regex package is now part of Java 2 and offers a broad range of functionality, this chapter focuses only on the java.util.regex package, which is part of the official Sun Java downloads.

The regular expression support in Java allows validation of text, as well as searching and replacement of text. Java supports a particularly rich range of character classes, including standard regular expression character classes, POSIX character classes, and Unicode character classes. Other aspects of the java.util.regex package also provide rich functionality.
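As a first taste of those three uses, here is a minimal sketch built on the Pattern and Matcher classes of java.util.regex. The class name RegexTour, the method names, and the sample patterns are all invented for illustration. The last two methods contrast a POSIX character class, \p{Upper}, which by default covers only US-ASCII uppercase letters, with the Unicode character class \p{Lu}.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTour {
    // Validation: the whole string must consist of exactly five digits.
    static boolean isFiveDigits(String s) {
        return Pattern.matches("\\d{5}", s);
    }

    // Searching: returns the first run of digits, or null if there is none.
    static String firstNumber(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        return m.find() ? m.group() : null;
    }

    // Replacement: collapses every run of whitespace into a single space.
    static String collapseWhitespace(String text) {
        return Pattern.compile("\\s+").matcher(text).replaceAll(" ");
    }

    // POSIX character class: US-ASCII uppercase letters only (default flags).
    static boolean isAsciiUpper(String s) {
        return Pattern.matches("\\p{Upper}", s);
    }

    // Unicode character class: any uppercase letter in Unicode.
    static boolean isUnicodeUpper(String s) {
        return Pattern.matches("\\p{Lu}", s);
    }

    public static void main(String[] args) {
        System.out.println(isFiveDigits("90210"));                     // true
        System.out.println(firstNumber("Order 66 shipped in 3 days")); // 66
        System.out.println(collapseWhitespace("a  b\t\nc"));           // a b c
        // \u00C9 is the letter E with an acute accent.
        System.out.println(isAsciiUpper("\u00C9"));                    // false
        System.out.println(isUnicodeUpper("\u00C9"));                  // true
    }
}
```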

This chapter assumes that you have at least a basic understanding of Java coding. The examples are intended to demonstrate the use of the regular expression functionality in Java. The examples have deliberately been kept short and simple. If you have programmed in any modern programming language, the Java aspects of the examples in this chapter should be easy to follow. If you have no experience at all in Java, I suggest that you use a book such as Ivor Horton’s Beginning Java 2 (Wrox Press 2002) to provide the necessary foundational information.

In this chapter, you will learn the following:

About the java.util.regex package in Java 2 Standard Edition

The metacharacters supported in the java.util.regex package

How to use many of the metacharacters to match and replace text

How to use methods of the String class to apply regular expression functionality
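To give a flavor of the last item, the String class has methods that accept a regular expression directly, without an explicit Pattern or Matcher object. The sketch below uses three of them: matches(), replaceAll(), and split(). The class name StringRegexMethods, the wrapper method names, and the sample patterns are assumptions made for illustration.

```java
import java.util.Arrays;

class StringRegexMethods {
    // matches() succeeds only if the whole string matches the pattern.
    static boolean looksLikeDate(String s) {
        return s.matches("\\d{4}-\\d{2}-\\d{2}");
    }

    // replaceAll() replaces every match of the pattern.
    static String maskDigits(String s) {
        return s.replaceAll("\\d", "#");
    }

    // split() treats each match of the pattern as a separator.
    static String[] splitOnCommas(String s) {
        return s.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDate("2005-03-14"));   // true
        System.out.println(looksLikeDate("14 March 2005")); // false
        System.out.println(maskDigits("Call 555-1234"));   // Call ###-####
        // Prints [red, green, blue]
        System.out.println(Arrays.toString(splitOnCommas("red , green,blue")));
    }
}
```

These methods are convenient for one-off use. When the same pattern is applied many times, compiling it once with Pattern.compile() avoids recompiling it on every call.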

The examples in this chapter have been tested against Java 5.0. The regular expression functionality in Java 5.0 is essentially unchanged from that previously supported.