Beginning Regular Expressions 2005.pdf

Character Classes

The output is simple. It shows the test string supplied on the command line, the regular expression pattern used, and then either a list of the matches or, if there were none, a message saying that no matches were found:

System.out.println("INPUT: " + TestString);
System.out.println("REGEX: " + regex);
while (m.find())
{
    match = m.group();
    System.out.println("MATCH: " + match);
} // end while
if (match == null) {
    System.out.println("There were no matches.");
} // end if

Try it out with strings containing other uppercase characters as input on the command line.
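The fragment above leaves out the setup of m, match, regex, and TestString. The following is a minimal self-contained sketch of the same reporting logic, not the book's full listing. The class name MatchReport, the helper findMatches, the default input, and the pattern [A-Z] are all assumptions made for illustration; the chapter's actual pattern is not shown in this excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchReport {

    // Hypothetical helper: collects every match of regex in input, in order.
    static List<String> findMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumption: "[A-Z]" stands in for the chapter's uppercase pattern.
        String regex = "[A-Z]";
        // Use the first command-line argument, or a sample string if none is given.
        String testString = args.length > 0 ? args[0] : "Hello World";

        System.out.println("INPUT: " + testString);
        System.out.println("REGEX: " + regex);

        List<String> matches = findMatches(regex, testString);
        for (String match : matches) {
            System.out.println("MATCH: " + match);
        }
        if (matches.isEmpty()) {
            System.out.println("There were no matches.");
        }
    }
}
```

Collecting the matches in a list makes the "no matches" test explicit, rather than relying on match still being null after the loop.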

POSIX Character Classes

Some regular expression implementations support a very different character class notation: the POSIX character class notation. The POSIX approach uses a naming convention for a number of potentially useful character classes instead of specifying character classes in the way you saw earlier in this chapter. For example, instead of the character class [A-Za-z0-9], where the characters are listed, the POSIX character class uses [:alnum:], where alnum is an abbreviation for alphanumeric. Personally, I prefer the syntax used earlier in this chapter. However, because you may see code that uses POSIX character classes, this section gives brief information about them.

This section uses the [:alnum:] character class as its example.

The POSIX syntax is dependent on locale. The syntax described in this section relates to English-language locales.

The [:alnum:] Character Class

Tools differ in how they implement the [:alnum:] character class. Broadly speaking, it is equivalent to the following character class:

[A-Za-z0-9]

However, there are different interpretations of [:alnum:].
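One concrete case of differing interpretations can be seen in Java, the language used for the code earlier in this chapter. The java.util.regex package does not use the [:alnum:] spelling; its POSIX-style class is written \p{Alnum}. By default that class covers only US-ASCII letters and digits, but with the Pattern.UNICODE_CHARACTER_CLASS flag it also accepts non-ASCII letters and digits. The sketch below is illustrative only; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class AlnumInterpretations {

    // Default mode: \p{Alnum} is US-ASCII only, equivalent to [A-Za-z0-9].
    static final Pattern ASCII_ALNUM = Pattern.compile("\\p{Alnum}");

    // Unicode mode: \p{Alnum} also covers alphabetic characters and digits outside ASCII.
    static final Pattern UNICODE_ALNUM =
            Pattern.compile("\\p{Alnum}", Pattern.UNICODE_CHARACTER_CLASS);

    // True if the whole string is a single ASCII alphanumeric character.
    static boolean isAsciiAlnum(String s) {
        return ASCII_ALNUM.matcher(s).matches();
    }

    // True if the whole string is a single alphanumeric character in Unicode mode.
    static boolean isUnicodeAlnum(String s) {
        return UNICODE_ALNUM.matcher(s).matches();
    }

    public static void main(String[] args) {
        // "\u00E9" is a lowercase e with an acute accent, a non-ASCII letter.
        String[] samples = {"A", "z", "7", "_", "\u00E9"};
        for (String s : samples) {
            System.out.println("'" + s + "' ASCII: " + isAsciiAlnum(s)
                    + ", Unicode: " + isUnicodeAlnum(s));
        }
    }
}
```

The two patterns disagree only on the accented letter: neither accepts the underscore, and both accept A, z, and 7.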


Chapter 5

Try It Out

The [:alnum:] Class in OpenOffice.org Writer

In OpenOffice.org Writer it is necessary to add a ? quantifier (or other quantifier) to successfully use the [:alnum:] character class:

1. Open OpenOffice.org Writer, and open the sample file AlnumTest.txt.

2. Use the Ctrl+F keyboard shortcut to open the Find & Replace dialog box.

3. Check the Regular Expressions and Match Case check boxes, and enter the pattern [:alnum:]? in the Search For text box.

4. Click the Find All button, and inspect the highlighted text, as shown in Figure 5-21, to identify matches for the pattern [:alnum:]?.

Notice that the underscore character, which occurs twice in the final line of text in the sample file, is not matched by the [:alnum:]? pattern.

Figure 5-21

If Step 4 is replaced by clicking the Find button, assuming that the cursor is at the beginning of the test file, the initial uppercase A will be matched, because that is the first matching character.
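The same two behaviors, finding every match versus finding only the first, can be approximated outside OpenOffice.org Writer. The sketch below uses Java's \p{Alnum} class, which needs no trailing quantifier, as a stand-in for Writer's [:alnum:]? pattern. The sample text is hypothetical, because the contents of AlnumTest.txt are not reproduced here, and the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlnumFindDemo {

    // Java's POSIX-style alphanumeric class (US-ASCII letters and digits).
    static final Pattern ALNUM = Pattern.compile("\\p{Alnum}");

    // Rough analogue of the Find button: the first match only, or null if none.
    static String findFirst(String text) {
        Matcher m = ALNUM.matcher(text);
        return m.find() ? m.group() : null;
    }

    // Rough analogue of the Find All button: every match, in order.
    static List<String> findAll(String text) {
        Matcher m = ALNUM.matcher(text);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not the contents of AlnumTest.txt.
        String text = "Ab1 first_second_third";
        System.out.println("First match: " + findFirst(text));
        System.out.println("All matches: " + String.join("", findAll(text)));
    }
}
```

In this sketch the underscores and the space never appear among the matches, which mirrors the behavior described for the final line of the sample file.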
