Beginning Regular Expressions 2005.pdf

Parentheses in Regular Expressions

against the numeric digit 3. There is a match. All the components inside the parentheses have been matched once. Because there is a {2} quantifier, the regular expressions engine next attempts to match the first component inside the parentheses, the literal character A, against the third character of line 1, the uppercase A. There is a match. It next attempts to match the second component inside the parentheses, the metacharacter \d, against the fourth character of line 1, the numeric digit 4. There is a match. The components inside the parentheses have been matched twice, so matching is complete, and all components of the pattern have matched. Therefore, the entire pattern has matched, and the sequence of characters A3A4 is highlighted in PowerGrep.
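The walkthrough above can be reproduced in a short Python sketch. The pattern (A\d){2} is inferred from the description (a literal A followed by \d, grouped and quantified by {2}); the sample line is hypothetical, chosen only so that its first four characters are A3A4; and Python's re module stands in for PowerGrep.

```python
import re

# Pattern inferred from the walkthrough: the group (A\d) must match exactly twice.
pattern = re.compile(r"(A\d){2}")

# Hypothetical sample line whose first four characters are A3A4.
line = "A3A4 is a part number"

match = pattern.search(line)
matched_text = match.group(0) if match else None

# A repeated capturing group retains only the text of its final iteration.
last_group = match.group(1) if match else None

print(matched_text)  # A3A4
print(last_group)    # A4
```

A single occurrence such as A3 on its own would not match, because the {2} quantifier requires the parenthesized components to match twice in succession.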

Matching Literal Parentheses

Because the opening ( and closing ) parenthesis characters have special functions in regular expression patterns, they cannot be used unescaped to match the corresponding literal characters. To match a literal opening parenthesis, escape it with a backslash:

\(

To match a closing parenthesis, the pattern needed is:

\)

Suppose that you want to match the text (Home) in the following:

Tel. 123 456 7890 (Home)

You would use this pattern:

\(Home\)
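As a minimal sketch of the difference escaping makes, the following Python snippet applies both the escaped and the unescaped pattern to the sample text. Using Python's re module here is an assumption made for illustration; the book's examples use other tools.

```python
import re

text = "Tel. 123 456 7890 (Home)"

# Escaped parentheses match the literal ( and ) characters.
literal = re.search(r"\(Home\)", text)

# Unescaped parentheses only group, so the match is the word Home
# without the surrounding parentheses.
grouped = re.search(r"(Home)", text)

print(literal.group(0))  # (Home)
print(grouped.group(0))  # Home
```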

U.S. Telephone Number Example

One practical use for metacharacters that match literal parentheses is in matching sequences of characters that form U.S. telephone numbers.

Several formats can be used for U.S. telephone numbers. For the purpose of this example, assume that the following format is the one that the data source should contain:

(123) 123-4567

A problem definition for that structure could read as follows:

Match an opening parenthesis, followed by three numeric digits, followed by a closing parenthesis, followed by a space character, followed by three numeric digits, followed by a hyphen, followed by four numeric digits.

If you use the \d metacharacter, which matches any numeric digit, the following pattern can be used:

\(\d{3}\) \d{3}-\d{4}
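To show how each clause of the problem definition maps onto a piece of the pattern, here is a sketch that writes the same pattern in Python's verbose mode, one component per line. The names PHONE, COMPACT, and sample are illustrative, and the choice of Python is an assumption.

```python
import re

# The same pattern, spelled out one component per line.
PHONE = re.compile(
    r"""
    \(      # an opening parenthesis
    \d{3}   # three numeric digits
    \)      # a closing parenthesis
    [ ]     # a space character, written as a class since VERBOSE ignores bare spaces
    \d{3}   # three numeric digits
    -       # a hyphen
    \d{4}   # four numeric digits
    """,
    re.VERBOSE,
)

# The compact form given in the text.
COMPACT = re.compile(r"\(\d{3}\) \d{3}-\d{4}")

sample = "(123) 123-4567"
print(PHONE.fullmatch(sample) is not None)    # True
print(COMPACT.fullmatch(sample) is not None)  # True
```

Both forms describe the same sequence of characters; the verbose form is simply easier to check against the written problem definition.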


Chapter 7

Try It Out

Phone Number Example

This example tests the preceding pattern to match the specified U.S. telephone number format. The test file, PhoneNumbers.txt, is shown here:

(987) 133-4477
(123) 876-3456
123-456-7890
(898 123-1234
879) 345-8765

Only the first two telephone numbers correspond to the problem definition previously stated.

1. Open PowerGrep, and enter the regular expression pattern \(\d{3}\) \d{3}-\d{4} in the Search text area.

2. Enter the folder name C:\BRegExp\Ch07 in the Folder text box.

3. Enter the filename PhoneNumbers.txt in the File Mask text box.

4. Click the Search button, and inspect the outcome displayed in the Results area.

Figure 7-3 shows the appearance after Step 4.

Figure 7-3
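For readers without PowerGrep, the same test can be approximated in Python. This is a sketch under stated assumptions: the five lines of PhoneNumbers.txt are held in an in-memory list rather than read from C:\BRegExp\Ch07, and the names PHONE, lines, and matches are illustrative.

```python
import re

PHONE = re.compile(r"\(\d{3}\) \d{3}-\d{4}")

# Contents of PhoneNumbers.txt as listed in the text, held in memory.
lines = [
    "(987) 133-4477",
    "(123) 876-3456",
    "123-456-7890",
    "(898 123-1234",
    "879) 345-8765",
]

# Keep only the lines in which the pattern is found.
matches = [line for line in lines if PHONE.search(line)]

for line in lines:
    status = "match" if PHONE.search(line) else "no match"
    print(f"{line}: {status}")
```

Only the first two lines are reported as matches: the third has no parentheses, the fourth lacks the closing parenthesis, and the fifth lacks the opening one.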

How It Works

First, let’s look at how the test text (987) 133-4477 on the first line matches. Assuming that the regular expression engine is at the position immediately before the opening parenthesis, it attempts to match the \( metacharacter against the opening parenthesis character, (. There is a match. Next, the pattern \d{3} is matched. Because the second, third, and fourth characters of the test string are numeric digits, there is
