Beginning Regular Expressions 2005.pdf

Chapter 12

How It Works

The & metacharacter in the replacement stands for the text matched by the pattern in the Search For text box. In each case in this example, the matched text is the character sequence lk. That character sequence is replaced by itself followed by the character sequence ing, so sulk becomes sulking and milk becomes milking after the replacement. As with any pattern, you must assess whether the pattern is suitable for the test data. If the test data included a word such as walks, it would be changed to walkings.
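The same replacement can be sketched outside Writer. The example below uses Python's re module, which is an assumption of this sketch and not something the chapter covers; in Python, \g<0> in the replacement string refers to the whole match, much as & does in Writer. The function name append_ing is hypothetical.

```python
import re

def append_ing(text: str) -> str:
    """Replace every occurrence of lk with the same text followed by ing."""
    # \g<0> is Python's reference to the entire match, comparable to & in Writer.
    return re.sub(r"lk", r"\g<0>ing", text)

for word in ("sulk", "milk", "walks"):
    print(word, "->", append_ing(word))
```

The walks case shows why the pattern has to be checked against the test data: the match is made on lk alone, so the trailing s is left after the inserted ing.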

Lookahead and Lookbehind

Neither lookahead nor lookbehind is supported in OpenOffice.org Writer.

Search Example

The following search example finds occurrences of the words (strictly, the character sequences) Heaven and Hell in the same sentence.

The sample file, Heaven.txt, is shown here:

This sentence contains both the words Heaven and Hell.

This sentence does not contain those two words and therefore is not matched.

This paragraph has Heaven in the first sentence. And Hell in the second.

The problem definition can be expressed as follows:

Match the beginning-of-paragraph position, match zero or more characters, match the character sequence Heaven, match zero or more characters, match the character sequence Hell, match zero or more characters, and match a literal period character.

The pattern ^.*Heaven.*Hell.*\. implements the problem definition.
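To experiment with this pattern outside Writer, a minimal sketch in Python follows. It assumes that each paragraph of Heaven.txt is stored as a single line and uses re.MULTILINE so that ^ matches at the start of every line, standing in for Writer's beginning-of-paragraph position. The names HEAVEN_TXT, PROXIMITY, and find_all are hypothetical.

```python
import re

# Heaven.txt, with each paragraph stored as one line (an assumption of this sketch).
HEAVEN_TXT = (
    "This sentence contains both the words Heaven and Hell.\n"
    "This sentence does not contain those two words and therefore is not matched.\n"
    "This paragraph has Heaven in the first sentence. And Hell in the second.\n"
)

# re.MULTILINE lets ^ match at the start of each line, not only at the start of the text.
PROXIMITY = re.compile(r"^.*Heaven.*Hell.*\.", re.MULTILINE)

def find_all(text: str) -> list[str]:
    """Return every matched span, roughly what Find All would highlight."""
    return [m.group(0) for m in PROXIMITY.finditer(text)]

for hit in find_all(HEAVEN_TXT):
    print(hit)
```

Run against the sample text, this reports the first and third paragraphs in full, including both sentences of the third paragraph.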

Try It Out

Words in Proximity

1. Open OpenOffice.org Writer, and open the test file Heaven.txt.

2. Use the Ctrl+F keyboard shortcut to open the Find & Replace dialog box.

3. Check the Regular Expressions check box.

4. In the Search For text box, enter the pattern ^.*Heaven.*Hell.*\. (the last two characters are a backslash and a period).

5. Click the Find All button, and inspect the results.

Figure 12-10 shows the appearance after Step 5. You may be surprised to see that both sentences in the third paragraph are highlighted as matches. That will be explained in the How It Works section in a moment.


Regular Expressions in StarOffice/OpenOffice.org Writer

Figure 12-10

If you want to match only occurrences in the same sentence, the current pattern is not sufficiently specific. You can use the pattern ^.*Heaven[^.]*Hell.*\. instead.

6. Edit the pattern in the Search For text box to read ^.*Heaven[^.]*Hell.*\. (again ending with a backslash and a period).

7. Click the Find All button, and inspect the results.

Figure 12-11 shows the appearance after Step 7. Notice that now, only the sentence in the first paragraph is highlighted as a match.

8. If you want to match the two words whenever they occur in the same paragraph, you can use an alternative pattern. Edit the pattern in the Search For text box to read ^.*Heaven.*Hell.*$ (ending with the $ metacharacter).

9. Click the Find All button, and inspect the results. In the sample text, the highlighted text after Step 9 is the same as shown in Figure 12-10.
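The paragraph-level pattern from Step 8 can be sketched in Python as follows. The sketch assumes each paragraph is passed in as a single string without line breaks, so ^ and $ correspond to the start and end of the paragraph. The names SAME_PARAGRAPH and in_same_paragraph are hypothetical.

```python
import re

# Anchored at both ends, so no trailing period is required after Hell.
SAME_PARAGRAPH = re.compile(r"^.*Heaven.*Hell.*$")

def in_same_paragraph(paragraph: str) -> bool:
    """True when Heaven is followed, somewhere later in the paragraph, by Hell."""
    return SAME_PARAGRAPH.search(paragraph) is not None

paragraphs = [
    "This sentence contains both the words Heaven and Hell.",
    "This sentence does not contain those two words and therefore is not matched.",
    "This paragraph has Heaven in the first sentence. And Hell in the second.",
]
for p in paragraphs:
    print(in_same_paragraph(p), p)
```

Because the pattern ends with $ and not \., it would also match a paragraph that lacks a final period, which the earlier patterns would not.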


How It Works

The pattern used up to Step 5 is ^.*Heaven.*Hell.*\. (ending with an escaped period). The ^ metacharacter matches the beginning-of-paragraph position. The .* matches zero or more characters, and the Heaven matches the literal character sequence Heaven; the .* matches zero or more characters, and the Hell matches the literal character sequence Hell; the .* matches zero or more characters, and the \. matches a literal period character.

Figure 12-11

The match in the first paragraph is straightforward. However, the match in the third paragraph may be less obvious. The key part of the regular expression is the .* that follows Heaven and precedes Hell. Because the . metacharacter matches any character, including a period, the .* can match the period character that occurs at the end of the first sentence. So the pattern can match Heaven and Hell in two different sentences, as long as there is a period character somewhere after the character sequence Hell. If you delete the final period character in the third paragraph, the pattern ^.*Heaven.*Hell.*\. no longer matches that paragraph.
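A short Python sketch can make the behavior of the middle .* visible. It adds a capturing group around that .* (the group is an addition for illustration, not part of the chapter's pattern) and then repeats the test with the final period removed. The names THIRD, WITH_GROUP, and text_between are hypothetical.

```python
import re
from typing import Optional

THIRD = "This paragraph has Heaven in the first sentence. And Hell in the second."

# Same pattern as in Step 4, with parentheses added around the middle .*
WITH_GROUP = re.compile(r"^.*Heaven(.*)Hell.*\.")

def text_between(paragraph: str) -> Optional[str]:
    """Return what the middle .* matched, or None when the pattern fails."""
    m = WITH_GROUP.search(paragraph)
    return m.group(1) if m else None

print(repr(text_between(THIRD)))       # the captured text includes a period
print(repr(text_between(THIRD[:-1])))  # final period deleted: no match
```

The first call shows that the text between Heaven and Hell includes the period that ends the first sentence. The second call returns None because no period follows Hell once the final one is deleted.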

The pattern in Step 6, ^.*Heaven[^.]*Hell.*\., has the pattern [^.]* between Heaven and Hell. That means that only characters that are not the period character can occur between the character sequences Heaven and Hell. A match is present only when the two character sequences occur in the same sentence, assuming that the period character is not omitted.
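The effect of [^.]* can be checked with a similar Python sketch, applied to one paragraph at a time. Testing paragraph by paragraph is an assumption of this sketch: in Python, [^.] would also match a line break, so the paragraphs are kept as separate strings. The names SAME_SENTENCE and in_same_sentence are hypothetical.

```python
import re

# [^.]* allows any characters except a period between Heaven and Hell.
SAME_SENTENCE = re.compile(r"^.*Heaven[^.]*Hell.*\.")

def in_same_sentence(paragraph: str) -> bool:
    """True when Hell follows Heaven with no period in between."""
    return SAME_SENTENCE.search(paragraph) is not None

paragraphs = [
    "This sentence contains both the words Heaven and Hell.",
    "This sentence does not contain those two words and therefore is not matched.",
    "This paragraph has Heaven in the first sentence. And Hell in the second.",
]
for p in paragraphs:
    print(in_same_sentence(p), p)
```

Only the first paragraph is reported as a match, which agrees with the result described for Step 7.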
