
14

PowerGREP

PowerGREP is a powerful regular expressions tool. It is a commercial Windows product with a graphical user interface (GUI) that implements much of the functionality available with grep, egrep, and similar tools on the Unix and Linux platforms. Further information about PowerGREP is available at www.powergrep.com.

Compared to the findstr utility, PowerGREP avoids the need to learn command string arguments. PowerGREP also has a much more complete implementation of regular expression functionality. In addition, it can carry out replace operations that are beyond the capabilities of findstr.

In this chapter, you will learn the following:

How to use the PowerGREP interface

How to use the extensive range of regular expressions functionality that PowerGREP supports

How to use PowerGREP to perform example search or search-and-replace operations, some across multiple files

Examples in this chapter were checked using PowerGREP version 2.3.1.

The PowerGREP Interface

If you haven’t used PowerGREP before, the appearance on first starting the program will be like that shown in Figure 14-1. If you have used PowerGREP before, the most recently used regular expression pattern, folder choice, and file mask will be displayed. If you have used PowerGREP at all, you will find residual results in the results pane, as you can see in Figure 14-1.

Chapter 14

Figure 14-1

A Simple Find Example

The following example uses the test text, Regex.txt, shown here:

This is regular but not an expression.

Here is a simple regular expression pattern: \d.

Regex is an abbreviation for regular expression.

Some people use the abbreviation regexp.

The plural of regex is regexes.

The test text contains various words that refer to regular expressions. Notice that sometimes the term regular expression is used; sometimes it’s the abbreviation regex; and sometimes the less common abbreviation, regexp, is used.

The problem definition can be stated as follows:

Match any occurrence of the text regular expressions or its abbreviations.

Of course, to meaningfully translate that into a regular expression, you need to refine the problem definition to achieve more precision. One possible refinement is the following:

Match any of the following:

The literal character sequence regular expression

The literal character sequence regex

The literal character sequence regexp

There are various options for expressing this as a regular expression. One option is simple alternation:

(regular expression|regex|regexp)


This option has the advantage of simplicity and readability.

Another option, exploiting the common characters among the desired matches, is as follows:

reg(ular expression|ex|exp)

It’s slightly shorter but arguably less readable.

If you wish for maximum succinctness, you could use the following:

reg(ular expression|(ex)p?)

However, again, readability is less than with the longer simple alternation option.
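The claim that the three patterns are interchangeable spellings of the same problem definition can be checked outside PowerGREP. The following is a minimal sketch in Python, which is not the tool used in this chapter; it assumes Python's re module as a stand-in regular expression engine, and the names PATTERNS, TARGETS, and accepts_all are invented for the illustration.

```python
import re

# The three candidate patterns discussed above.
PATTERNS = [
    r"(regular expression|regex|regexp)",
    r"reg(ular expression|ex|exp)",
    r"reg(ular expression|(ex)p?)",
]

# The three literal character sequences from the refined problem definition.
TARGETS = ["regular expression", "regex", "regexp"]


def accepts_all(pattern, targets):
    """Return True if the pattern can match every target string in full."""
    return all(re.fullmatch(pattern, target) is not None for target in targets)


results = {pattern: accepts_all(pattern, TARGETS) for pattern in PATTERNS}
for pattern, ok in results.items():
    print(f"{pattern!r} accepts all three targets: {ok}")
```

Each pattern accepts all three character sequences when it is required to match a whole string. This says nothing yet about which alternative is chosen when the pattern is searched for inside longer text.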

Try It Out

Simple Find

1. Open PowerGREP, and in the Search text area, type (regular expression|regex|regexp).

2. Ensure that the Regular Expressions check box is checked.

3. In the Folder text box, type C:\BRegExp\Ch14, assuming that you downloaded the code file to the C: drive and unzipped it into the BRegExp directory. Adjust accordingly if you downloaded and unzipped it to another location.

4. In the File Mask text area, type Regex.txt, and click the Search button.

5. Inspect the results, as shown in Figure 14-2. Notice that there are six matches. If you compare the content of Regex.txt with results displayed in PowerGREP, you will see that all occurrences of regular expression, regex, or regexp have been matched.

6. In the Search text area, type the alternate regular expression, reg(ular expression|ex|exp), and inspect the results.

7. In the Search text area, type the alternate regular expression, reg(ular expression|(ex)p?), and inspect the results.

Figure 14-2


The results in the results pane should be identical to those shown in Figure 14-2. The regular expression pattern in the Search area is, of course, different, as described in Steps 6 and 7.
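If you want to cross-check the count of six matches without PowerGREP, the search can be approximated in Python. This is only a sketch under stated assumptions: the test text is reproduced with a blank line between each sentence (which is what the line numbers used in this chapter suggest), and matching is case insensitive (assumed because Regex, with a capital R, is reported as a match). The names LINES, TEXT, and count_matches are invented for the illustration.

```python
import re

# Regex.txt as shown earlier; joining with a blank line between sentences
# is an assumption based on the line numbers 1, 3, 5, 7, and 9 used in the chapter.
LINES = [
    "This is regular but not an expression.",
    r"Here is a simple regular expression pattern: \d.",
    "Regex is an abbreviation for regular expression.",
    "Some people use the abbreviation regexp.",
    "The plural of regex is regexes.",
]
TEXT = "\n\n".join(LINES)

PATTERNS = [
    r"(regular expression|regex|regexp)",
    r"reg(ular expression|ex|exp)",
    r"reg(ular expression|(ex)p?)",
]


def count_matches(pattern, text):
    """Count non-overlapping matches, ignoring case (an assumption)."""
    return sum(1 for _ in re.finditer(pattern, text, flags=re.IGNORECASE))


counts = {pattern: count_matches(pattern, TEXT) for pattern in PATTERNS}
for pattern, count in counts.items():
    print(f"{pattern!r}: {count} matches")
```

Under these assumptions, each of the three patterns reports six matches in the test text.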

How It Works

Look at the first regular expression, (regular expression|regex|regexp). Matching is achieved in a straightforward way, because you have three literal strings to be matched, each of which is an option. The regular expression engine first attempts to match regular expression; if that’s unsuccessful, it attempts to match regex; if that’s unsuccessful, it attempts to match regexp.

On Line 1, the character sequences regular and expression both occur, but there are intervening characters, so the pattern does not match.

On Line 3 and Line 5, regular expression is matched.

On Line 5 (once), on Line 7 (once), and on Line 9 (twice) regex is matched.

The pattern regexp is never matched (see the comment at the end of this section).
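The line-by-line behavior just described can be reproduced with a short Python sketch. It is an approximation, not PowerGREP itself: it assumes the blank-line layout implied by the line numbers above and case-insensitive matching, and the names LINES, TEXT, and matches_by_line are invented for the illustration.

```python
import re

# Regex.txt, with a blank line assumed between sentences so that the
# sentences fall on Lines 1, 3, 5, 7, and 9.
LINES = [
    "This is regular but not an expression.",
    r"Here is a simple regular expression pattern: \d.",
    "Regex is an abbreviation for regular expression.",
    "Some people use the abbreviation regexp.",
    "The plural of regex is regexes.",
]
TEXT = "\n\n".join(LINES)


def matches_by_line(pattern, text):
    """Map each 1-based line number to the list of texts matched on that line."""
    found = {}
    for number, line in enumerate(text.split("\n"), start=1):
        hits = [m.group(0) for m in re.finditer(pattern, line, flags=re.IGNORECASE)]
        if hits:
            found[number] = hits
    return found


report = matches_by_line(r"(regular expression|regex|regexp)", TEXT)
for number, hits in report.items():
    print(f"Line {number}: {hits}")
```

Because the options are tried from left to right and regex succeeds wherever regexp could, the third option never gets a chance to match, so regexp does not appear among the matched texts.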

Now look at matching with the first alternate regular expression, reg(ular expression|ex|exp), and with the second alternate regular expression, reg(ular expression|(ex)p?).

On Line 1, the character sequence reg matches, but none of the options can be matched against the characters that follow it in the test text, so there is no match on Line 1.

On Line 3 and Line 5, the character sequence reg matches; therefore, the options are tested. The first option, ular expression, matches on those lines for both patterns.

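On Line 5 (once), Line 7 (once), and Line 9 (twice), the character sequence reg matches. The first option, ular expression, doesn’t match, but the second option, ex or (ex), does match, so the character sequence regex is matched on each of those lines. With the second alternate pattern, the optional p? then also consumes the p that follows regex on Line 7, so the text matched there is regexp.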
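To see exactly which characters each pattern consumes on Line 7, you can compare the three patterns side by side. The sketch below uses Python's re module as a stand-in engine, so it illustrates the ordering of alternatives rather than PowerGREP's own output; LINE_7, PATTERNS, and matched are names invented for the illustration.

```python
import re

# Line 7 of the test text.
LINE_7 = "Some people use the abbreviation regexp."

PATTERNS = [
    r"(regular expression|regex|regexp)",
    r"reg(ular expression|ex|exp)",
    r"reg(ular expression|(ex)p?)",
]

# Record the exact text matched by the first match of each pattern.
matched = {pattern: re.search(pattern, LINE_7).group(0) for pattern in PATTERNS}
for pattern, text in matched.items():
    print(f"{pattern!r} matched {text!r}")
```

In the first two patterns the shorter option (regex or ex) is listed before the longer one, so it wins. In the third pattern the greedy p? extends the match by one character.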

There is a flaw in the matching strategy in this example. If you spotted it as you worked through the example, you will have the opportunity to correct the problem in an exercise later in this chapter.

The Replace Tab

Among the tabs in PowerGREP is the Replace tab, which allows the user to define how text replacement is to take place.

Figure 14-3 shows the Replace tab just after the example in the preceding section has run. Notice that the results from the Find tab are still displayed. That can be useful because, for example, the results from the Find tab allow you to see what matches and, therefore, what may be changed.

The following exercise tests the possible replacement of any occurrence of the character sequences regular expression, regex, or regexp with the character sequence regex.
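Before trying the replacement in PowerGREP, it may help to preview the intended effect in code. The following Python sketch is not how PowerGREP performs the replacement; it assumes re.sub as a stand-in, case-insensitive matching, and the blank-line layout of the test text used earlier. The names LINES, TEXT, and to_regex are invented for the illustration.

```python
import re

# Regex.txt, with a blank line assumed between sentences.
LINES = [
    "This is regular but not an expression.",
    r"Here is a simple regular expression pattern: \d.",
    "Regex is an abbreviation for regular expression.",
    "Some people use the abbreviation regexp.",
    "The plural of regex is regexes.",
]
TEXT = "\n\n".join(LINES)


def to_regex(text):
    """Replace each match of the simple alternation with the literal text regex."""
    pattern = r"(regular expression|regex|regexp)"
    return re.sub(pattern, "regex", text, flags=re.IGNORECASE)


replaced = to_regex(TEXT)
print(replaced)
```

Comparing the printed text with the original shows which lines a replacement changes and which it leaves as they were.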
