Beginning Regular Expressions 2005.pdf

13

Regular Expressions Using findstr

The findstr utility is a command-line tool that finds files containing a particular string pattern. It allows searches similar to those carried out using Windows Search but also supports more specific searches that use regular expressions.

The findstr utility makes use of parameters supplied on the command line, as well as some standard and nonstandard regular expression syntax.

In this chapter, you will learn the following:

How to use findstr from the command line

How to use the regular expression metacharacters supported by findstr

Introducing findstr

The findstr utility is available on many recent versions of Windows and can be used, for example, from a Windows XP command line without having to set paths or environment variables.

The description of the findstr utility in this chapter is based on findstr in Windows XP Professional.

To confirm that the findstr utility is present and working on your version of Windows, simply type findstr /? at the command prompt. If all is well, you will see a considerable amount of help information scrolling past in the command window. Figure 13-1 shows the final part of the help information to be displayed. Approximately another full screen of help information has scrolled out of sight.


Figure 13-1

You cannot use the following command on its own from the command line, or you will receive a bad command error, as shown in Figure 13-2:

findstr

Figure 13-2

It is essential that you use one or more of the command-line switches and parameters that specify what findstr is to do.

Finding Literal Text

One of the simplest tasks that findstr can be used for is to match literal text. The general form of a findstr command to perform simple literal matching in a single file is as follows:

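findstr "Text" Filename.suffix

Strictly speaking, you supply a regular expression pattern that consists only of literal characters to be matched. Be aware that findstr treats spaces inside the quoted string as separators between alternative search strings, so a phrase containing spaces must be supplied with the /c: switch, for example /c:"Text of interest", if it is to be matched as a single string.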

The test file, Hello.txt, is shown here:

Hello world!

Hello with initial upper-case.

hello with initial lower case.

Goodbye!


Notice that two lines have Hello with an initial uppercase H, and one line has hello with an initial lowercase h.

Try It Out

Finding Literal Text

1. Open a command window, and navigate to the directory into which you downloaded the test file Hello.txt.

2. Type the following command at the command line:

findstr "Hello" Hello.txt

3. Press Return, and inspect the results returned by findstr. Figure 13-3 shows the result. The two lines containing Hello (initial uppercase H) are displayed, while the line containing hello (initial lowercase h) is not. This is because the default behavior of findstr is to match case sensitively.

Notice that the content of two lines is displayed, but no indication of the file they come from or the line number is given. When you use findstr to examine multiple files, that additional information is useful.

Figure 13-3

4. The sample file, Hello.txt, has everything neatly on separate lines, but not all documents are so simply structured. Therefore, it is often useful to have line numbers displayed along with the text on a particular line, because that allows you to scan to roughly the right point in a long document to see what the context is. To display line numbers from findstr, use the /n switch.

Type the following command on the command line, and press Return:

findstr /n "Hello" Hello.txt

5. Inspect the results returned when the /n switch was added to the command. Figure 13-4 shows the result. Notice that the line number is now displayed for each line of the test file that contains matching text.

The findstr utility can sometimes fail to find any matches even though matches exist, particularly when the command line has been recalled using F3 and then edited. If a command unexpectedly returns no results, I suggest that you type the desired command afresh. This, in my experience, fixes the problem.

Figure 13-4


6. If you wish matching to be carried out case insensitively, you can use the /i switch. Type the following command at the command line, and press Return:

findstr /i /n "Hello" Hello.txt

Figure 13-5 shows the results. Notice that all three lines containing Hello or hello are now displayed.

Figure 13-5
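The three commands in this exercise can be approximated away from a Windows command prompt. The following is a minimal Python sketch, not findstr itself: the function find_lines and its parameters ignore_case (standing in for /i) and number_lines (standing in for /n) are hypothetical names chosen for illustration, and the file content is copied from the Hello.txt listing above.

```python
import re

# Contents of the test file Hello.txt, as listed in the chapter.
HELLO_TXT = """Hello world!
Hello with initial upper-case.
hello with initial lower case.
Goodbye!
"""

def find_lines(pattern, text, ignore_case=False, number_lines=False):
    """Return the lines of text that contain the literal pattern.

    ignore_case imitates /i and number_lines imitates /n.
    Both names are illustrative assumptions, not findstr options.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape keeps the pattern literal, as in the chapter's examples.
    regex = re.compile(re.escape(pattern), flags)
    results = []
    for number, line in enumerate(text.splitlines(), start=1):
        if regex.search(line):
            results.append(f"{number}:{line}" if number_lines else line)
    return results

# findstr "Hello" Hello.txt
default_matches = find_lines("Hello", HELLO_TXT)
# findstr /n "Hello" Hello.txt
numbered_matches = find_lines("Hello", HELLO_TXT, number_lines=True)
# findstr /i /n "Hello" Hello.txt
insensitive_matches = find_lines("Hello", HELLO_TXT, ignore_case=True, number_lines=True)

for line in insensitive_matches:
    print(line)
```

As in the exercise, the default search returns two lines, and only the case-insensitive search also returns the line that begins with a lowercase h.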

There are some findstr command-line switches that substitute functionally for regular expressions’ metacharacters. They will be discussed in the relevant place when the supported metacharacters are covered in the next section.

Metacharacters Supported by findstr

The findstr utility supports many regular expression patterns, but perhaps because it is used on the command line, the utility has many nonstandard pieces of regular expression syntax (refer to the following table).

Metacharacter    Meaning

.                Any character
*                Quantifier indicating zero or more occurrences
?                Not supported
+                Not supported
{n,m}            Not supported
^                Beginning-of-line position
$                End-of-line position
[...]            Character class
\<               Beginning-of-word position
\>               End-of-word position
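Because only the * quantifier is available, a pattern that would normally use + has to be written by repeating the character, so that x+ becomes xx*. The following Python sketch illustrates that equivalence and a rough translation of the \< and \> positions. It uses Python's re module, not findstr. The helper findstr_to_python is a hypothetical name, and mapping both \< and \> to \b is an approximation, since \b matches at either end of a word.

```python
import re

def findstr_to_python(pattern):
    """Naively map the findstr word positions \\< and \\> to Python's \\b.

    Assumption: the pattern contains no escaped backslashes before < or >.
    """
    return pattern.replace(r"\<", r"\b").replace(r"\>", r"\b")

samples = ["ht", "hot", "hoot", "hooot"]

# "One or more o" written with +, which findstr does not support.
plus_matches = [s for s in samples if re.fullmatch(r"ho+t", s)]
# The same idea written with only *, which findstr does support.
star_matches = [s for s in samples if re.fullmatch(r"hoo*t", s)]

# A beginning-of-word pattern: "low" only at the start of a word.
word_start = findstr_to_python(r"\<low")
lines = ["lower case", "below zero"]
word_matches = [s for s in lines if re.search(word_start, s)]

print(plus_matches == star_matches)
print(word_matches)
```

The two quantifier forms select the same strings, and the word-position pattern matches lower but not below.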

As noted in the preceding table, some metacharacters are not supported. The following table lists findstr command-line switches that perform functions similar to regular expression metacharacters in many other settings, as well as command-line switches with other meanings. Command-line switches that take arguments are described in a separate table.

Command-Line Switch    Equivalent Metacharacter or Other Meaning

/b          Matches when the following character(s) are at the beginning of a line. Equivalent to the ^ metacharacter.
/e          Matches when the following character(s) are at the end of a line. Equivalent to the $ metacharacter.
/p          Specifies that files containing nonprintable characters are skipped.
/offline    Specifies that files with the offline attribute set are not skipped.
/o          Prints the character offset before each matching line.
/m          Prints only the filename if the file contains a match.
/n          Displays the line number before each line that matches.
/v          Displays only lines that do not contain a match.
/x          Constrains matches to match only if the whole line matches the regular expression. Similar to using the ^ and $ metacharacters in other implementations.
/i          Specifies that regular expression matching is case insensitive. The default matching is case sensitive.
/s          Means that the current directory and all its subdirectories are searched for files that meet the file specification part of the command line.
/r          Specifies that the text inside paired double quotes is to be interpreted as regular expressions. This is the default behavior even if the /r switch is not specified.
/l          Means that search strings are not interpreted as regular expressions. Instead, matching is literal.
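The relationship between the /b, /e, /x, and /v switches and the ^ and $ metacharacters can be sketched in Python. This is an illustration of the idea only, not a reimplementation of findstr: the function filter_lines, its keyword parameters, and the sample lines are all assumptions made for this example.

```python
import re

# Hypothetical sample lines, not taken from the chapter's test file.
LINES = ["Hello world!", "Say Hello", "Hello", "Goodbye!"]

def filter_lines(pattern, lines, begin=False, end=False, exact=False, invert=False):
    """Filter lines the way /b, /e, /x, and /v are described in the table.

    begin ~ /b (anchor with ^), end ~ /e (anchor with $),
    exact ~ /x (anchor with both), invert ~ /v (keep non-matching lines).
    """
    regex_text = re.escape(pattern)
    if begin or exact:
        regex_text = "^" + regex_text
    if end or exact:
        regex_text = regex_text + "$"
    regex = re.compile(regex_text)
    # A line is kept when "it matched" differs from "invert".
    return [line for line in lines if bool(regex.search(line)) != invert]

at_beginning = filter_lines("Hello", LINES, begin=True)
at_end = filter_lines("Hello", LINES, end=True)
whole_line = filter_lines("Hello", LINES, exact=True)
not_matching = filter_lines("Hello", LINES, invert=True)

print(at_beginning)
print(at_end)
print(whole_line)
print(not_matching)
```

Expressing each switch as an anchor added to the same pattern shows why /x behaves like combining /b and /e.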

The following command-line switches each take an argument that affects their behavior:

Command-Line Switch    Description

/f:file              The argument file is the name of a file that contains a list of files to be searched.
/c:string            The argument string is a search string to be used literally.
/g:file              The argument file is the name of a file that contains a list of search strings.
/d:dirlist           The argument dirlist is a semicolon-separated list of directories to be searched.
/a:colorattribute    The argument colorattribute specifies a color attribute using two hexadecimal digits.
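The /c: switch matters most when a search string contains spaces. According to the findstr help text, a quoted argument such as "hello there" is split at the spaces and matches lines containing either word, whereas /c:"hello there" searches for the whole phrase. The following Python sketch imitates that difference. The functions search_strings and find_any, the literal parameter, and the sample lines are hypothetical and serve only as an illustration.

```python
# Hypothetical sample lines used only for this illustration.
LINES = ["hello there", "hello world", "over there", "goodbye"]

def search_strings(argument, literal=False):
    """Split a search argument into the strings to look for.

    literal=True imitates /c:, which keeps the argument as one string.
    """
    return [argument] if literal else argument.split()

def find_any(argument, lines, literal=False):
    """Return the lines that contain at least one of the search strings."""
    needles = search_strings(argument, literal)
    return [line for line in lines if any(needle in line for needle in needles)]

# Imitates: findstr "hello there" file
either_word = find_any("hello there", LINES)
# Imitates: findstr /c:"hello there" file
whole_phrase = find_any("hello there", LINES, literal=True)

print(either_word)
print(whole_phrase)
```

The first search returns every line containing hello or there, while the second returns only the line containing the complete phrase.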

 

 
