Beginning Regular Expressions 2005.pdf

Chapter 26

{
    print "There was a match: '$&'.\n";
}
else
{
    print "There was no match.";
}

2. Save the code as VariableSubstitutionCharClass.pl.

3. Run the code, and inspect the displayed result, as shown in Figure 26-27. Notice that the match is A99.

Figure 26-27

How It Works

This example is similar to NegatedCharacterClass.pl. However, the value for the character class is supplied by variable substitution using the $toBeSubstituted variable. First, a value that would be interpretable between square brackets is assigned to the $toBeSubstituted variable:

my $toBeSubstituted = "A-D";

Then the value assigned to the $myPattern variable uses the $toBeSubstituted variable to specify the character class at the beginning of the pattern:

my $myPattern = "[$toBeSubstituted]\\d{2}";

The remainder of the example follows the code used in NegatedCharacterClass.pl. Because the operative character class is [A-D], the value A99 matches the pattern [A-D]\d{2}.
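The interpolation described above can also be checked without keyboard input. The following is a minimal sketch rather than the book's listing: the @samples list is an assumption made for illustration, and qr// is used in place of a double-quoted string so that the backslash does not need to be doubled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same character range as in the example above.
my $toBeSubstituted = "A-D";

# $toBeSubstituted is interpolated inside the square brackets,
# so the compiled pattern is [A-D]\d{2}.
my $myPattern = qr/[$toBeSubstituted]\d{2}/;

# Hypothetical test strings, chosen only for illustration.
my @samples = ("A99", "E45", "Part D12");

foreach my $sample (@samples)
{
    if ($sample =~ $myPattern)
    {
        print "Matched '$&' in: $sample\n";
    }
    else
    {
        print "No match in: $sample\n";
    }
}
```

With these samples, A99 and D12 are matched, and E45 is not, because E lies outside the range A-D.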

Using Lookahead

Lookahead tests the character sequence that follows some other part of a pattern. Both positive lookahead and negative lookahead are supported in Perl.

The positive lookahead syntax, (?= ... ), specifies what must follow the other component of the regular expression pattern for a match to occur. The characters tested inside the lookahead are not consumed, so they are neither part of the match nor captured.

The negative lookahead syntax, (?! ...), specifies what must not follow another component of the pattern for a match to occur.
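Because a lookahead only inspects the text that follows and does not consume it, a substitution replaces just the part of the pattern outside the lookahead. The following is a minimal sketch; the sample sentence and the replacement word Moon are assumptions invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample text containing Star twice.
my $text = "Star Training and Star Charts";

# Positive lookahead: replace Star only where " Training" follows.
(my $positive = $text) =~ s/Star(?= Training)/Moon/;

# Negative lookahead: replace Star only where " Training" does not follow.
(my $negative = $text) =~ s/Star(?! Training)/Moon/;

print "$positive\n";   # Moon Training and Star Charts
print "$negative\n";   # Star Training and Moon Charts
```

In both results the word Training is left untouched, which shows that the lookahead text was tested but never became part of the match.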


Regular Expressions in Perl

Try It Out

Using Positive Lookahead

1. Type the following code in a text editor, and save it as Lookahead.pl:

#!/usr/bin/perl -w
use strict;

print "Enter a test string here: ";
my $myTestString = <STDIN>;
chomp($myTestString);

# Match Star only when it is followed by a space and Training.
if ($myTestString =~ m/Star(?= Training)/)
{
    print "There was a match which was '$&'.";
}
else
{
    print "There was no match.";
}

2. Run the code. Enter "I work for Star." as the test text, and press Return. Inspect the result.

3. Run the code again. Enter "I work for Star Training." as the test text, and press Return. Inspect the result, as shown in Figure 26-28. Notice that with the test text "I work for Star." there is no match, but with the test text "I work for Star Training." there is a match, which is the character sequence Star.

Figure 26-28

How It Works

The user enters a test string that is assigned to the variable $myTestString:

print "Enter a test string here: ";

my $myTestString = <STDIN>;

The chomp() function removes the trailing newline character:

chomp($myTestString);

The if statement tests whether the value of $myTestString matches the pattern Star(?= Training):

if ($myTestString =~ m/Star(?= Training)/)

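If the character sequence Star is matched (which it is for both test strings), the lookahead, (?= Training), tests whether Star is followed by a space character and then the character sequence Training. In "I work for Star Training." it is, so there is a match. In "I work for Star." the next character is a period, so the lookahead fails and there is no match.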

The following example shows how negative lookahead can be used.


Try It Out

Using Negative Lookahead

1. Type the following code in a text editor, and save it as NegativeLookahead.pl:

#!/usr/bin/perl -w
use strict;

print "Enter a test string here: ";
my $myTestString = <STDIN>;
chomp($myTestString);

# Match Star only when it is not followed by a space and Training.
if ($myTestString =~ m/Star(?! Training)/)
{
    print "There was a match which was '$&'.";
}
else
{
    print "There was no match.";
}

2. Run the code. Enter "I work for Star." as the test text, and press Return. Inspect the result.

3. Run the code again. Enter "I work for Star Training." as the test text, and press Return. Inspect the result, as shown in Figure 26-29. Notice that now the first test string matches and the second does not. For these two strings, negative lookahead produces the opposite result to positive lookahead.

Figure 26-29

How It Works

The key change in the code is that you now use a negative lookahead:

if ($myTestString =~ m/Star(?! Training)/)

When the test string is "I work for Star." there is a match, because the character sequence Star is not followed by a space character and the character sequence Training. However, when the test string is "I work for Star Training." there is no match, because the forbidden sequence follows the only occurrence of Star.
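The two lookahead exercises can be compared side by side without typing the test strings by hand. The following is a minimal sketch, not one of the book's listings; the @patterns table and its labels are assumptions introduced for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The two test strings used in the exercises above.
my @testStrings = ("I work for Star.", "I work for Star Training.");

# Hypothetical table pairing a label with a compiled pattern.
my @patterns = (
    [ "positive", qr/Star(?= Training)/ ],
    [ "negative", qr/Star(?! Training)/ ],
);

foreach my $entry (@patterns)
{
    my ($label, $pattern) = @$entry;
    foreach my $testString (@testStrings)
    {
        my $result = ($testString =~ $pattern) ? "match" : "no match";
        print "$label lookahead on \"$testString\": $result\n";
    }
}
```

The positive pattern matches only the second string, and the negative pattern matches only the first.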

Using Lookbehind

Lookbehind works similarly to lookahead, except that a character sequence that precedes another component of the regular expression pattern is the focus of interest.

Positive lookbehind is signified by the syntax (?<= ...). Negative lookbehind is signified by (?<!...).


Try It Out

Using Lookbehind

1. Type the following code in a text editor, and save it as LookBehind.pl:

#!/usr/bin/perl -w
use strict;

print "This tests positive lookbehind.\n";
print "Enter a test string here: ";
my $myTestString = <STDIN>;
chomp($myTestString);

# Match Training only when it is preceded by Star and a space.
if ($myTestString =~ m/(?<=Star )Training/)
{
    print "There was a match which was '$&'.";
}
else
{
    print "There was no match.";
}

2. Run the code. Enter the test string "Training is great!", and press the Return key. Inspect the displayed result.

3. Run the code again. Enter the test string "Star Training is great!", and press the Return key. Inspect the displayed result, as shown in Figure 26-30. Notice that the character sequence Training is matched only when the character sequence Star followed by a space character comes before Training, as specified by the positive lookbehind.

Figure 26-30

How It Works

The key change is in the pattern to be matched. Notice that the pattern’s lookbehind component, (?<=Star ), comes before the character sequence Training:

if ($myTestString =~ m/(?<=Star )Training/)

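When the test string is "Star Training is great!" there is a match, because the necessary character sequence (Star followed by a space character) precedes the character sequence Training. When the test string is "Training is great!" nothing precedes Training, so the lookbehind fails and there is no match.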
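Negative lookbehind, (?<!...), is mentioned above but not exercised. The following is a minimal sketch of how it might be tried, reusing the two test strings from this exercise; the loop and the printed messages are assumptions added for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The two test strings used in the lookbehind exercise.
my @testStrings = ("Star Training is great!", "Training is great!");

foreach my $testString (@testStrings)
{
    # Match Training only when it is NOT preceded by Star and a space.
    if ($testString =~ m/(?<!Star )Training/)
    {
        print "Matched '$&' in: $testString\n";
    }
    else
    {
        print "No match in: $testString\n";
    }
}
```

Here the results are reversed compared with the positive lookbehind: only "Training is great!" matches, because in the other string the forbidden sequence precedes Training.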


Using the Regular Expression Matching Modes in Perl

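The regular expression matching modes, set by modifiers placed after the closing delimiter of a pattern, allow developers to control case sensitivity, the treatment of whitespace in the pattern, and how line boundaries in the test text are handled.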

The following table summarizes the regular expression matching modes in Perl.

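Mode    Description

i       Matching is case insensitive.
x       Whitespace in the pattern is ignored, and # begins a comment.
g       Matching is global: all matches are found, not only the first.
m       Matching treats the test text as multiple lines: ^ and $ also match at the start and end of each line.
s       Matching treats the test text as a single line: the . metacharacter also matches a newline.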
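The m and s modes are the least obvious entries in the table, so a small comparison may help. This is a minimal sketch; the two-line sample text and the patterns are assumptions chosen only to show the effect of each modifier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical test text containing one embedded newline.
my $text = "first line\nsecond line";

# Without m, ^ matches only at the very start of the text.
my $withoutM = ($text =~ /^second/)  ? "match" : "no match";
# With m, ^ also matches immediately after a newline.
my $withM    = ($text =~ /^second/m) ? "match" : "no match";

# Without s, the . metacharacter does not match a newline.
my $withoutS = ($text =~ /line.second/)  ? "match" : "no match";
# With s, the . metacharacter matches a newline too.
my $withS    = ($text =~ /line.second/s) ? "match" : "no match";

print "Without m: $withoutM\n";   # no match
print "With m:    $withM\n";      # match
print "Without s: $withoutS\n";   # no match
print "With s:    $withS\n";      # match
```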

You have seen earlier in this chapter examples of using the i (case-insensitive matching) and g (global matching) modifiers. The following example illustrates the use of the x modifier to assist in documentation of complex regular expression patterns.

Try It Out

Using the x Modifier

1. Type the following code in a text editor, and save it as xModifier.pl:

#!/usr/bin/perl -w
use strict;

print "This matches a US Zip code.\n";
print "Enter a test string here: ";
my $myTestString = <STDIN>;
chomp($myTestString);

if ($myTestString =~
    m/\d{5}       # Match five numeric digits
      (-\d{4})?   # Optionally match a hyphen followed by four numeric digits
     /x)
{
    print "There was a match which was '$&'.";
}
else
{
    print "There was no match.";
}

2. Run the code. Enter 12345 as a test string, and press the Return key. Inspect the displayed result.

3. Run the code again. Enter 12345-6789 as a test string, and press the Return key. Inspect the displayed result, as shown in Figure 26-31.
