Beginning Regular Expressions 2005.pdf

Chapter 20

Metacharacter   Description

*               A quantifier. It matches when there are zero or more occurrences of the preceding character or group.
+               A quantifier. It matches when there are one or more occurrences of the preceding character or group.
{n,m}           Quantifier notation. It matches when there is a minimum of n and a maximum of m occurrences of the preceding character or group.
( ... )         Grouping (capturing) parentheses.
(?: ... )       Non-capturing parentheses. They group without capturing the matched text.
(?= ... )       Positive lookahead.
(?! ... )       Negative lookahead.
|               Alternation.
[ ... ]         Character class. Character class ranges are supported.
[^ ... ]        Negated character class.
\b              Matches the boundary between alphanumeric characters and nonalphanumeric characters. In effect, it can match the boundary at the beginning or end of a "word."
\B              Matches a position that does not match \b.
\d              Matches a numeric digit. It is equivalent to the character class [0-9].
\D              Matches a character that is not a numeric digit. It is equivalent to the negated character class [^0-9].
\s              Matches any whitespace character.
\S              Matches any character that does not match \s.
\t              Matches a tab character.
\w              Matches any alphanumeric "word" character. It is equivalent to the character class [A-Za-z0-9_].
\W              Matches any character that does not match \w. It is equivalent to the negated character class [^A-Za-z0-9_].
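As an illustration of the \b and lookahead entries in the table, the following is a minimal sketch rather than code from the chapter. It assumes the script is run under Windows Script Host (for example, with cscript), which is why it uses WScript.Echo instead of MsgBox. The helper name MatchInfo is hypothetical.

```vbscript
' Hypothetical helper: returns "value@index" for the first match, or "" if none.
Function MatchInfo(pattern, text)
    Dim re, found
    Set re = New RegExp
    re.Pattern = pattern
    re.IgnoreCase = False
    re.Global = False
    Set found = re.Execute(text)
    If found.Count > 0 Then
        MatchInfo = found(0).Value & "@" & found(0).FirstIndex
    Else
        MatchInfo = ""
    End If
End Function

' Without \b the first "cat" is found inside "concatenate" (index 3).
WScript.Echo MatchInfo("cat", "concatenate the cat")
' With \b on both sides only the standalone word matches (index 16).
WScript.Echo MatchInfo("\bcat\b", "concatenate the cat")
' Positive lookahead: digits that are followed by " dollars".
WScript.Echo MatchInfo("\d+(?= dollars)", "30 euros, 25 dollars")
' Negative lookahead: a whole number that is not followed by " dollars".
WScript.Echo MatchInfo("\b\d+\b(?! dollars)", "25 dollars, 30 euros")
```

The \b after \d+ in the last pattern matters: without it, the engine could backtrack and report a partial number such as 2, because the text after the 2 is not " dollars".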

 

 

Quantifiers

VBScript supports a full range of quantifiers: the ?, *, and + metacharacters, together with the {n,m} notation. These quantifiers behave in VBScript as they do in most other regular expression implementations.
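A short sketch of the four quantifier forms follows. It is not from the chapter: the helper name IsMatch and the sample strings are made up for illustration, and WScript.Echo assumes the script runs under Windows Script Host. The ^ and $ positional metacharacters are used so that the whole string has to fit the pattern.

```vbscript
' Hypothetical helper: True if the pattern matches the text, False otherwise.
Function IsMatch(pattern, text)
    Dim re
    Set re = New RegExp
    re.Pattern = pattern
    re.IgnoreCase = False
    IsMatch = re.Test(text)
End Function

WScript.Echo IsMatch("^colou?r$", "color")    ' ? allows zero or one u: match
WScript.Echo IsMatch("^ab*c$", "ac")          ' * allows zero or more b: match
WScript.Echo IsMatch("^ab+c$", "ac")          ' + needs at least one b: no match
WScript.Echo IsMatch("^\d{2,4}$", "123")      ' three digits is within 2 to 4: match
WScript.Echo IsMatch("^\d{2,4}$", "12345")    ' five digits exceeds the maximum: no match
```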


Regular Expressions and VBScript

Positional Metacharacters

The ^ metacharacter is supported and matches the position before the first character of a character sequence. The $ metacharacter is also supported and matches the position after the last character of an input character sequence.

The test file, Positional.html, is shown here:

<html>
<head>
<title>Positional Metacharacters</title>
<script language="vbscript" type="text/vbscript">
Dim myRegExp, TestString, displayString, MatchOrNot

Function FindMatch
  displayString = ""
  Set myRegExp = New RegExp
  myRegExp.Pattern = "[A-Z]\d{2}"
  myRegExp.IgnoreCase = False
  myRegExp.Global = False
  TestString = InputBox("Enter one alphabetic character and two numbers in the text box below.")

  ' First test: the pattern has no positional metacharacters.
  MatchOrNot = myRegExp.Test(TestString)
  If MatchOrNot Then
    displayString = "When the pattern is '" & myRegExp.Pattern & "' the input '" _
      & TestString & "' contains a match."
  Else
    displayString = "When the pattern is '" & myRegExp.Pattern & "' the input '" _
      & TestString & "' does not contain a match."
  End If

  ' Second test: the same pattern anchored with ^ and $.
  myRegExp.Pattern = "^[A-Z]\d{2}$"
  MatchOrNot = myRegExp.Test(TestString)
  If MatchOrNot Then
    displayString = displayString & vbCrLf & "When the pattern is '" & myRegExp.Pattern _
      & "' the input '" _
      & TestString & "' contains a match."
  Else
    displayString = displayString & vbCrLf & "When the pattern is '" & myRegExp.Pattern _
      & "' the input '" _
      & TestString & "' does not contain a match."
  End If

  MsgBox displayString
End Function
</script>
</head>
<body onload="FindMatch">
</body>
</html>

The code matches the character sequence entered into the input box against two patterns. The first does not include the positional metacharacters ^ and $. The second pattern includes both metacharacters.


Try It Out

Positional Metacharacters

1. Open Positional.html in Internet Explorer.

2. In the input box, enter the character sequence A99.

3. Click the OK button, and inspect the information displayed in the message box, as shown in Figure 20-12. The message box shows the results of attempted matching when there are no positional metacharacters present and when both positional metacharacters are present. Notice that there is a match in both situations.

Figure 20-12

4. Click the OK button to dismiss the message box, and then press F5 to reload the Web page.

5. In the input box, enter the character sequence A999.

6. Click the OK button, and inspect the information displayed in the message box, as shown in Figure 20-13. Notice that there is a match when no positional metacharacters are present but no match when the positional metacharacters are present in the pattern.

Figure 20-13

7. Click the OK button to dismiss the message box, and then press F5 to reload the Web page.

8. In the input box, enter the character sequence A2A.

9. Click the OK button, and inspect the information displayed in the message box, as shown in Figure 20-14. There is no match with either pattern.

Figure 20-14


How It Works

When the page is loaded, the FindMatch function is called:

<body onload="FindMatch">

The FindMatch function uses the RegExp object's Test() method twice to attempt to match a string input by the user.

Initially, the value assigned to the Pattern property is the pattern [A-Z]\d{2}, which matches a single uppercase alphabetic character followed by two numeric digits:

myRegExp.Pattern = "[A-Z]\d{2}"

Because the IgnoreCase property is set to False, matching is case sensitive, so the alphabetic character must be entered in uppercase to match [A-Z]. Because the Global property is set to False, matching stops at the first match found:

myRegExp.IgnoreCase = False

myRegExp.Global = False
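The effect of the two properties can be checked with a small sketch. This is not code from the chapter: the helper names TestWithCase and CountMatches are hypothetical, and WScript.Echo assumes the script is run under Windows Script Host rather than in a Web page.

```vbscript
' Hypothetical helper: tests [A-Z]\d{2} with the given IgnoreCase setting.
Function TestWithCase(text, caseInsensitive)
    Dim re
    Set re = New RegExp
    re.Pattern = "[A-Z]\d{2}"
    re.IgnoreCase = caseInsensitive
    re.Global = False
    TestWithCase = re.Test(text)
End Function

' Hypothetical helper: counts the matches that Execute returns
' with the given Global setting.
Function CountMatches(text, findAll)
    Dim re
    Set re = New RegExp
    re.Pattern = "[A-Z]\d{2}"
    re.IgnoreCase = False
    re.Global = findAll
    CountMatches = re.Execute(text).Count
End Function

WScript.Echo TestWithCase("A99", False)      ' uppercase letter: match
WScript.Echo TestWithCase("a99", False)      ' lowercase letter, case sensitive: no match
WScript.Echo TestWithCase("a99", True)       ' lowercase letter, case ignored: match
WScript.Echo CountMatches("A11 B22", False)  ' Global = False: only the first match
WScript.Echo CountMatches("A11 B22", True)   ' Global = True: both matches
```

The Global setting makes a visible difference with Execute() and Replace(). For Test(), which only reports whether any match exists, a single match is enough either way.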

The string that is input by the user is assigned to the TestString variable:

TestString = InputBox("Enter one alphabetic character and two numbers in the text box below.")

The result of the Test() method, with the TestString variable as its argument, is assigned to the variable MatchOrNot. The Test() method returns a Boolean value: True if the pattern matches somewhere in the string, and False if it does not:

MatchOrNot = myRegexp.Test(TestString)
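One way to confirm what Test() hands back is to inspect the result with VBScript's TypeName function. The following is a minimal sketch, not code from the chapter. The helper name TestResultType is hypothetical, and WScript.Echo assumes Windows Script Host.

```vbscript
' Hypothetical helper: reports the subtype of the value returned by Test().
Function TestResultType(text)
    Dim re, result
    Set re = New RegExp
    re.Pattern = "[A-Z]\d{2}"
    result = re.Test(text)
    TestResultType = TypeName(result)
End Function

WScript.Echo TestResultType("A99")   ' the input matches; the subtype is Boolean
WScript.Echo TestResultType("xyz")   ' the input does not match; still Boolean
```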

If there is a match, the pattern is output together with the test string it matches:

If MatchOrNot Then
  displayString = "When the pattern is '" & myRegExp.Pattern & "' the input '" _
    & TestString & "' contains a match."

If there is no match, a message that the pattern is not matched by the input string is output:

Else
  displayString = "When the pattern is '" & myRegExp.Pattern & "' the input '" _
    & TestString & "' does not contain a match."
End If

The displayString variable now contains the result of the first attempted matching process. Now, however, the value of the Pattern property is changed to allow a second attempted match using a pattern that now includes both the ^ and $ metacharacters:

myRegExp.Pattern = "^[A-Z]\d{2}$"
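The difference between the two patterns can also be checked without the input box. The sketch below is not the chapter's code. It runs the three inputs from the Try It Out steps (A99, A999, and A2A) through both patterns. The helper name Matches is hypothetical, and WScript.Echo assumes Windows Script Host.

```vbscript
' Hypothetical helper: True if the case-sensitive pattern matches the text.
Function Matches(pattern, text)
    Dim re
    Set re = New RegExp
    re.Pattern = pattern
    re.IgnoreCase = False
    re.Global = False
    Matches = re.Test(text)
End Function

Dim sample
For Each sample In Array("A99", "A999", "A2A")
    ' The unanchored pattern may match anywhere inside the input;
    ' the anchored pattern must account for the whole input.
    WScript.Echo sample & ": unanchored=" & Matches("[A-Z]\d{2}", sample) _
        & ", anchored=" & Matches("^[A-Z]\d{2}$", sample)
Next
```

A99 satisfies both patterns. A999 satisfies only the unanchored pattern, because the third digit prevents $ from matching immediately after \d{2}. A2A satisfies neither, because it never contains a letter followed by two digits.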
