VBA - Regular Expressions in VBScript
At this point, regular expressions don’t seem any better than VBScript’s own string functions. In fact,
they only seem to take more code! But that’s because you haven’t seen the magic of patterns
yet. What if you wanted to know how many words were in a sentence?
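One way to count them is sketched below. The sample sentence and the “\w+” pattern are assumptions chosen for illustration, and the script is assumed to run under Windows Script Host (for example with cscript.exe); in another host, replace WScript.Echo with that host's output call.

```vbscript
Option Explicit

Dim re, matches
Set re = New RegExp
re.Pattern = "\w+"     ' one or more word characters (letters, digits, underscore)
re.Global = True       ' return every match, not only the first one

' Execute returns a collection holding one Match object per word found
Set matches = re.Execute("The quick brown fox jumps over the lazy dog")
WScript.Echo "Words found: " & matches.Count    ' Count is 9 for this sentence
```

Note that this simple pattern treats a contraction such as “don’t” as two words, because the apostrophe is not a word character.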
You can quickly see how patterns can make all of the difference. But what are patterns and how
do you make them? A pattern is a string of literal characters to be matched. However, a number
of reserved and escaped characters can be used to control match positions,
occurrences, wildcards, and more. We’ll begin with match positions as listed in Table 1 below.
Symbol  Description
^       Matches the beginning of a string.
        “^This” would match the word “This” if it appeared at the beginning of a string.
$       Matches the end of a string.
\B      Matches a position that is not a word boundary.
        “\Bx\B” matches any letter x that does not appear at the beginning or end of a
        word.
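The position symbols can be tried out with the RegExp object's Test method. This is a minimal sketch; the sample strings are made up for illustration, and Windows Script Host is assumed for WScript.Echo.

```vbscript
Dim re
Set re = New RegExp

re.Pattern = "^This"                                    ' "This" at the start of the string
WScript.Echo "^This / start: " & re.Test("This is a test")    ' Test returns True
WScript.Echo "^This / later: " & re.Test("Is This a test")    ' Test returns False

re.Pattern = "test$"                                    ' "test" at the end of the string
WScript.Echo "test$ / end:   " & re.Test("This is a test")    ' Test returns True
WScript.Echo "test$ / start: " & re.Test("test this")         ' Test returns False

re.Pattern = "\Bx\B"                                    ' an x with word characters on both sides
WScript.Echo "\Bx\B / text:  " & re.Test("text")              ' Test returns True
WScript.Echo "\Bx\B / x-ray: " & re.Test("x-ray")             ' Test returns False
```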
After positioning, you’ll want to match literal characters. Alphanumeric characters are treated as
literals. However, some other characters, such as . * + ? ( ) [ ] { } ^ $ | and \, have special
meanings. To match one of them literally, you must escape it with a backslash.
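The difference an escape makes can be seen with the period, which normally matches any single character. The strings below are illustrative assumptions, and Windows Script Host is assumed for output.

```vbscript
Dim re
Set re = New RegExp

re.Pattern = "3.50"      ' unescaped period: any one character may sit between 3 and 50
WScript.Echo "Unescaped, 3x50: " & re.Test("3x50")    ' Test returns True

re.Pattern = "3\.50"     ' escaped period: only a literal "." is accepted
WScript.Echo "Escaped, 3x50:   " & re.Test("3x50")    ' Test returns False
WScript.Echo "Escaped, 3.50:   " & re.Test("3.50")    ' Test returns True
```

VBScript string literals do not treat the backslash as an escape character, so a single backslash is written in the pattern string.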
Once you have the ability to match literal characters, you’ll probably find the need to expand a
bit. You may want to match any one character in a range of characters, or perhaps everything
except a specified character. This is done with character classes.
Symbol  Description
[xyz]   Matches any character in the character set. Hyphens denote ranges.
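As a rough illustration, the sketch below uses one plain character set and one negated set (a leading ^ inside the brackets matches everything except the listed characters). The sample strings are assumptions, and Windows Script Host is assumed for output.

```vbscript
Dim re
Set re = New RegExp
re.Global = True           ' work on every match in the string

re.Pattern = "[aeiou]"     ' any one lowercase vowel
WScript.Echo "Vowels: " & re.Execute("regular expression").Count    ' Count is 7

re.Pattern = "[^0-9]"      ' any one character that is NOT in the range 0 through 9
WScript.Echo "Digits only: " & re.Replace("555-1234", "")           ' yields 5551234
```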
At this point, your patterns will still be matching one character at a time. To unleash the power of
regular expressions, you need to match repeating characters.
Symbol  Description
{n,m}   Matches at least n and at most m occurrences.
        “\d{2,3}” matches no less than two digits and no more than three.
?       Matches 0 or 1 occurrence. Equivalent to {0,1}.
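A short sketch of both repetition symbols follows. The ^ and $ anchors are added here so that the whole string has to fit the pattern; the test values are illustrative assumptions, and Windows Script Host is assumed for output.

```vbscript
Dim re, sample
Set re = New RegExp

re.Pattern = "^\d{2,3}$"       ' the whole string must be two or three digits
For Each sample In Array("7", "42", "123", "1234")
    ' Test returns False, True, True, False for these four values
    WScript.Echo sample & " -> " & re.Test(sample)
Next

re.Pattern = "^colou?r$"       ' the "u" may appear zero times or once
For Each sample In Array("color", "colour", "colouur")
    ' Test returns True, True, False for these three values
    WScript.Echo sample & " -> " & re.Test(sample)
Next
```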
Finally, grouping and alternation offer the ability to make extremely complex regular expressions.
Grouping allows you to match clauses. Alternation allows you to add more than one clause and
match any one of them.
Symbol  Description
()      Grouping creates a clause. Clauses may be nested.
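Grouping and alternation can be combined as in the following sketch, which accepts one of three titles in front of a name. The pattern and the names are made up for illustration, and Windows Script Host is assumed for output.

```vbscript
Dim re, matches
Set re = New RegExp
re.Pattern = "^(Mr|Mrs|Ms)\. \w+$"     ' one of three titles, a period, a space, a name

WScript.Echo "Mrs. Jones: " & re.Test("Mrs. Jones")    ' Test returns True
WScript.Echo "Dr. Jones:  " & re.Test("Dr. Jones")     ' Test returns False

' The text captured by the first group is available through SubMatches
Set matches = re.Execute("Mrs. Jones")
WScript.Echo "Title: " & matches(0).SubMatches(0)      ' the captured text is Mrs
```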
Regular expressions also allow a feature called back referencing. Back referencing allows you to
reuse part of an expression. This is done by providing a backslash followed by a digit. For
example, the expression “(\w+)\s+\1” matches any one word that occurs twice in a row. In other
words, the same match must be made twice in a row.
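The doubled-word expression can be used both to find and to repair a repetition. In this sketch the sample sentence is an assumption, and the Replace call relies on $1 referring to the first group in the replacement text. Windows Script Host is assumed for output.

```vbscript
Dim re, matches, m, text
Set re = New RegExp
re.Pattern = "(\w+)\s+\1"      ' a word, whitespace, then the same word again
re.Global = True

text = "Paris in the the spring"
Set matches = re.Execute(text)
For Each m In matches
    WScript.Echo "Repeated: " & m.Value      ' the only match is "the the"
Next

' Collapse each doubled word to a single copy
WScript.Echo re.Replace(text, "$1")          ' yields "Paris in the spring"
```

Because the pattern has no word-boundary checks, it would also match inside “the then”; a stricter variant such as “\b(\w+)\s+\1\b” is one possible refinement.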
Now that you have all of the tools, let’s look at how to make them work. Say we wanted to build
a regular expression to match a standard ten-digit phone number in the form of (xxx) xxx-xxxx.
The regular expression could easily begin as “\(\d\d\d\) \d\d\d-\d\d\d\d”. Read from left to right, this is an
escaped opening parenthesis followed by three digits, an escaped closing parenthesis, a space, three more digits, a
hyphen, and the last four digits. The parentheses are escaped because unescaped parentheses denote grouping.
If we apply repetition, we can condense this a bit to “\(\d{3}\) \d{3}-\d{4}”. Both expressions mean
the same thing. Now what if we wanted to make the parentheses optional? Of course, if we do,
the space should become a hyphen. Enter grouping and alternation.
To begin, we need to specify what it should look like with parentheses. Thus our expression
should begin with “\(\d{3}\) ” as before. Now we want to add an alternate possibility in case
parentheses aren’t used. The expression then becomes “(\(\d{3}\) )|(\d{3}-)”. This will match three
digits between parentheses followed by a space, OR three digits followed by a hyphen. Because
alternation applies to everything on either side of the | symbol, the two alternatives must be
wrapped in an outer group before the remaining part of the match is added; otherwise the first
alternative would match on its own. The final expression looks
like “((\(\d{3}\) )|(\d{3}-))\d{3}-\d{4}”. This expression would match either “(123) 456-7890” or
“123-456-7890”.
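To check a phone number pattern of this kind, a sketch along the following lines can help. The area-code alternatives are wrapped in one outer group so that the alternation does not swallow the rest of the expression, and ^ and $ are added (an assumption beyond the text above) so that the whole string must be a phone number. The sample values are illustrative, and Windows Script Host is assumed for output.

```vbscript
Dim re, sample
Set re = New RegExp
' Outer group: "(ddd) " OR "ddd-", then ddd-dddd, anchored at both ends
re.Pattern = "^((\(\d{3}\) )|(\d{3}-))\d{3}-\d{4}$"

For Each sample In Array("(123) 456-7890", "123-456-7890", "123 456-7890", "(123)456-7890")
    ' Test returns True, True, False, False for these four values
    WScript.Echo sample & " -> " & re.Test(sample)
Next
```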
Another example would be a common U.S. ZIP code. A U.S. ZIP code consists of five numeric
digits, optionally followed by a hyphen and four more digits. Just as in the previous example,
you can use grouping to accomplish this quite easily. That expression would look like “\d{5}(-\d{4})?”.
Notice this time that I’m using the ? symbol to match either 0 or 1 of the last group.
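The ZIP code expression can be exercised the same way. As before, the ^ and $ anchors are an added assumption that forces the entire string to be a ZIP code, the sample values are illustrative, and Windows Script Host is assumed for output.

```vbscript
Dim re, sample
Set re = New RegExp
re.Pattern = "^\d{5}(-\d{4})?$"     ' five digits, then an optional hyphen plus four digits

For Each sample In Array("12345", "12345-6789", "1234", "12345-678")
    ' Test returns True, True, False, False for these four values
    WScript.Echo sample & " -> " & re.Test(sample)
Next
```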
You can see that building regular expressions can provide very powerful tools for matching and
replacing text strings. Very complex expressions can be built using a simple set of character
symbols. Stay tuned for a future article that demonstrates more advanced uses of VBScript
regular expressions.
If you’re interested in testing your regular expressions, I’ve built a Regular Expression Tester
HTML application using the regular expression test code found here. Until next time, keep coding!