Java Lect 17
Before RegEx
Wildcard
*.txt
My_report*.doc
Here the * stands for any number of any
characters.
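The two wildcards above can be tried out in Java. This is a minimal sketch, assuming the standard glob syntax of java.nio.file; the class name WildcardDemo, the helper matchesWildcard and the sample file names are made up for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class WildcardDemo {
    // Returns true when the file name matches the glob-style wildcard.
    static boolean matchesWildcard(String wildcard, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + wildcard);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        System.out.println(matchesWildcard("*.txt", "notes.txt"));
        System.out.println(matchesWildcard("My_report*.doc", "My_report_final.doc"));
        System.out.println(matchesWildcard("*.txt", "notes.doc"));
    }
}
```

The first two calls should report a match and the last one should not.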
Regular expressions (RegEx) tend to
be easier to write than they are to
read
What is RegEx
Regular expression: a pattern that describes or
matches a set of strings.
Matched text: the chunk of text that matches the
regular expression.
ca[trn]
Matches car, can, cat
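In Java the same pattern can be checked with java.util.regex. A small sketch: the class name FirstRegexDemo, the helper matchesWhole and the extra word cab are assumptions added for illustration.

```java
import java.util.regex.Pattern;

class FirstRegexDemo {
    // True when the whole text matches the regular expression.
    static boolean matchesWhole(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String regex = "ca[trn]";
        // car, can and cat match; cab does not, because b is not in [trn].
        for (String word : new String[] {"car", "can", "cat", "cab"}) {
            System.out.println(word + " -> " + matchesWhole(regex, word));
        }
    }
}
```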
Structure of RegEx
Made up of normal characters and
metacharacters.
Metacharacters have a special function:
$ ^ . \ [] \( \) + ?
$ : end of line
^ : start of line
Literal match
RegEx: cat will match the word cat.
It will also match inside words like
concatenation, delicate, located,
modification.
This is sometimes not desired.
Solution?
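Matching
Match the space before and after
cat
RegEx: " cat " (one space on each side of cat)
Still a problem: cat at the start or end of a
line, or next to punctuation, is missed.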
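The literal match and the space-padded match can be compared side by side. A sketch only: the class LiteralMatchDemo, the helper contains and the sample sentences are assumptions, not part of the lecture.

```java
import java.util.regex.Pattern;

class LiteralMatchDemo {
    // True when the regex matches anywhere inside the text.
    static boolean contains(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // The plain literal also hits the middle of longer words.
        System.out.println(contains("cat", "concatenation"));  // true
        // Padding with spaces rejects those hits...
        System.out.println(contains(" cat ", "concatenation")); // false
        System.out.println(contains(" cat ", "the cat sat"));   // true
        // ...but it also misses cat at the very start of the text.
        System.out.println(contains(" cat ", "cat sat"));       // false
    }
}
```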
Character class
Want to search for in or on.
RegEx: [io]n will match both in
and on.
[ ] : specifies a set of characters to
select from.
[a-h] : the set of all characters from a to
h
[4-9A-T] : a digit from 4 to 9 or a letter from A to T
Character class
It can also list individual
characters, e.g. [acV5y0]
[0-9] : ?
[0-9][0-9] :?
18[0-9][0-9]:?
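One way to explore the questions above is to run the classes against a few sample strings. A sketch, assuming whole-string matching with Pattern.matches; the class name CharacterClassDemo and the sample inputs are chosen for illustration.

```java
import java.util.regex.Pattern;

class CharacterClassDemo {
    // True when the whole text matches the regular expression.
    static boolean matchesWhole(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("[io]n", "in"));          // true
        System.out.println(matchesWhole("[io]n", "on"));          // true
        System.out.println(matchesWhole("[io]n", "an"));          // false
        System.out.println(matchesWhole("[a-h]", "c"));           // true
        System.out.println(matchesWhole("[4-9A-T]", "B"));        // true
        System.out.println(matchesWhole("[0-9][0-9]", "42"));     // true
        System.out.println(matchesWhole("18[0-9][0-9]", "1857")); // true
        System.out.println(matchesWhole("18[0-9][0-9]", "1957")); // false
    }
}
```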
Example
set of vowels
[aeiou]
set of consonants
[bcdfghjklmnpqrstvwxyz]
Negation
The absence of a character or set
of characters is expressed with the ^
symbol placed first inside [ ]:
[^aeiou] matches any character other than a, e, i, o, u
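A short sketch of negation in Java, assuming the usual convention that ^ negates a class when it is the first character inside [ ]. The class NegationDemo and the helper keepOnly are hypothetical names; keepOnly expects a plain list of characters.

```java
import java.util.regex.Pattern;

class NegationDemo {
    // Removes every character that is NOT in the given set of plain characters.
    static String keepOnly(String set, String text) {
        return text.replaceAll("[^" + set + "]", "");
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("[^0-9]", "a")); // true: a is not a digit
        System.out.println(Pattern.matches("[^0-9]", "5")); // false: 5 is a digit
        // Everything that is not a lowercase vowel is deleted.
        System.out.println(keepOnly("aeiou", "regular expression")); // euaeeio
    }
}
```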
Start/End of line
^ : indicates start of line
$ : indicates end of line
Example:
Search for lines starting with I
Use RegEx : ^I
Search for lines ending with is
Use RegEx : is$
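The two anchors can be checked line by line in Java. A sketch: the class AnchorDemo, the helper found and the sample lines are assumptions. Each string is treated as one line; for text containing several lines, Pattern.MULTILINE would be needed so that ^ and $ apply to every line.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // True when the regex matches somewhere in the line.
    static boolean found(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^I", "I like regex"));  // true
        System.out.println(found("^I", "So do I"));       // false: I is not at the start
        System.out.println(found("is$", "this is"));      // true
        System.out.println(found("is$", "this is it"));   // false: is is not at the end
    }
}
```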
Any character match
. : matches any single character
e.e : matches any three-character sequence whose
first and last characters are e (eye, eve).
Try e.e
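One possible way to try e.e in Java is to collect every match in a sample text. A sketch: the class DotDemo, the helper findAll and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotDemo {
    // Collects every non-overlapping match of regex in text, left to right.
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("e.e", "eve eye else")); // [eve, eye]
        // The dot needs exactly one character, so ee alone does not match.
        System.out.println(Pattern.matches("e.e", "ee"));   // false
    }
}
```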
Repeated match
* : matches the previous character or
character class zero or more times
be* : matches b followed by zero or
more e (b, be, bee, ...)
+ : similar to *
The only difference is that it matches
a sequence of one or more.
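The difference between * and + shows up on the string b alone. A sketch, assuming whole-string matching; the class RepeatDemo, the pattern be+ and the sample strings are added for illustration.

```java
import java.util.regex.Pattern;

class RepeatDemo {
    // True when the whole text matches the regular expression.
    static boolean matchesWhole(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("be*", "b"));    // true: zero e is allowed
        System.out.println(matchesWhole("be*", "beee")); // true
        System.out.println(matchesWhole("be+", "b"));    // false: at least one e is required
        System.out.println(matchesWhole("be+", "beee")); // true
    }
}
```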
Selecting a number
Single digit : [0-9]
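A number is a single digit, repeated.
(digit)repeat
[0-9]*
Note: * also allows zero digits (the empty
string); [0-9]+ requires at least one digit.
$[0-9]* : ?
$ is a metacharacter (end of line), so escape it
to match a literal dollar sign:
\$[0-9]*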
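In Java source the backslash itself must be doubled inside a string literal, so \$ is written "\\$". A sketch: the class NumberDemo, the helper findAll and the sample text are assumptions; [0-9]+ is used for plain numbers so that empty matches are not collected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberDemo {
    // Collects every non-overlapping match of regex in text, left to right.
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Price: $45 or 30 dollars";
        System.out.println(findAll("[0-9]+", text));    // [45, 30]
        System.out.println(findAll("\\$[0-9]*", text)); // [$45]
        // Unescaped, $ means end of line, so the price is never found.
        System.out.println(findAll("$[0-9]*", text).contains("$45")); // false
    }
}
```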
Selecting a word
Alternate match
| : specifies an alternate match
(either the left or the right side may match)
Search: (above)|(below)
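The search above can be run in Java as follows. A sketch: the class AlternateDemo, the helper found and the sample sentences are made up for illustration.

```java
import java.util.regex.Pattern;

class AlternateDemo {
    // True when the regex matches somewhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String regex = "(above)|(below)";
        System.out.println(found(regex, "see the figure below")); // true
        System.out.println(found(regex, "see the table above"));  // true
        System.out.println(found(regex, "see the appendix"));     // false
    }
}
```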
Search
Day Words
[a-z]*day
[a-z]+day
[A-Z][a-z]+day
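The three day patterns behave differently on the same input. A sketch: the class DayWordsDemo, the helper findAll and the two sample sentences are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DayWordsDemo {
    // Collects every non-overlapping match of regex in text, left to right.
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // * accepts the bare word day, + needs at least one letter before it.
        System.out.println(findAll("[a-z]*day", "today is a day")); // [today, day]
        System.out.println(findAll("[a-z]+day", "today is a day")); // [today]
        // Requiring a leading capital keeps only the names of days here.
        System.out.println(findAll("[A-Z][a-z]+day", "Monday and Friday, not today")); // [Monday, Friday]
    }
}
```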
Search patterns
has, have, had
not, n't
((have)|(had)|(has))
(( )|(n't)|( not ))*
((have)|(had)|(has))(( )|(n't)|( not ))*
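The combined pattern can be tested in Java. A sketch: the class HaveNotDemo, the helper firstMatch, the sample sentences and the reordered variant are assumptions added for illustration. Java tries alternatives from left to right, so in the slide's pattern the single-space alternative is taken before " not " gets a chance; putting ( not ) first is one possible way around that.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HaveNotDemo {
    // The pattern exactly as written on the slide.
    static final String SLIDE_REGEX = "((have)|(had)|(has))(( )|(n't)|( not ))*";
    // Hypothetical variant: same alternatives, with " not " tried first.
    static final String REORDERED_REGEX = "((have)|(had)|(has))(( not )|(n't)|( ))*";

    // Returns the first match of regex in text, or null when there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Quotes are printed so that trailing spaces in the match are visible.
        System.out.println("'" + firstMatch(SLIDE_REGEX, "I haven't seen it") + "'");    // 'haven't '
        System.out.println("'" + firstMatch(SLIDE_REGEX, "She has not left") + "'");     // 'has '
        System.out.println("'" + firstMatch(REORDERED_REGEX, "She has not left") + "'"); // 'has not '
    }
}
```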
Thank you!