2 Regular Expression

Uploaded by

Salam Abdulla


University of Sulaimani

College of Science
Department of Computer Science

Computation
Regular Expression
Lecture four

Mzhda Hiwa Hama

2023-2024
Regular Expression

• Regular expressions (shortened to "regex") are used in many
programming languages and tools. They can be used to find and
extract patterns in texts and programs.

• Regular expressions are a way to search for substrings
("matches") in strings. This is done by searching through the
string with "patterns".

• Regular expressions are useful tools in the design of
compilers for programming languages. Elemental objects in a
programming language, called tokens, such as variable names and
constants, may be described with regular expressions.

• Using regular expressions, we can also specify and validate
forms of data such as passwords, e-mail addresses, user IDs, etc.
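
As a small illustration of the searching and validating uses listed above, here is a sketch using Python's standard re module. The user-ID rule (a letter followed by 2 to 15 letters, digits, or underscores) and the function names are assumptions made up for this example; they do not come from the lecture.

```python
import re

# Hypothetical rule for this example only: a user ID starts with a letter
# and continues with 2 to 15 letters, digits, or underscores.
USER_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,15}")


def is_valid_user_id(text: str) -> bool:
    """Validation: the whole string must satisfy the pattern."""
    return USER_ID.fullmatch(text) is not None


def find_numbers(text: str) -> list[str]:
    """Extraction: return every run of digits found in the text."""
    return re.findall(r"[0-9]+", text)


print(is_valid_user_id("salam_23"))     # True
print(is_valid_user_id("2fast"))        # False: starts with a digit
print(find_numbers("x1 = 42 + y * 7"))  # ['1', '42', '7']
```

Validation uses fullmatch because the entire input has to fit the pattern, while extraction uses findall because matches may occur anywhere inside a longer text.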
Regular Expression’s
metacharacters

[ ]   A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z".

.     Matches any single character. Within bracket expressions, the dot character matches a literal dot. For example, a.c matches "abc", etc., but [a.c] matches only "a", ".", or "c".

[^ ]  Matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than "a", "b", or "c". [^a-z] matches any single character that is not a lowercase letter from "a" to "z".
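
The bracket, dot, and negated-bracket rules above can be tried out directly. The following is a minimal sketch with Python's re module; the helper name matches_whole is made up for this example. Note that in Python the dot does not match a newline unless the re.DOTALL flag is set.

```python
import re


def matches_whole(pattern: str, text: str) -> bool:
    """True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# [abcx-z] accepts exactly one of a, b, c, x, y, z.
print([c for c in "abcdwxyz" if matches_whole(r"[abcx-z]", c)])

# a.c accepts any single character between "a" and "c" ...
print(matches_whole(r"a.c", "abc"), matches_whole(r"a.c", "a-c"))

# ... but inside brackets the dot is a literal dot.
print([c for c in "a.cb" if matches_whole(r"[a.c]", c)])

# [^a-z] accepts any single character that is not a lowercase letter.
print(matches_whole(r"[^a-z]", "A"), matches_whole(r"[^a-z]", "q"))
```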
Regular Expression’s
metacharacters

( )   Defines a marked subexpression. The string matched within the parentheses can be recalled later. A marked subexpression is also called a block or capturing group. Example: (abc)

*     Matches the preceding element zero or more times. For example, ab*c matches "ac", "abc", "abbbc", etc. [xyz]* matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on. (ab)* matches "", "ab", "abab", "ababab", and so on.

?     Matches the preceding element zero or one time. For example, ab?c matches only "ac" or "abc".

+     Matches the preceding element one or more times. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".

|     The choice (also known as alternation or set union) operator matches either the expression before or the expression after the operator. For example, abc|def matches "abc" or "def".
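
To see the quantifiers, grouping, and alternation side by side, the sketch below filters a list of candidate strings through each pattern using Python's re module. The helper accepted and the pattern (abc)-(def) used to show group recall are assumptions introduced for this example.

```python
import re


def accepted(pattern: str, candidates: list[str]) -> list[str]:
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s) is not None]


samples = ["ac", "abc", "abbc", "abbbc"]
print(accepted(r"ab*c", samples))  # zero or more b: all four strings
print(accepted(r"ab?c", samples))  # zero or one b: ['ac', 'abc']
print(accepted(r"ab+c", samples))  # one or more b: everything except 'ac'
print(accepted(r"(ab)*", ["", "ab", "aba", "abab"]))
print(accepted(r"abc|def", ["abc", "def", "abcdef"]))

# A capturing group can be recalled after the match.
match = re.fullmatch(r"(abc)-(def)", "abc-def")
print(match.group(1), match.group(2))  # abc def
```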
Regular Expression’s
metacharacters

^      Matches the starting position within the string.

$      Matches the ending position of the string or the position just before a string-ending newline.

{n}    Matches the preceding element exactly n times. For example, a{3} matches only "aaa", exactly three a's.

{m,n}  Matches the preceding element at least m and not more than n times. For example, a{3,5} matches only "aaa", "aaaa", and "aaaaa".
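
The anchors and counted repetitions can be checked the same way. This is a small sketch with Python's re module; the helper repeat_counts is a name invented for this example, and it reports which lengths of a run of "a" a pattern accepts.

```python
import re

# ^ ties the match to the start of the string.
print(bool(re.search(r"^ab", "abc")), bool(re.search(r"^ab", "cab")))

# $ matches at the very end, or just before a string-ending newline.
print(bool(re.search(r"bc$", "abc")), bool(re.search(r"bc$", "abc\n")))


def repeat_counts(pattern: str, max_len: int = 7) -> list[int]:
    """Lengths n (0..max_len) for which the pattern matches 'a' * n in full."""
    return [n for n in range(max_len + 1) if re.fullmatch(pattern, "a" * n)]


print(repeat_counts(r"a{3}"))    # [3]
print(repeat_counts(r"a{3,5}"))  # [3, 4, 5]
```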
Formal Definition of Regular
Expression
Say that R is a regular expression if R is

1. a, for some a in the alphabet Σ; it represents the language {a}.

2. ε, which represents the language {ε}.
3. ∅, which represents the empty language { }.

Note: Don't confuse the regular expressions ε and ∅. The
expression ε represents the language containing a single string,
namely the empty string, whereas ∅ represents the language that
doesn't contain any strings.
Regular Expression Operation

1. Union (OR): if R1 and R2 are regular expressions, then
(R1 ∪ R2), also written as R1 | R2 or R1 + R2, is also a
regular expression. L(R1 | R2) = L(R1) ∪ L(R2).

2. Concatenation: if R1 and R2 are regular expressions, then
(R1 ◦ R2), also written as R1R2 or R1.R2, is also a regular
expression. L(R1R2) = L(R1) concatenated with L(R2).

3. Kleene closure (star): if R1 is a regular expression, then
(R1*), the Kleene closure of R1, is also a regular expression.
L(R1*) = {ε} ∪ L(R1) ∪ L(R1R1) ∪ L(R1R1R1) ∪ …
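
The three operations act on languages, that is, on sets of strings. The sketch below models finite languages as Python sets. Because the true Kleene closure is infinite whenever the language contains a non-empty string, the star function here is cut off at a maximum length; that bound and all function names are assumptions of this example.

```python
from itertools import product


def union(l1: set[str], l2: set[str]) -> set[str]:
    """L(R1 | R2) = L(R1) ∪ L(R2)."""
    return l1 | l2


def concat(l1: set[str], l2: set[str]) -> set[str]:
    """Every string of l1 followed by every string of l2."""
    return {x + y for x, y in product(l1, l2)}


def star(lang: set[str], max_len: int) -> set[str]:
    """Kleene closure, truncated to strings of length <= max_len."""
    result = {""}    # ε is always in the closure
    frontier = {""}
    while frontier:
        longer = {w for w in concat(frontier, lang) if len(w) <= max_len}
        frontier = longer - result  # keep only strings not seen before
        result |= frontier
    return result


print(sorted(union({"0"}, {"1"})))        # ['0', '1']
print(sorted(concat({"a", "b"}, {"c"})))  # ['ac', 'bc']
print(sorted(star({"ab"}, 4)))            # ['', 'ab', 'abab']
print(concat({"1", "11"}, set()))         # set(): concatenating with ∅ gives ∅
```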
Regular Expression and
languages
• The origins of regular expressions lie in automata theory and
formal language theory.

• We can use REs to describe regular languages.

• So, the value of a regular expression is a language.

• A regular language is one accepted by some FA or described by
some RE.
Note

• In arithmetic, we can use the operations + and × to build up
expressions such as (5 + 3) × 4. Similarly, we can use the
regular operations to build up expressions describing
languages, which are called regular expressions. An example
is: (0 ∪ 1)0*. The value of the arithmetic expression is the
number 32. The value of a regular expression is a language.

• In arithmetic, we say that × has precedence over + to mean
that when there is a choice, we do the × operation first. Thus
in 2 + 3 × 4, the 3 × 4 is done before the addition. To have the
addition done first, we must add parentheses to obtain
(2 + 3) × 4. In regular expressions, the star operation is done
first, followed by concatenation, and finally union, unless
parentheses change the usual order.
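
The precedence order (star, then concatenation, then union) can be observed by comparing a pattern with and without parentheses. This is a sketch with Python's re module, where union is written |; the helper language and the candidate list are made up for this example.

```python
import re


def language(pattern: str, candidates: list[str]) -> list[str]:
    """Candidates matched in full by the pattern."""
    return [s for s in candidates if re.fullmatch(pattern, s)]


candidates = ["a", "ab", "abb", "abab", "ac", "bc", "abc"]

# Star binds tighter than concatenation: ab* means a(b*), not (ab)*.
print(language(r"ab*", candidates))    # ['a', 'ab', 'abb']
print(language(r"(ab)*", candidates))  # ['ab', 'abab']

# Concatenation binds tighter than union: a|bc means a|(bc), not (a|b)c.
print(language(r"a|bc", candidates))    # ['a', 'bc']
print(language(r"(a|b)c", candidates))  # ['ac', 'bc']
```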
Examples

• In the following instances, we assume that the alphabet
Σ is {0,1}.

1. 0*10* = {w | w contains a single 1}.
2. Σ*1Σ* = {w | w has at least one 1}, where Σ* = (0+1)*.
3. Σ*001Σ* = {w | w contains the string 001 as a substring}.
4. (ΣΣ)* = {w | w is a string of even length}.
5. (ΣΣΣ)* = {w | the length of w is a multiple of 3}.
6. 01 ∪ 10 = {01, 10}.
7. (0 ∪ ε)(1 ∪ ε) = {ε, 0, 1, 01}.
8. 1*∅ = ∅. Concatenating the empty set to any set yields the
empty set.
9. (0 ∪ 1)* consists of all possible strings of 0s and 1s.
10. (0Σ*) ∪ (Σ*1) consists of all strings that start with 0 or
end with 1.
11. The set of strings over {0,1} that end in three consecutive
1s: (0 | 1)*111
12. The set of strings over {0,1} that have at most one 1:
0* | 0*10*
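
One way to gain confidence in equalities like these is to test them exhaustively on all short strings. The sketch below does this with Python's re module, writing Σ as [01], ∪ as |, and ε as an empty alternative. The length bound of 8, the helper names, and the predicates are assumptions of this example. Example 8 is left out because Python has no literal for ∅, and example 10 is read as 0Σ* ∪ Σ*1.

```python
import re
from itertools import product


def binary_strings(max_len: int):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)


# Each entry: (pattern in Python syntax, predicate for the described language).
EXAMPLES = [
    (r"0*10*", lambda w: w.count("1") == 1),                  # 1
    (r"[01]*1[01]*", lambda w: "1" in w),                     # 2
    (r"[01]*001[01]*", lambda w: "001" in w),                 # 3
    (r"([01][01])*", lambda w: len(w) % 2 == 0),              # 4
    (r"([01][01][01])*", lambda w: len(w) % 3 == 0),          # 5
    (r"01|10", lambda w: w in {"01", "10"}),                  # 6
    (r"(0|)(1|)", lambda w: w in {"", "0", "1", "01"}),       # 7
    (r"(0|1)*", lambda w: True),                              # 9
    (r"0[01]*|[01]*1",
     lambda w: w.startswith("0") or w.endswith("1")),         # 10
    (r"(0|1)*111", lambda w: w.endswith("111")),              # 11
    (r"0*|0*10*", lambda w: w.count("1") <= 1),               # 12
]


def agrees(pattern: str, predicate, max_len: int = 8) -> bool:
    """True when pattern and predicate agree on every string up to max_len."""
    return all(
        (re.fullmatch(pattern, w) is not None) == predicate(w)
        for w in binary_strings(max_len)
    )


print(all(agrees(p, f) for p, f in EXAMPLES))
```

Agreement on short strings is evidence, not a proof, but it catches most slips, such as a missing star.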
Homework

• Write a regular expression for each of the following
languages:

1. {w | w starts with a 0 or a 1, followed by any number of 0s}
2. {w | w contains the string 101 as a substring}
3. {w | w starts with the string 11 and ends with 10}
4. {w | w starts and ends with the same symbol}
5. {w | w contains at least three 1s}
Equivalence with Finite
Automata
• Every regular language is FA-recognizable, i.e., any RE
can be converted into a finite automaton that recognizes
the language it describes, and vice versa. Recall that a
regular language is one that is recognized by some finite
automaton.

• Note: A language is regular if and only if some regular
expression describes it.
Example 1

• We convert the regular expression (ab ∪ a)* to an NFA in a
sequence of stages. We build up from the smallest
subexpressions to larger subexpressions until we have an
NFA for the original expression, as shown in the following
diagram.
Example 2

• (a ∪ b)*aba
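
To make the equivalence concrete for Example 2, the sketch below simulates a small NFA for (a ∪ b)*aba by tracking the set of states the machine could be in. The NFA is hand-built for this example (four states, no ε-moves) and is not necessarily the one drawn in the slide's diagram; the state numbering and function names are assumptions. It is then compared with Python's re module on all short strings.

```python
import re
from itertools import product

# Hand-built NFA for (a ∪ b)*aba: start state 0, accept state 3.
# State 0 loops on a and b and "guesses" where the final aba begins.
# A missing entry means the NFA has no move on that symbol.
TRANSITIONS = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "a"): {3},
}
START = 0
ACCEPT = {3}


def nfa_accepts(word: str) -> bool:
    """Simulate the NFA by following every possible state at once."""
    current = {START}
    for symbol in word:
        current = {
            nxt
            for state in current
            for nxt in TRANSITIONS.get((state, symbol), set())
        }
    return bool(current & ACCEPT)


def agrees_with_regex(max_len: int = 8) -> bool:
    """Compare the NFA with the regex on all strings over {a, b}."""
    pattern = re.compile(r"(a|b)*aba")
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if nfa_accepts(word) != (pattern.fullmatch(word) is not None):
                return False
    return True


print(nfa_accepts("bbaba"), nfa_accepts("abab"))  # True False
print(agrees_with_regex())                        # True
```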
Look Ahead and Look Behind
collectively called "lookaround"

You can put assertions such as lookahead or lookbehind in your
pattern to ensure that a substring does or does not occur at a
given position. These "lookaround" assertions are written as a
parenthesized group containing the substring to check for,
with one of the following leading characters:

• ?= (positive lookahead),

• ?! (negative lookahead),
• ?<= (positive lookbehind),
• ?<! (negative lookbehind).
Look Ahead and Look Behind…
cont’d
• Use ?! (negative lookahead) when a specific substring must
not appear at a given position, here at the beginning of the
string.

• Ex: ^(?!101)[01]* // Does not have 101 at the beginning of
the string.
Look Ahead and Look Behind…
cont’d

• Use ?= (positive lookahead) when a specific substring must
appear at a given position, here at the beginning of the
string.

• Ex: ^(?=101)[01]* // Must have 101 at the beginning of the
string.
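
Both lookahead examples can be tried with Python's re module, which supports the ?= and ?! syntax. In this sketch, fullmatch is used so that the whole input must be a binary string; the constant and helper names are made up for this example.

```python
import re

NOT_STARTING_WITH_101 = re.compile(r"^(?!101)[01]*")
STARTING_WITH_101 = re.compile(r"^(?=101)[01]*")


def whole(pattern: re.Pattern, text: str) -> bool:
    """True when the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


print(whole(NOT_STARTING_WITH_101, "0101"))  # True
print(whole(NOT_STARTING_WITH_101, "1010"))  # False: begins with 101
print(whole(STARTING_WITH_101, "1010"))      # True
print(whole(STARTING_WITH_101, "0101"))      # False

# A lookahead consumes no characters: the 101 it checked is still
# matched afterwards by [01]*.
print(STARTING_WITH_101.fullmatch("1011").group(0))  # 1011
```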
Look Ahead and Look Behind…
cont’d
• Use ?<! (negative lookbehind) when a specific substring must
not appear just before a given position, here at the end of
the string.
Ex: ^[01]*(?<!101)$ // Does not end with 101

• Use ?<= (positive lookbehind) when a specific substring must
appear just before a given position, here at the end of the
string.

Ex: ^[01]*(?<=101)$ // Must end with 101

• Note: when a lookbehind is used to test how the string ends,
always specify the end position with $; otherwise the
lookbehind only checks the text before wherever the match
happens to stop.
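
The lookbehind examples, and the reason for the $ in the note, can be checked with Python's re module, which supports fixed-width lookbehind. The constant and helper names are assumptions of this sketch, and the third pattern is the negative-lookbehind example with the $ deliberately left off.

```python
import re

DOES_NOT_END_WITH_101 = re.compile(r"^[01]*(?<!101)$")
ENDS_WITH_101 = re.compile(r"^[01]*(?<=101)$")
WITHOUT_DOLLAR = re.compile(r"^[01]*(?<!101)")  # same check, no $


def found(pattern: re.Pattern, text: str) -> bool:
    """True when the pattern matches somewhere in the text."""
    return pattern.search(text) is not None


print(found(DOES_NOT_END_WITH_101, "110"))   # True
print(found(DOES_NOT_END_WITH_101, "1101"))  # False: ends with 101
print(found(ENDS_WITH_101, "1101"))          # True
print(found(ENDS_WITH_101, "110"))           # False

# Without $, the engine backtracks to a shorter match that does not
# end in 101, so the string is wrongly accepted.
print(WITHOUT_DOLLAR.search("1101").group(0))  # 110
```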
