Python Re

This document provides an overview of regular expressions in Python, detailing their syntax and usage for string manipulation tasks such as searching, matching, and replacing. It explains various patterns and functions, including how to compile regular expressions and use match objects. Additionally, it includes an example of implementing a Pig Latin converter using regular expressions.

Uploaded by

Azddine Elhamdaoui

Python regular expressions
Regular Expressions

▪ Regular expressions are a powerful string manipulation tool
▪ Most modern languages have similar library packages for regular expressions
▪ Use regular expressions to:
• Search a string (search and match)
• Replace parts of a string (sub)
• Break strings into smaller pieces (split)
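The three uses listed above can be sketched in a few lines. This is a minimal illustration, not part of the original slides; the sample string and variable names are made up for the example.

```python
import re

text = "2 cats, 10 dogs"  # sample input, chosen for illustration

# Search: find the first run of digits anywhere in the string.
first = re.search(r"\d+", text)
first_number = first.group() if first else None

# Replace: swap every run of digits for a placeholder.
masked = re.sub(r"\d+", "N", text)

# Split: break the string on a comma followed by optional whitespace.
parts = re.split(r",\s*", text)

print(first_number)  # 2
print(masked)        # N cats, N dogs
print(parts)         # ['2 cats', '10 dogs']
```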
Regular Expression Syntax
▪ Most characters match themselves
The regular expression "test" matches the string 'test', and only that string
▪ [x] matches any one of a list of characters
"[abc]" matches 'a', 'b', or 'c'
▪ [^x] matches any one character that is not included in x
"[^abc]" matches any single character except 'a', 'b', or 'c'
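A quick way to try literal patterns and character classes, assuming a small helper named matches (hypothetical, not from the slides) that tests whether a whole string matches:

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"test", "test"))    # True
print(matches(r"test", "tests"))   # False: one character too many

# findall lists every character that the class accepts, in order.
in_class = re.findall(r"[abc]", "cabbage")
not_in_class = re.findall(r"[^abc]", "cabbage")
print(in_class)      # ['c', 'a', 'b', 'b', 'a']
print(not_in_class)  # ['g', 'e']
```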
Regular Expression Syntax

▪ "." matches any single character except a newline
▪ Parentheses can be used for grouping
"(abc)+" matches 'abc', 'abcabc', 'abcabcabc', etc.
▪ x|y matches x or y
"this|that" matches 'this' and 'that', but not 'thisthat'.
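The dot, grouping, and alternation can be checked the same way. The sketch below uses re.fullmatch so that a pattern has to cover the entire string; the helper name matches is an assumption made for the example.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# "." accepts any character except a newline, so "c\nt" is skipped.
dot_hits = re.findall(r"c.t", "cat cot c\nt cut")
print(dot_hits)  # ['cat', 'cot', 'cut']

# Grouping: the + applies to the whole group "abc".
print(matches(r"(abc)+", "abcabc"))  # True
print(matches(r"(abc)+", "abcab"))   # False

# Alternation: either side may match, but not both in sequence.
print(matches(r"this|that", "that"))      # True
print(matches(r"this|that", "thisthat"))  # False
```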
Regular Expression Syntax

▪ x* matches zero or more x's
"a*" matches '', 'a', 'aa', etc.
▪ x+ matches one or more x's
"a+" matches 'a', 'aa', 'aaa', etc.
▪ x? matches zero or one x
"a?" matches '' or 'a'.
▪ x{m,n} matches i x's, where m ≤ i ≤ n
"a{2,3}" matches 'aa' or 'aaa'
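The quantifiers can be compared side by side. This is a small sketch with a hypothetical helper, matches, that requires the whole string to match; note that the bounds in {m,n} are inclusive and written without a space.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"a*", ""))    # True: zero repetitions are allowed
print(matches(r"a+", ""))    # False: at least one 'a' is required
print(matches(r"a?", "aa"))  # False: at most one 'a' is allowed

# {2,3} is greedy: in 'aaaa' it takes three and leaves a lone 'a' unmatched.
runs = re.findall(r"a{2,3}", "a aa aaa aaaa")
print(runs)  # ['aa', 'aaa', 'aaa']
```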
Regular Expression Syntax
▪ "\d" matches any digit; "\D" matches any non-digit
▪ "\s" matches any whitespace character; "\S" matches any non-whitespace character
▪ "\w" matches any alphanumeric character or underscore; "\W" matches any other character
▪ "^" matches the beginning of the string; "$" matches the end of the string
▪ "\b" matches a word boundary; "\B" matches a position that is not a word boundary
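A short sketch of the shorthand classes and anchors in use. The sample strings are invented for the example; raw strings (the r prefix) keep Python from interpreting the backslashes itself.

```python
import re

text = "Room 42, floor 3"  # sample input, chosen for illustration

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of word characters
pieces = re.split(r"\s+", text)     # split on whitespace
print(digits)  # ['42', '3']
print(words)   # ['Room', '42', 'floor', '3']
print(pieces)  # ['Room', '42,', 'floor', '3']

# \w also accepts the underscore.
print(re.findall(r"\w+", "snake_case"))  # ['snake_case']

# ^ and $ tie the pattern to the ends of the string.
starts_with_room = re.search(r"^Room", text) is not None
ends_with_floor = re.search(r"floor$", text) is not None
print(starts_with_room, ends_with_floor)  # True False

# \b keeps 'cat' from matching inside 'concat'.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")
print(whole_words)  # ['cat', 'cat']
```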
Search and Match
▪ The two basic functions are re.search and re.match
• Search looks for a pattern anywhere in a string
• Match looks for a match starting at the beginning
▪ Both return None (logically false) if the pattern is not found and a "match object" if it is
>>> import re
>>> pat = "a*b"
>>> re.search(pat, "fooaaabcde")
<re.Match object; span=(3, 7), match='aaab'>
>>> re.match(pat, "fooaaabcde")
>>>
Q: What's a match object?
▪ A: an instance of the re.Match class with the details of the match result
>>> pat = "a*b"
>>> r1 = re.search(pat, "fooaaabcde")
>>> r1.group() # group returns string matched
'aaab'
>>> r1.start() # index of the match start
3
>>> r1.end() # index of the match end
7
>>> r1.span() # tuple of (start, end)
(3, 7)
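Because a failed search returns None, the result is usually tested before its methods are called. The sketch below assumes a helper named locate (hypothetical) that uses span() to cut the input into the text before, inside, and after the first match.

```python
import re

def locate(pattern, text):
    """Split text around the first match (hypothetical helper)."""
    m = re.search(pattern, text)
    if m is None:  # no match anywhere in text
        return None
    start, end = m.span()
    # text[start:end] is the same string that m.group() returns.
    return text[:start], m.group(), text[end:]

print(locate("a*b", "fooaaabcde"))  # ('foo', 'aaab', 'cde')
print(locate("a*b", "xyz"))         # None
```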
What got matched?
▪ Here's a pattern to match simple email addresses
\w+@(\w+\.)+(com|org|net|edu)

>>> pat1 = r"\w+@(\w+\.)+(com|org|net|edu)"
>>> r1 = re.match(pat1, "finin@cs.umbc.edu")
>>> r1.group()
'finin@cs.umbc.edu'

▪ We might want to extract the pattern parts, like the email name and host
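One caveat when using this pattern for validation: re.match anchors only the start of the string, so trailing text is ignored. A minimal sketch of the difference, using re.fullmatch (a standard re function not covered in the slides) and a made-up input:

```python
import re

pat1 = r"\w+@(\w+\.)+(com|org|net|edu)"
candidate = "finin@cs.umbc.education"  # made-up input with a bogus ending

# match succeeds on the leading part and stops at 'edu'.
prefix = re.match(pat1, candidate).group()
print(prefix)  # finin@cs.umbc.edu

# fullmatch requires the pattern to cover the entire string.
whole = re.fullmatch(pat1, candidate)
print(whole)   # None
```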
What got matched?
▪ We can put parentheses around groups we want to be able to reference
>>> pat2 = r"(\w+)@((\w+\.)+(com|org|net|edu))"
>>> r2 = re.match(pat2, "finin@cs.umbc.edu")
>>> r2.group(1)
'finin'
>>> r2.group(2)
'cs.umbc.edu'
>>> r2.groups()
('finin', 'cs.umbc.edu', 'umbc.', 'edu')
▪ Note that the groups are numbered by the position of their opening parenthesis, left to right (a preorder traversal of the nesting tree). A repeated group such as (\w+\.)+ keeps only its last repetition, here 'umbc.'
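Group numbering is easier to see on a pattern with obvious nesting. The date-like pattern below is invented for the illustration and is not from the slides.

```python
import re

# Opening parentheses, left to right: group 1 wraps groups 2 and 3; group 4 follows.
m = re.match(r"((\d+)-(\d+))-(\d+)", "2024-05-17")

print(m.group(0))  # 2024-05-17 (group 0 is the whole match)
print(m.groups())  # ('2024-05', '2024', '05', '17')
print(m.span(3))   # (5, 7): where group 3 sits in the input
```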
What got matched?
▪ We can 'label' the groups as well…
>>> pat3 = r"(?P<name>\w+)@(?P<host>(\w+\.)+(com|org|net|edu))"
>>> r3 = re.match(pat3, "finin@cs.umbc.edu")
>>> r3.group('name')
'finin'
>>> r3.group('host')
'cs.umbc.edu'
▪ And reference the matching parts by the labels
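Named groups can also be collected in one step with the match object's groupdict() method, which returns only the labelled groups. A minimal sketch using the same pattern:

```python
import re

pat3 = r"(?P<name>\w+)@(?P<host>(\w+\.)+(com|org|net|edu))"
r3 = re.match(pat3, "finin@cs.umbc.edu")

parts = r3.groupdict()  # unnamed groups are left out
print(parts)            # {'name': 'finin', 'host': 'cs.umbc.edu'}

# Named groups still have numbers, so both forms give the same text.
print(r3.group("host") == r3.group(2))  # True
```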
More re functions
▪ re.split() is like str.split() but can use patterns
>>> re.split(r"\W+", "This... is a test, short and sweet, of split().")
['This', 'is', 'a', 'test', 'short', 'and', 'sweet', 'of', 'split', '']
▪ re.sub() substitutes a string for every match of a pattern
>>> re.sub('(blue|white|red)', 'black', 'blue socks and red shoes')
'black socks and black shoes'
▪ re.findall() finds all matches
>>> re.findall(r"\d+", "12 dogs,11 cats, 1 egg")
['12', '11', '1']
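These functions take a few options that the examples above do not show. The sketch below covers maxsplit and count, backreferences in the replacement string, and how findall behaves when the pattern contains groups; the inputs are made up for the example.

```python
import re

# maxsplit limits how many splits are made.
head_tail = re.split(r"\W+", "a, b; c", maxsplit=1)
print(head_tail)  # ['a', 'b; c']

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\d+) (dogs|cats)", r"\2: \1", "12 dogs,11 cats")
print(swapped)    # dogs: 12,cats: 11

# count limits how many matches are replaced.
first_only = re.sub(r"\d+", "N", "1 2 3", count=1)
print(first_only)  # N 2 3

# With more than one group, findall returns a tuple per match.
pairs = re.findall(r"(\d+) (\w+)", "12 dogs,11 cats, 1 egg")
print(pairs)      # [('12', 'dogs'), ('11', 'cats'), ('1', 'egg')]
```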
Compiling regular expressions
▪ If you plan to use a re pattern more than once, compile it to a pattern object
▪ Python produces a special data structure that speeds up matching
>>> cpat3 = re.compile(pat3)
>>> cpat3
re.compile('(?P<name>\\w+)@(?P<host>(\\w+\\.)+(com|org|net|edu))')
>>> r3 = cpat3.search("finin@cs.umbc.edu")
>>> r3
<re.Match object; span=(0, 17), match='finin@cs.umbc.edu'>
>>> r3.group()
'finin@cs.umbc.edu'
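A typical reason to compile is to apply one pattern to many strings. The sketch below reuses the labelled email pattern over a list of made-up inputs and keeps the host of each one that matches.

```python
import re

email = re.compile(r"(?P<name>\w+)@(?P<host>(\w+\.)+(com|org|net|edu))")

# Sample inputs, invented for the illustration.
candidates = ["finin@cs.umbc.edu", "not an address", "ann@example.org"]

# email.match returns None for non-matching strings, which the filter drops.
hosts = [m.group("host") for m in map(email.match, candidates) if m]
print(hosts)  # ['cs.umbc.edu', 'example.org']
```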
Pattern object methods
▪ There are methods defined for a pattern object that parallel the regular expression functions, e.g.,
• match
• search
• split
• findall
• sub
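A brief sketch of these methods on compiled patterns. The pattern names word and sep are made up for the example. One difference from the module-level functions is that the pattern methods match and search accept an optional start position.

```python
import re

word = re.compile(r"\w+")
sep = re.compile(r",\s*")

print(word.findall("one, two"))   # ['one', 'two']
print(word.sub("X", "one, two"))  # X, X
print(sep.split("a, b,c"))        # ['a', 'b', 'c']

# match normally starts at index 0; the second argument moves the start.
print(word.match("  hi"))             # None
print(word.match("  hi", 2).group())  # hi
```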
Example: pig latin

▪ Rules
• If word starts with consonant(s)
- Move them to the end, append "ay"
• Else word starts with vowel(s)
- Keep as is, but add "zay"
• How might we do this?
The pattern

([bcdfghjklmnpqrstvwxyz]+)(\w+)
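Before using the pattern, it helps to see what its two groups capture. A small sketch on a few sample words (chosen for the illustration):

```python
import re

cpat = re.compile(r"([bcdfghjklmnpqrstvwxyz]+)(\w+)")

# Group 1 takes the leading consonants, group 2 the rest of the word.
print(cpat.match("string").groups())  # ('str', 'ing')

# A word that starts with a vowel does not match at all.
print(cpat.match("apple"))            # None

# An all-consonant word: group 1 gives back one letter so that \w+ can match.
print(cpat.match("rhythm").groups())  # ('rhyth', 'm')
```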
piglatin.py

import re

# One or more leading consonants, then the rest of the word
pat = r'([bcdfghjklmnpqrstvwxyz]+)(\w+)'
cpat = re.compile(pat)

def piglatin(string):
    """Convert each whitespace-separated word to pig latin."""
    return " ".join(piglatin1(w) for w in string.split())

def piglatin1(word):
    """Convert a single lowercase word to pig latin."""
    match = cpat.match(word)
    if match:
        consonants = match.group(1)
        rest = match.group(2)
        return rest + consonants + "ay"
    else:
        return word + "zay"
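As a variation, the same rules can be applied with re.sub and a replacement function, which leaves punctuation and spacing untouched. This is an alternative sketch, not the slides' implementation; the names WORD, CONSONANT_START, and piglatin_sub are hypothetical.

```python
import re

WORD = re.compile(r"\w+")
CONSONANT_START = re.compile(r"([bcdfghjklmnpqrstvwxyz]+)(\w+)")

def _convert(match):
    """Replacement callback: convert one matched word."""
    word = match.group()
    m = CONSONANT_START.match(word)
    if m:
        # Move the leading consonants to the end and append "ay".
        return m.group(2) + m.group(1) + "ay"
    return word + "zay"

def piglatin_sub(text):
    """Convert every word in text, keeping everything between words as is."""
    return WORD.sub(_convert, text)

print(piglatin_sub("pig latin, is fun!"))  # igpay atinlay, iszay unfay!
```

Like the original, this handles lowercase words only; a word that starts with an uppercase consonant falls through to the "zay" branch.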
