Python Unit-2 Notes

Uploaded by

Prasad Dhumale
Copyright
© © All Rights Reserved

UNIT-2

STRINGS:
A string is a sequence of characters enclosed within a pair of
single quotes ' ', double quotes " " or triple quotes ''' '''.

Any symbol used while writing a program is called a character. For
example, the English alphabet has 26 letters, and each of them is a
character.

Computers do not deal with characters; they deal with numbers
(binary). Even though you may see characters on your screen,
internally they are stored and manipulated as combinations of 0s and 1s.

The conversion of a character to a number is called encoding, and
the reverse process is decoding. ASCII and Unicode are some of the
popular encodings used.

In Python, a string is a sequence of Unicode characters. Unicode was
introduced to include every character in all languages and bring
uniformity in encoding.

Strings are immutable. This means that once defined, they cannot
be changed.

About Unicode

Today’s programs need to be able to handle a wide variety of
characters. Applications are often internationalized to display
messages and output in a variety of user-selectable languages; the
same program might need to output an error message in English,
French, Japanese, Hebrew, or Russian. Web content can be written in
any of these languages and can also include a variety of emoji
symbols. Python’s string type uses the Unicode Standard for
representing characters, which lets Python programs work with all
these different possible characters.

Unicode is a specification that aims to list every character used by
human languages and give each character its own unique code. The
Unicode specifications are continually revised and updated to add
new languages and symbols.

A character is the smallest possible component of a text. ‘A’, ‘B’, ‘C’,
etc., are all different characters. So are ‘È’ and ‘Í’. Characters vary
depending on the language or context you’re talking about. For
example, there’s a character for “Roman Numeral One”, ‘Ⅰ’, that’s
separate from the uppercase letter ‘I’. They’ll usually look the same,
but these are two different characters that have different meanings.

The Unicode standard describes how characters are represented
by code points. A code point value is an integer in the range 0 to
0x10FFFF (about 1.1 million values, the actual number assigned is
less than that).

A Unicode string is a sequence of code points, which
are numbers from 0 through 0x10FFFF (1,114,111
decimal). This sequence of code points needs to be represented in
memory as a set of code units, and code units are then mapped to
8-bit bytes.

UTF-8 is one of the most commonly used encodings, and Python
often defaults to using it. UTF stands for “Unicode Transformation
Format”, and the ‘8’ means that 8-bit values are used in the
encoding. (There are also UTF-16 and UTF-32 encodings, but they are
less frequently used than UTF-8.)

UTF-8 is recommended practice for encoding data to be exchanged
between systems.
UTF-8 is backward compatible with ASCII: any ASCII text is also valid
UTF-8.
In Python 3, all strings are Unicode internally.
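
The relationship between a string (code points) and its UTF-8 bytes can be seen with a small sketch. The sample text below is only an illustration built from characters mentioned above; str.encode() and bytes.decode() are standard string and bytes methods.

```python
# A str holds code points; encode() turns it into UTF-8 bytes and
# decode() turns those bytes back into a str.
text = "A È Ⅰ"
encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

# ASCII characters need one byte in UTF-8, other characters need more.
sizes = {ch: len(ch.encode("utf-8")) for ch in "AÈⅠ"}

print(encoded)
print(decoded == text)
print(sizes)
```

The number of bytes per character varies, which is why the length of the encoded bytes can be larger than len(text).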

chr() function
The chr() function returns a character (a string of length one) for an
integer that represents the Unicode code point of the character.

The syntax of chr() is:

chr(i)

chr() Parameters

The chr() function takes a single parameter, an integer i.
The valid range of the integer is from 0 through 1,114,111 (0x10FFFF).

Return Value from chr()

chr() returns a character (a string) whose Unicode code point is the
integer i. If the integer i is outside the range, ValueError is raised.

Example 1: How chr() works?

print(chr(97))
print(chr(65))
print(chr(1200))

Output

a
A
Ұ

Example 2: Integer passed to chr() is out of the range

print(chr(-1))

Output

Traceback (most recent call last):
File "", line 1, in
ValueError: chr() arg not in range(0x110000)

When you run the program, ValueError is raised because the argument
passed to chr() is outside the valid range.

The reverse operation of chr() is performed by the ord() function.

ord() function
The ord() function returns an integer: the Unicode code point of the
given character.

The syntax of ord() is:

ord(ch)

ord() Parameters

The ord() function takes a single parameter:
• ch - a string containing exactly one Unicode character

Return value from ord()

The ord() function returns the integer code point of the character ch.
Passing a string whose length is not 1 raises TypeError.
Example: How ord() works in Python?

print(ord('5')) # 53
print(ord('A')) # 65
print(ord('$')) # 36

Output

53
65
36
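
Since chr() and ord() are inverses, they can be combined. The following is a minimal sketch; shift_letter is a hypothetical helper name introduced only for this illustration.

```python
def shift_letter(ch, offset):
    """Hypothetical helper: return the character `offset` code points after `ch`."""
    return chr(ord(ch) + offset)

round_trip = chr(ord('a'))        # ord gives 97, chr turns it back into 'a'
shifted = shift_letter('A', 2)    # 65 + 2 = 67, which is 'C'
print(round_trip, shifted)

try:
    chr(0x110000)                 # one past the largest valid code point
except ValueError as err:
    print("ValueError:", err)
```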

Creating strings in Python


Strings can be created by enclosing characters inside single quotes
or double quotes. Triple quotes can also be used, but they are
generally reserved for multiline strings and docstrings.

# defining strings in Python


# all of the following are equivalent
my_string = 'Hello'
print(my_string)

my_string = "Hello"
print(my_string)

my_string = '''Hello'''
print(my_string)
# triple quotes string can extend multiple lines
my_string = '''Hello, welcome to
the world of Python'''
print(my_string)

Reading and Converting strings

• We prefer to read data in as strings and then parse and convert
the data as needed.
• This gives us more control over error situations and bad user
input.
• Numbers read from raw input must be converted from strings.

Let us consider an example:

n=input("Enter the number")

In the above example we have not specified any data type. The input()
function always returns what the user typed as a string, so n is a
string even when the user enters digits. To use it as a number, it must
be converted explicitly, for example with int(n) or float(n).
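
The "read as a string, then convert" approach can be sketched as follows. The helper name to_int is an assumption made for this example, and fixed strings are used in place of input() so the snippet runs without a keyboard.

```python
def to_int(text):
    """Hypothetical helper: convert text to an int, or return None if it is not a valid integer."""
    try:
        return int(text.strip())
    except ValueError:
        return None

print(to_int("42"))      # a valid number
print(to_int(" 7 "))     # surrounding spaces are removed first
print(to_int("seven"))   # bad input is reported as None instead of crashing
```

Catching ValueError is what gives the program control over bad user input.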

Representation or Accessing the string contents

We can get at any single character in a string using an index specified
in square brackets.
The index value must be an integer and starts at zero.
The index value can be an expression that is computed.

Example:
Let us take the string as
str1='REVA' # Creating a string

Representation of string:

R E V A
0 1 2 3

Write a python program to read and print a string and print the
character present at a specific index
str1=input("Enter a string")
print("String entered=", str1)
print("Character present at index 1 is", str1[1])

You will get a Python error (IndexError) if you attempt to index beyond
the end of a string.

So be careful when constructing index values and slices.

There is a built-in function len( ) that gives us the length of a string.

Example:

str1='REVA'

print(len(str1))

Output:

4
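
The link between len() and the largest valid index can be checked with a short sketch (variable names here are illustrative only):

```python
str1 = 'REVA'

last = str1[len(str1) - 1]   # the last valid index is len(str1) - 1
also_last = str1[-1]         # a negative index counts from the end

try:
    str1[len(str1)]          # one position beyond the end
except IndexError as err:
    message = str(err)

print(last, also_last)
print("IndexError:", message)
```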

Slicing a string
Slicing in Python is a feature that enables accessing parts of
sequences like strings, tuples, and lists. You can also use slices to
modify or delete the items of mutable sequences such as lists.

• We can look at any continuous section of a string using the
colon operator, as in str1[start:end].
• The second number is one beyond the end of the slice: "up to
but not including".
• If the second number is beyond the end of the string, the slice
stops at the end.
• If we leave off the first number or the last number of the slice,
it is assumed to be the beginning or end of the string
respectively.
• The indexes can also be negative numbers, which count backwards
from the end of the string (-1 is the last character).

Write a python program to extract the substrings 'Hi'
'Welcome' 'REVA' 'UNIVERSITY' from the following string
str1='Hi, Welcome to REVA UNIVERSITY'

Program:

str1='Hi, Welcome to REVA UNIVERSITY'

print("Sub string 1 is", str1[0:2])    # Hi
print("Sub string 2 is", str1[4:11])   # Welcome
print("Sub string 3 is", str1[15:19])  # REVA
print("Sub string 4 is", str1[20:])    # UNIVERSITY
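
The slicing rules listed earlier (omitted bounds, an end beyond the string, negative indexes) can be tried on a short string. This is a small sketch with a sample string chosen only for illustration.

```python
s = 'PYTHON'

head = s[:2]         # first number omitted: starts at the beginning
tail = s[-3:]        # negative start: the last three characters
clipped = s[2:100]   # end beyond the string: stops at the end
middle = s[-4:-1]    # both indexes negative, end still excluded
backwards = s[::-1]  # a step of -1 walks the string in reverse

print(head, tail, clipped, middle, backwards)
```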

A substring "substr" between index1 and index2 is to be
extracted from the given input string "str1", which is read
using input(), and displayed using a user defined function.

str1=input("Enter the string")

def slicing():
    # read both indexes, then slice the global string str1
    m=int(input("Enter the starting index value or index1"))
    n=int(input("Enter the ending index value or index2"))
    substr=str1[m:n]
    print("String extracted between index1 and index2 is",substr)

slicing()

Strings containing multiple words are to be read from the user
one at a time, and each string is to be processed as follows.

i) Convert the string to uppercase.

str1=input('Enter the string')
print('String in Upper case is',str1.upper())

ii) Split the words of the string using space as the separation
character.

str1=input('Enter the string')
print("String split using space as the separation character is",str1.split(' '))

Changing the string contents


In python, strings are immutable, once created, we cannot alter the
contents of a string.

Example:

my_string = 'REVA'
my_string[3] = 'a'

TypeError: 'str' object does not support item assignment

Deleting the content of a string

Example:
my_string = 'REVA'
del my_string[1]

TypeError: 'str' object doesn't support item deletion

Example:

del my_string
my_string

NameError: name 'my_string' is not defined
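
Because a string cannot be changed or partly deleted in place, the usual workaround is to build a new string from slices of the old one. A minimal sketch, using illustrative variable names:

```python
my_string = 'REVA'

# "Change" index 3 by joining the first three characters with a new one.
changed = my_string[:3] + 'a'

# "Delete" index 1 by joining the parts before and after it.
removed = my_string[:1] + my_string[2:]

print(changed, removed)
print(my_string)   # the original string is untouched
```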

Concatenation of two or more strings


We can concatenate two strings by using the ‘+’ operator.

Write a python program to read two strings and concatenate them
s1=input("Enter first string")
s2=input("Enter second string")
s3=s1+s2
print("Concatenated string is",s3)

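Concatenation using parentheses

String literals placed next to each other are joined automatically.
Wrapping them in parentheses (round brackets) lets us spread two or
more literals over several lines.

Example:
s=('Hi'
' REVA'
' UNIVERSITY')
print(s)

Output:
Hi REVA UNIVERSITY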

Iterating through string

Example:
str1='SHIVA KUMAR'
for i in str1:
    print(i)

Using in as an operator

The in keyword can also be used to check whether one string
is "in" another string.
The in expression is a logical expression that returns True or False
and can be used in an if statement.

Example:
str1='REVA'
print('E' in str1)    # True
print('e' in str1)    # False, the check is case sensitive
print('REV' in str1)  # True
str2='ECE'
if 'EC' in str2:
    print('Found the String')
else:
    print('Not Found')

String Comparison
Strings are compared character by character, from left to right.
Two strings are equal only if they have the same length and every
pair of corresponding characters is the same. At the first
mismatching position, the Unicode values (code points) of the two
characters are compared, and the string with the larger value at
that position is the greater string. If one string ends before any
mismatch is found, the shorter string is the smaller one.

Write a python program to compare two strings

str1='CSE'
str2='ECE'
if str1 > str2:
    print('str1 is greater than str2')
elif str1 < str2:
    print('str2 is greater than str1')
else:
    print('str1 is equal to str2')
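
The comparison rule can be made visible by locating the first mismatching position and looking at the code points there. The helper first_difference is a hypothetical name used only for this sketch.

```python
def first_difference(a, b):
    """Hypothetical helper: index of the first mismatching character, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None

pos = first_difference('CSE', 'ECE')
print(pos, ord('C'), ord('E'))     # mismatch at index 0: 67 versus 69

less = 'CSE' < 'ECE'               # decided by 'C' < 'E'
case_matters = 'apple' < 'Apple'   # lowercase 'a' (97) is larger than 'A' (65)
prefix = 'REV' < 'REVA'            # a prefix is smaller than the longer string
print(less, case_matters, prefix)
```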

String in-built or built-in Functions/Methods:

capitalize()
The function converts the first character of the string to
uppercase and all remaining characters to lowercase.

Example:
str1='reva University'
str1.capitalize()

Output:
'Reva university'

lower()
The function is used to convert the entire string into lower
case.

Example:
str1='ELECTRONICS'
str1.lower()

Output:
electronics

upper()
The function is used to convert the entire string into Upper
case.

Example:
str1='electronics'
str1.upper()

Output:
ELECTRONICS

center()
The center() method takes two arguments:

• width - length of the returned string including padding characters
• fillchar (optional) - padding character

If fillchar is not provided, a space is used as the default.

The center() method returns a string padded with the specified
fillchar. It doesn't modify the original string.

Example:
str1='REVA'
str1.center(10)

Output:
'   REVA   '

Example:
str1='REVA'
str1.center(10,'*')

Output:
'***REVA***'

count()
The function is used to count the number of non-overlapping
occurrences of a character or substring.

Example:
str1='REVA UNIVERSITY'
str1.count('E')

Output:
2

Example:
str1='REVA UNIVERSITY'
str1.count('REVA')

Output:
1

find()
The function returns an integer value:

• If the substring exists inside the string, it returns the index of
the first occurrence of the substring.
• If the substring doesn't exist inside the string, it returns -1.

Example:
str1='REVA UNIVERSITY'
str1.find('A')

Output:
3

Example:
str1='REVA UNIVERSITY'
str1.find('UNI')

Output:
5

rfind()
The function returns the index of the last occurrence of the
substring, or -1 if it is not found.

Example:
str1='REVA UNIVERSITY'
str1.rfind('E')

Output:
9

strip()
The function returns a copy of the string with both leading and
trailing whitespace removed. An optional argument specifies a
different set of characters to remove.

Example:
str1=' REVA UNIVERSITY '
str1.strip()

Output:
'REVA UNIVERSITY'

lstrip()
The function returns a copy of the string with leading
whitespace (or the given characters) removed.

Example:
str1=' REVA UNIVERSITY '
str1.lstrip()

Output:
'REVA UNIVERSITY '

rstrip()
The function returns a copy of the string with trailing
whitespace (or the given characters) removed.

Example:
str1=' REVA UNIVERSITY '
str1.rstrip()

Output:
' REVA UNIVERSITY'
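
The strip family also accepts an optional argument naming the characters to remove. A brief sketch with made-up sample strings:

```python
raw = '***REVA UNIVERSITY***'

both = raw.strip('*')      # remove '*' from both ends
left = raw.lstrip('*')     # remove '*' from the left end only
right = raw.rstrip('*')    # remove '*' from the right end only

# The argument is a set of characters, not a prefix or suffix:
# any mix of 'x' and 'y' is removed from each end.
mixed = 'xxyREVAyx'.strip('xy')

print(both, left, right, mixed, sep='\n')
```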

replace()
The replace() method returns a copy of the string where all
occurrences of a substring are replaced with another substring.

Example:
str1='reva UNIVERSITY'
str1.replace('reva','REVA')

Output:
'REVA UNIVERSITY'

title()
The title() method returns a string with first letter of each
word capitalized; a title cased string.

Example:
str1='reva university'
str1.title()

Output:
'Reva University'

split()
The split() method breaks up a string at the specified
separator and returns a list of strings. The string splits at the
specified separator.
If the separator is not specified, any whitespace (space,
newline etc.) string is a separator.

Example:
str1='REVA UNIVERSITY'
str1.split()

Output:
['REVA', 'UNIVERSITY']

Example:
str1='COMPUTER SCIENCE'
str1.split('E')

Output:
['COMPUT', 'R SCI', 'NC', '']
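
A few more split() behaviours are worth seeing side by side: a limit on the number of splits, the difference between split() and split(' '), and join() as the reverse operation. The sample strings below are illustrative only.

```python
line = 'CSE,ECE,CIVIL'

parts = line.split(',')          # split at every comma
first_only = line.split(',', 1)  # second argument limits the number of splits
joined = ' '.join(parts)         # join() reverses a split

# With no separator, runs of whitespace count as one separator;
# with an explicit ' ', every single space splits.
default_split = 'a  b'.split()
space_split = 'a  b'.split(' ')

print(parts, first_only, joined, default_split, space_split, sep='\n')
```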

isalpha()
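The isalpha() method returns True if all characters in the
string are alphabets (letters) and the string is not empty. If not,
it returns False.

Example:
str1='REVA UNIVERSITY'
str1.isalpha()

Output:
False

The result is False because the space is not a letter.
'REVA'.isalpha() returns True.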

isalnum()
The isalnum() method returns True if all characters in the
string are alphanumeric (either alphabets or numbers). If not,
it returns False.

Example:
str1='CS 1 ECE 2'
str1.isalnum()

Output:
False

The result is False because spaces are not alphanumeric.
'CS1ECE2'.isalnum() returns True.

islower()
The islower() method returns True if all alphabets in a string
are lowercase alphabets. If the string contains at least one
uppercase alphabet, it returns False.

Example:
str1='cse'
str1.islower()

Output:
True

isupper()
The isupper() method returns True if all alphabets in a string
are uppercase alphabets. If the string contains at least one
lowercase alphabet, it returns False.

Example:
str1='CSE'
str1.isupper()

Output:
True
isdigit()
The isdigit() method returns True if all characters in a string
are digits. If not, it returns False.

Example:
str1='2021'
str1.isdigit()

Output:
True

startswith()
The startswith() method returns True if a string starts with
the specified prefix(string). If not, it returns False.

Example:
str1='REVA UNIVERSITY'
str1.startswith('R')

Output:
True

endswith()
The endswith() method returns True if a string ends with the
specified suffix (string). If not, it returns False.

Example:
str1='REVA UNIVERSITY'
str1.endswith('Y')

Output:
True

casefold()
The casefold() method is an aggressive lower() method which
converts strings to case folded strings for caseless matching.
The casefold() method removes all case distinctions present
in a string. It is used for caseless matching, i.e. ignores cases
when comparing.

Example:
str1='REVA'
str1.casefold()

Output:
reva
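
The difference between lower() and casefold() only shows up for characters outside plain ASCII. A common illustration, not taken from the notes above, is the German letter ß:

```python
a = 'Straße'
b = 'STRASSE'

# lower() keeps 'ß' as it is, so the two strings still differ.
same_lower = a.lower() == b.lower()

# casefold() maps 'ß' to 'ss', so the caseless comparison succeeds.
same_casefold = a.casefold() == b.casefold()

print(same_lower, same_casefold)
```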

index()
The index() method returns the index of a substring inside
the string (if found). If the substring is not found, it raises an
exception.
The index() method is similar to find() method for strings. The
only difference is that find() method returns -1 if the
substring is not found, whereas index() throws an exception.

Example:
str1='REVA'
str1.index('R')

Output:
0
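
The difference between find() and index() for a missing substring can be sketched as:

```python
str1 = 'REVA UNIVERSITY'

position = str1.find('Z')   # find() reports a missing substring with -1

try:
    str1.index('Z')         # index() raises ValueError instead
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

print(position, raised)
```

find() suits a simple if check, while index() suits code where a missing substring is a real error.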

Regular Expressions
A Regular Expression (RegEx) is a sequence of characters that defines
a search pattern. For example,

^a...s$
The above code defines a RegEx pattern. The pattern is: any five
character string starting with a and ending with s.
A pattern defined using RegEx can be used to match against a string.

Expression   String       Matched?
^a...s$      abs          No match
             alias        Match
             abyss        Match
             Alias        No match
             An abacus    No match

Specify Pattern Using RegEx


To specify regular expressions, metacharacters are used. In the
above example, ^ and $ are metacharacters.

MetaCharacters
Metacharacters are characters that are interpreted in a special way
by a RegEx engine. Here's a list of metacharacters:

[] . ^ $ * + ? {} () \ |
[] - Square brackets
Square brackets specify a set of characters you wish to match.

Expression   String       Matched?
[abc]        a            1 match
             ac           2 matches
             Hey Jude     No match
             abc de ca    5 matches

Here, [abc] will match if the string you are trying to match contains
any of the a, b or c.
You can also specify a range of characters using - inside square
brackets.
• [a-e] is the same as [abcde].
• [1-4] is the same as [1234].
• [0-39] is the same as [01239].

You can complement (invert) the character set by using the
caret ^ symbol at the start of a square bracket.
• [^abc] means any character except a or b or c.
• [^0-9] means any non-digit character.

. - Period
A period matches any single character (except newline '\n').
Expression   String   Matched?
..           a        No match
             ac       1 match
             acd      1 match
             acde     2 matches (contains 4 characters)

^ - Caret
The caret symbol ^ is used to check if a string starts with a certain
character.
Expression   String   Matched?
^a           a        1 match
             abc      1 match
             bac      No match
^ab          abc      1 match
             acb      No match (starts with a but not followed by b)

$ - Dollar
The dollar symbol $ is used to check if a string ends with a certain
character.
Expression   String    Matched?
a$           a         1 match
             formula   1 match
             cab       No match

* - Star
The star symbol * matches zero or more occurrences of the pattern
to its left.
Expression   String   Matched?
ma*n         mn       1 match
             man      1 match
             maaan    1 match
             main     No match (a is not followed by n)
             woman    1 match

+ - Plus
The plus symbol + matches one or more occurrences of the pattern
to its left.
Expression   String   Matched?
ma+n         mn       No match (no a character)
             man      1 match
             maaan    1 match
             main     No match (a is not followed by n)
             woman    1 match

? - Question Mark
The question mark symbol ? matches zero or one occurrence of the
pattern to its left.
Expression   String   Matched?
ma?n         mn       1 match
             man      1 match
             maaan    No match (more than one a character)
             main     No match (a is not followed by n)
             woman    1 match

{} - Braces
Consider this code: {n,m}. This means at least n, and at
most m repetitions of the pattern to its left.
Expression   String        Matched?
a{2,3}       abc dat       No match
             abc daat      1 match (at daat)
             aabc daaat    2 matches (at aabc and daaat)
             aabc daaaat   2 matches (at aabc and daaaat)

Let's try one more example. The RegEx [0-9]{2,4} matches at least 2
digits but not more than 4 digits.
Expression   String          Matched?
[0-9]{2,4}   ab123csde       1 match (123)
             12 and 345673   3 matches (12, 3456, 73)
             1 and 2         No match
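
The rows of the *, +, ? and {} tables can be checked in one go by putting the test words into a single string and calling re.findall(). This is only a sketch; combining the words into one string is a choice made for this example.

```python
import re

words = 'mn man maaan main woman'

star = re.findall(r'ma*n', words)       # zero or more a
plus = re.findall(r'ma+n', words)       # one or more a
optional = re.findall(r'ma?n', words)   # zero or one a

repeats = re.findall(r'a{2,3}', 'aabc daaaat')      # two or three a in a row
digits = re.findall(r'[0-9]{2,4}', '12 and 345673') # runs of 2 to 4 digits

print(star, plus, optional, repeats, digits, sep='\n')
```

Note that 'main' never matches, and 'woman' contributes the match 'man' in all three cases.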

| - Alternation
Vertical bar | is used for alternation (or operator).
Expression   String   Matched?
a|b          cde      No match
             ade      1 match (a in ade)
             acdbea   3 matches (a, b and a in acdbea)

Here, a|b matches any string that contains either a or b.

() - Group
Parentheses () are used to group sub-patterns. For example,
(a|b|c)xz matches any string that contains a or b or c followed by xz.
Expression   String      Matched?
(a|b|c)xz    ab xz       No match
             abxz        1 match (bxz in abxz)
             axz cabxz   2 matches (axz, and bxz in cabxz)


\ - Backslash
Backslash \ is used to escape various characters including all
metacharacters. For example,
\$a matches if a string contains $ followed by a. Here, $ is not
interpreted by the RegEx engine in a special way.
If you are unsure if a character has special meaning or not, you can
put \ in front of it. This makes sure the character is not treated in a
special way.

Special Sequences
Special sequences make commonly used patterns easier to write.
Here's a list of special sequences:

\A - Matches if the specified characters are at the start of a string.

Expression   String       Matched?
\Athe        the sun      Match
             In the sun   No match

\b - Matches if the specified characters are at the beginning or end
of a word.

Expression   String          Matched?
\bfoo        football        Match
             a football      Match
             afootball       No match
foo\b        the foo         Match
             the afoo test   Match
             the afootest    No match

\B - Opposite of \b. Matches if the specified characters are not at
the beginning or end of a word.

Expression   String          Matched?
\Bfoo        football        No match
             a football      No match
             afootball       Match
foo\B        the foo         No match
             the afoo test   No match
             the afootest    Match


\d - Matches any decimal digit. Equivalent to [0-9]

Expression   String     Matched?
\d           12abc3     3 matches (1, 2 and 3)
             Python     No match

\D - Matches any character that is not a decimal digit. Equivalent
to [^0-9]

Expression   String     Matched?
\D           1ab34"50   3 matches (a, b and ")
             1345       No match

\s - Matches where a string contains any whitespace character.
Equivalent to [ \t\n\r\f\v].

Expression   String         Matched?
\s           Python RegEx   1 match
             PythonRegEx    No match

\S - Matches where a string contains any non-whitespace character.
Equivalent to [^ \t\n\r\f\v].

Expression   String          Matched?
\S           a b             2 matches (a and b)
             (spaces only)   No match

\w - Matches any alphanumeric character (digits and alphabets).
Equivalent to [a-zA-Z0-9_]. By the way, underscore _ is also
considered an alphanumeric character.

Expression   String     Matched?
\w           12&": ;c   3 matches (1, 2 and c)
             %"> !      No match

\W - Matches any non-alphanumeric character. Equivalent to
[^a-zA-Z0-9_]

Expression   String   Matched?
\W           1a2%c    1 match (%)
             Python   No match

\Z - Matches if the specified characters are at the end of a string.

Expression   String                      Matched?
Python\Z     I like Python               1 match
             I like Python Programming   No match
             Python is fun.              No match
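
A few of the special sequences can be tried with the re module. The patterns are written as raw strings (r'...') so that Python passes the backslashes to the RegEx engine unchanged; the sample text is assembled from the strings in the tables above.

```python
import re

text = 'a football at 12abc3'

word_start = re.findall(r'\bfoo', text)   # foo at the start of a word
digits = re.findall(r'\d', text)          # every decimal digit
words = re.findall(r'\w+', text)          # runs of alphanumeric characters

at_start = re.search(r'\Athe', 'In the sun')      # 'the' is not at the start
at_end = re.search(r'Python\Z', 'I like Python')  # 'Python' is at the end

print(word_start, digits, words, sep='\n')
print(at_start, at_end.group())
```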

Now that we understand the basics of RegEx, let's discuss how to use
RegEx in Python code.

Python RegEx
Python has a module named re to work with regular expressions. To
use it, we need to import the module.

import re

The module defines several functions and constants to work with
RegEx.

re.search()
The re.search() method takes two arguments: a pattern and a string.
The method looks for the first location where the RegEx pattern
produces a match with the string.
If the search is successful, re.search() returns a match object; if not,
it returns None.

Syntax of the function:

s = re.search(pattern, str)

Write a python program to perform pattern matching using the
search() function.

import re

string = "Python is fun"

s = re.search('Python', string)

if s:
    print("pattern found inside the string")
else:
    print("pattern not found")

Here, s contains a match object.

s.start(), s.end(), s.span() and s.group()
The start() function returns the index of the start of the matched
substring. Similarly, end() returns the index just past the end of the
matched substring. The span() function returns a tuple containing the
start and end index of the matched part, and group() returns the
matched text.

>>> s.start()
0
>>> s.end()
6
>>> s.span()
(0, 6)
>>> s.group()
'Python'

re.match()

The re.match() method takes two arguments: a pattern and a string.
If the pattern is found at the start of the string, then the method
returns a match object. If not, it returns None.

Write a python program to perform pattern matching using the
match() function.

import re
pattern = '^a...s$'
test_string = 'abyss'
result = re.match(pattern, test_string)

if result:
    print("Search successful.")
else:
    print("Search unsuccessful.")

Here, we used the re.match() function to look for the pattern at the
start of test_string.
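
The difference between re.match() and re.search() shows when the pattern occurs later in the string. The sketch below also uses re.fullmatch(), a standard re function not covered above, which requires the whole string to fit the pattern.

```python
import re

string = 'Python is fun'

from_start = re.match('fun', string)    # 'fun' is not at the start
anywhere = re.search('fun', string)     # search() scans the whole string

whole = re.fullmatch('a...s', 'abyss')        # entire string fits the pattern
too_long = re.fullmatch('a...s', 'abyssal')   # extra characters at the end

print(from_start)
print(anywhere.span())
print(whole.group(), too_long)
```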

re.sub()

The syntax of re.sub() is:

re.sub(pattern, replace, string)

The method returns a string where matched occurrences are
replaced with the content of the replace variable.

If the pattern is not found, re.sub() returns the original string.

You can pass count as an optional argument to the re.sub() method. If
omitted, it defaults to 0, which replaces all occurrences.

Example1:

re.sub('^a','b','aaa')

Output:
'baa'

Example2:

s=re.sub('a','b','aaa')

print(s)

Output:

bbb

Example3:

s=re.sub('a','b','aaa',count=2)

print(s)

Output:

bba

re.subn()
The re.subn() method is similar to re.sub() except that it returns a
tuple of 2 items containing the new string and the number of
substitutions made.

Example1:

s=re.subn('a','b','aaa')
print(s)

Output:

('bbb', 3)
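
The replacement in re.sub() does not have to be a fixed string. As a hedged sketch going slightly beyond the examples above, it can refer back to groups with \1, \2, or be a function that receives each match object; both are standard features of the re module.

```python
import re

# \1 and \2 in the replacement refer to the first and second group.
swapped = re.sub(r'(\w+) (\w+)', r'\2 \1', 'REVA UNIVERSITY')

# A function as replacement is called once per match.
doubled = re.sub(r'\d', lambda m: str(int(m.group()) * 2), 'a1b2c3')

# subn() also reports how many replacements were made.
cleaned, count = re.subn(r'\s+', ' ', 'Hi   REVA    UNIVERSITY')

print(swapped, doubled, cleaned, count, sep='\n')
```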

re.findall()
The re.findall() method returns a list of strings containing all
matches.

If the pattern is not found, re.findall() returns an empty list.

Syntax:

re.findall(pattern, string)

Example1:

s=re.findall('a','abab')

print(s)

Output:

['a', 'a']

re.split()
The re.split() method splits the string where there is a match and
returns a list of the resulting substrings.

If the pattern is not found, re.split() returns a list containing the
original string.

You can pass the maxsplit argument to the re.split() method. It's the
maximum number of splits that will occur.

By the way, the default value of maxsplit is 0, meaning all possible
splits.

Syntax:

re.split(pattern, string)

Example1:

s=re.split('a','abab')

print(s)

Output:

['', 'b', 'b']

Example2:

s=re.split('a','aababa',maxsplit=3)

print(s)

Output:

['', '', 'b', 'ba']
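
Unlike str.split(), re.split() can split on several different separators at once. A short sketch with made-up sample strings:

```python
import re

# Split on any run of commas, semicolons or whitespace.
fields = re.split(r'[,;\s]+', 'CSE, ECE;EEE  CIVIL')

# maxsplit limits the number of splits; the rest stays in one piece.
limited = re.split(r'\d', 'a1b2c3', maxsplit=1)

# No match: the original string comes back as a one-item list.
untouched = re.split(r'x', 'abab')

print(fields, limited, untouched, sep='\n')
```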

Python program to check that a string contains only a
certain set of characters (in this case a-z, A-Z and 0-9).

import re

# fullmatch() succeeds only if the whole string fits the pattern
pattern=r'[a-zA-Z0-9]+'

s1='shiva'
s2='sachin1'
s3='virat2'

a=re.fullmatch(pattern,s1)
b=re.fullmatch(pattern,s2)
c=re.fullmatch(pattern,s3)

print(a)
print(b)
print(c)

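Python program to verify the phone number using Regular
Expressions.

import re

# optional prefix 0 or 91, then ten digits starting with 6-9
pattern=r'(0|91)?[6-9][0-9]{9}'

p1='9731822325'

# fullmatch() rejects numbers with extra leading or trailing characters
a=re.fullmatch(pattern,p1)

if a:
    print("Search is successful")
else:
    print("Search is unsuccessful")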

Python program to find an email address using regular
expressions in Python (in this case john_123@gmail.com).

import re

# raw string so that \w and \. reach the RegEx engine unchanged
pattern=r'(\w)+@(\w)+\.(com)'

email='john_123@gmail.com'

s1=re.search(pattern,email)

if s1:
    print("Search is successful")
else:
    print("Unsuccessful")
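
To actually pull addresses out of a longer text, re.findall() can be used. This is a simplified sketch, not a complete email validator: the pattern only covers .com addresses as in the program above, and the sentence and the second address are made-up sample data. The pattern has no groups on purpose, because with groups findall() returns the group contents instead of the whole match.

```python
import re

# Simplified pattern: name characters, @, domain word, then .com
pattern = r'[\w.]+@\w+\.com'

text = 'Write to john_123@gmail.com or mary@example.com today.'

emails = re.findall(pattern, text)
print(emails)
```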

CASE STUDY
Street Addresses: In this case study, we will take one street address
as input and try to perform some operations on the input by making
use of library functions.

Example:

str1='100 NORTH MAIN ROAD'

str1.replace('ROAD','RD')

Output:

'100 NORTH MAIN RD'

str1.replace('NORTH','NRTH')

Output:

'100 NRTH MAIN ROAD'

re.sub('ROAD','RD',str1)

Output:
'100 NORTH MAIN RD'

re.sub('NORTH','NRTH',str1)

Output:

'100 NRTH MAIN ROAD'

re.split('A',str1)

Output:

['100 NORTH M', 'IN RO', 'D']

re.findall('O',str1)

Output:

['O', 'O']

re.sub('^1','2',str1)

Output:

'200 NORTH MAIN ROAD'

Roman Numerals

I=1

V=5
X = 10

L = 50

C = 100

D = 500

M = 1000

The value 4 is written in roman numerals as IV and 9 as IX.
Similarly, 40 is written as XL, 90 as XC, 400 as CD and 900 as CM.

Let us write the roman number representation for few numbers.

Ex1:

1940

MCMXL

Ex2:

1946

MCMXLVI

Ex3:

1888

MDCCCLXXXVIII

Checking for thousands:

1000=M

2000=MM

3000=MMM

The thousands part is therefore made of up to three M characters.

Example:

pattern = '^M?M?M?$'

re.search(pattern, 'M')

Output:

<re.Match object; span=(0, 1), match='M'>

re.search(pattern, 'MM')
Output:

<re.Match object; span=(0, 2), match='MM'>

re.search(pattern, 'MMM')

Output:

<re.Match object; span=(0, 3), match='MMM'>

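The following calls return None, because each string contains
something other than at most three M characters:

re.search(pattern, 'ML')

re.search(pattern, 'MX')

re.search(pattern, 'MI')

re.search(pattern, 'MMMM')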

Checking for Hundreds:

100=C

200=CC

300=CCC

400=CD

500=D

600=DC

700=DCC

800=DCCC

900=CM
Example:

pattern = '^M?M?M?(CM|CD|D?C?C?C?)$'

re.search(pattern,'MCM')

Output:

<re.Match object; span=(0, 3), match='MCM'>

re.search(pattern,'MD')

Output:

<re.Match object; span=(0, 2), match='MD'>

re.search(pattern,'MMMCCC')

Output:

<re.Match object; span=(0, 6), match='MMMCCC'>


re.search(pattern,'MCMLXX')

This call returns None, because the pattern does not yet handle the
tens (LXX).

Using the {n,m} syntax

The {n,m} syntax checks that the preceding pattern occurs at least
‘n’ times and at most ‘m’ times.

Example:

pattern='^M{0,3}$'

re.search(pattern,'MM')

Output:

<re.Match object; span=(0, 2), match='MM'>


re.search(pattern,'M')

Output:

<re.Match object; span=(0, 1), match='M'>

re.search(pattern,'MMM')

Output:

<re.Match object; span=(0, 3), match='MMM'>

Checking for Tens and Ones:

1=I

2=II

3=III

4=IV
5=V

6=VI

7=VII

8=VIII

9=IX

10=X

20=XX

30=XXX

40=XL

50=L

60=LX

70=LXX

80=LXXX
90=XC

Example:

pattern='^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'

re.search(pattern,'MDLVI')

Output:

<re.Match object; span=(0, 5), match='MDLVI'>

re.search(pattern,'MCMXLVI')

Output:

<re.Match object; span=(0, 7), match='MCMXLVI'>


re.search(pattern,'MMMCCCXLV')

Output:

<re.Match object; span=(0, 9), match='MMMCCCXLV'>
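
The two ideas from this case study, the place-by-place pattern and the {n,m} syntax, can be combined into one validator. This is a sketch under stated assumptions: the names ROMAN and is_roman are introduced only for this example, re.VERBOSE is used so the pattern can be commented, and the empty string is rejected explicitly because every part of the pattern is optional.

```python
import re

# re.VERBOSE ignores whitespace and allows comments inside the pattern.
ROMAN = re.compile(r'''
    ^M{0,3}              # thousands: 0 to 3 M
    (CM|CD|D?C{0,3})     # hundreds: 900, 400, or optional D plus 0 to 3 C
    (XC|XL|L?X{0,3})     # tens: 90, 40, or optional L plus 0 to 3 X
    (IX|IV|V?I{0,3})     # ones: 9, 4, or optional V plus 0 to 3 I
    $
''', re.VERBOSE)

def is_roman(s):
    """Hypothetical helper: True if s is a well-formed roman numeral (1 to 3999)."""
    return bool(s) and ROMAN.search(s) is not None

for numeral in ['MCMXLVI', 'MDCCCLXXXVIII', 'MMMM', 'IIII', '']:
    print(repr(numeral), is_roman(numeral))
```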
