Python Module-41
Module 4
Prepared by,
Rini Kurian
Assistant Professor, MCA
AJCE
● Regular expressions: Introduction, match()
function, search() function, search and
replace, regular expression modifiers, regular
expression patterns, Character classes,
special character classes, repetition cases,
findall() method, compile() method.
A regular expression (RegEx) is a special sequence of characters that defines a search pattern for finding a string or
set of strings.
It can detect the presence or absence of a text by matching it against a particular pattern, and it can also split a pattern
into one or more sub-patterns.
Its primary function is to offer a search, where it takes a regular expression and a string. Here, it either returns
the first match or else None.
Metacharacters Description
Regular expression literals may include an optional modifier to control various aspects of
matching.
The modifiers are specified as an optional flag.
You can provide multiple modifiers by combining them with bitwise OR (|).
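As a minimal sketch of how flags are combined with the | operator, the snippet below uses re.I (ignore case) together with re.M (multiline). The sample text is made up for illustration.

```python
import re

# Hypothetical sample text spanning three lines.
text = "first line\nPYTHON is fun\npython is easy"

# re.I ignores case; re.M lets ^ match at the start of every line.
# The two flags are combined with the | operator.
matches = re.findall(r"^python", text, re.I | re.M)
print(matches)  # ['PYTHON', 'python']

# Without the flags, ^ only matches at the very start of the string.
no_flags = re.findall(r"^python", text)
print(no_flags)  # []
```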
Regular Expression Patterns
Except for control characters, (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can escape a
control character by preceding it with a backslash.
1 ^
Matches beginning of line.
2 $
Matches end of line.
3 .
Matches any single character except newline. Using the s option (re.DOTALL) allows it to match newline as well.
4 [...]
Matches any single character in brackets.
5 [^...]
Matches any single character not in brackets
6 re*
Matches 0 or more occurrences of preceding expression.
7 re+
Matches 1 or more occurrence of preceding expression.
Sr.No. Pattern & Description
8 re?
Matches 0 or 1 occurrence of preceding expression.
9 re{n}
Matches exactly n number of occurrences of preceding expression.
10 re{n,}
Matches n or more occurrences of preceding expression.
11 re{n,m}
Matches at least n and at most m occurrences of preceding expression.
12 a|b
Matches either a or b.
13 (re)
Groups regular expressions and remembers matched text.
14 (?imx)
Temporarily toggles on i, m, or x options within a regular expression. If in parentheses, only that area is
affected.
15 (?-imx)
Temporarily toggles off i, m, or x options within a regular expression. If in parentheses, only that area is affected.
16 (?:re)
Groups regular expressions without remembering matched text.
17 (?imx:re)
Temporarily toggles on i, m, or x options within parentheses.
18 (?-imx:re)
Temporarily toggles off i, m, or x options within parentheses.
19 (?#...)
Comment.
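20 (?=re)
Positive lookahead: specifies a position where re must follow. It does not consume any characters (has no range).
21 (?!re)
Negative lookahead: specifies a position where re must not follow. It does not consume any characters (has no range).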
22 (?>re)
Matches independent pattern without backtracking.
23 \w
Matches word characters.
24 \W
Matches nonword characters.
25 \s
Matches whitespace. Equivalent to [ \t\n\r\f\v].
26 \S
Matches nonwhitespace.
27 \d
Matches digits. Equivalent to [0-9].
28 \D
Matches nondigits.
29 \A
Matches beginning of string.
30 \Z
Matches end of string. In Python, it matches only at the very end of the string, not before a trailing newline.
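31 \z
Matches end of string (Perl-style notation; older Python versions reject \z, so use \Z in Python).
32 \G
Matches point where last match finished (Perl-style notation; Python's re module does not support \G).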
33 \b
Matches word boundaries when outside brackets. Matches backspace (0x08) when inside
brackets.
34 \B
Matches nonword boundaries.
36 \1...\9
Matches nth grouped subexpression.
37 \10
Matches nth grouped subexpression if it matched already. Otherwise refers to the octal
representation of a character code.
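A few of the patterns from the table can be tried out directly. The sketch below illustrates grouping with a backreference (\1) and a lookahead (?=re); the sample strings are made up for illustration.

```python
import re

# (re) groups and remembers the matched text; \1 refers back to group 1.
# This finds a word that is immediately repeated.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
print(doubled.group(0))  # is is
print(doubled.group(1))  # is

# (?=re) is a lookahead: it checks what follows without consuming it,
# so only the numbers followed by " dollars" are returned.
prices = re.findall(r"\d+(?= dollars)", "10 dollars, 20 euros, 30 dollars")
print(prices)  # ['10', '30']
```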
Character class
The regex engine matches only one out of several characters in the character class or character set.
The order of the characters inside a character class or set does not matter. The results are identical.
Similarly for uppercase and lowercase letters we have the character class [A-Za-z]
Sr.No. Example & Description
1 [Pp]ython
Match "Python" or "python"
2 rub[ye]
Match "ruby" or "rube"
3 [aeiou]
Match any one lowercase vowel
4 [0-9]
Match any digit; same as [0123456789]
5 [a-z]
Match any lowercase ASCII letter
6 [A-Z]
Match any uppercase ASCII letter
7 [a-zA-Z0-9]
Match any of the above
8 [^aeiou]
Match anything other than a lowercase vowel
9 [^0-9]
Match anything other than a digit
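The character classes above can be checked with re.findall(). This is a small sketch; the sample strings are assumptions chosen for illustration.

```python
import re

# Either capital or lowercase P.
pythons = re.findall(r"[Pp]ython", "Python and python")
print(pythons)  # ['Python', 'python']

# The last letter must be y or e, so "rubs" does not match.
rubies = re.findall(r"rub[ye]", "ruby rube rubs")
print(rubies)  # ['ruby', 'rube']

# Any one lowercase vowel.
vowels = re.findall(r"[aeiou]", "regex")
print(vowels)  # ['e', 'e']

# ^ inside the brackets negates the class: anything that is not a digit.
non_digits = re.findall(r"[^0-9]", "a1b2")
print(non_digits)  # ['a', 'b']
```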
Special character classes
1 .
Match any character except newline
2 \d
Match a digit: [0-9]
3 \D
Match a nondigit: [^0-9]
4 \s
Match a whitespace character: [ \t\r\n\f]
5 \S
Match nonwhitespace: [^ \t\r\n\f]
6 \w
Match a single word character: [A-Za-z0-9_]
7 \W
Match a nonword character: [^A-Za-z0-9_]
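A short sketch of the special character classes follows; the sample string is made up. Note that for ordinary (str) patterns in Python 3, \d, \w and \s also match non-ASCII Unicode characters; the bracket equivalents listed above hold exactly when the re.ASCII flag is used.

```python
import re

# Hypothetical sample string ending in a newline.
sample = "Room 42, Floor_3\n"

digits = re.findall(r"\d", sample)
print(digits)  # ['4', '2', '3']

# \w includes letters, digits and the underscore.
words = re.findall(r"\w+", sample)
print(words)  # ['Room', '42', 'Floor_3']

spaces = re.findall(r"\s", sample)
print(spaces)  # [' ', ' ', '\n']

nonword = re.findall(r"\W", sample)
print(nonword)  # [' ', ',', ' ', '\n']
```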
Repetition Cases
1 ruby?
Match "rub" or "ruby": the y is optional
2 ruby*
Match "rub" plus 0 or more ys
3 ruby+
Match "rub" plus 1 or more ys
4 \d{3}
Match exactly 3 digits
5 \d{3,}
Match 3 or more digits
6 \d{3,5}
Match 3, 4, or 5 digits
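The repetition cases above can be compared side by side. The following is a minimal sketch with made-up input strings.

```python
import re

words = "rub ruby rubyy"

optional_y = re.findall(r"ruby?", words)   # y is optional (0 or 1)
print(optional_y)  # ['rub', 'ruby', 'ruby']

zero_or_more = re.findall(r"ruby*", words)  # 0 or more ys
print(zero_or_more)  # ['rub', 'ruby', 'rubyy']

one_or_more = re.findall(r"ruby+", words)   # at least one y
print(one_or_more)  # ['ruby', 'rubyy']

exactly_3 = re.findall(r"\d{3}", "12 123 12345")
print(exactly_3)  # ['123', '123']

at_least_3 = re.findall(r"\d{3,}", "12 123 12345")
print(at_least_3)  # ['123', '12345']

# Quantifiers are greedy: up to 5 digits are taken, the leftover "67" is too short.
three_to_five = re.findall(r"\d{3,5}", "12 123 1234567")
print(three_to_five)  # ['123', '12345']
```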
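Regex Functions
The match Function
This function attempts to match the RE pattern at the beginning of string, with optional flags.
Syntax: re.match(pattern, string, flags=0)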
pattern
This is the regular expression to be matched.
string
This is the string, which would be searched to match the pattern at the
beginning of string.
flags
You can specify different flags using bitwise OR (|).
The re.match function returns a match object on success, None on failure.
group(num=0)
This method returns entire match (or specific subgroup num)
groups()
This method returns all matching subgroups in a tuple (empty
if there weren't any)
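A minimal sketch of re.match() with group() and groups() is shown below. The sample sentence and pattern are assumptions made for illustration.

```python
import re

line = "Cats are smarter than dogs"

# Two groups: the word before "are" and the word after it.
m = re.match(r"(\w+) are (\w+)", line, re.I)

if m:
    print(m.group())   # Cats are smarter  (entire match)
    print(m.group(1))  # Cats
    print(m.group(2))  # smarter
    print(m.groups())  # ('Cats', 'smarter')
else:
    print("No match")
```

Because re.match() returns None on failure, checking the result before calling group() avoids an AttributeError.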
The search Function
This function searches for the first occurrence of the RE pattern within string, with optional flags.
The re.search function returns a match object on success, None on failure.
Python offers two different primitive operations based on regular expressions: match checks for a match only at
the beginning of the string, while search checks for a match anywhere in the string.
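The difference between the two operations can be seen in a small sketch, using a made-up sentence in which the word being looked for is not at the start.

```python
import re

line = "Cats are smarter than dogs"

# match() only looks at the beginning of the string, so this fails.
at_start = re.match(r"dogs", line)
print(at_start)  # None

# search() scans the whole string and finds the word at the end.
anywhere = re.search(r"dogs", line)
print(anywhere.group())  # dogs
print(anywhere.span())   # (22, 26)
```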
Search and Replace(sub)
One of the most important re methods that use regular expressions is sub.
This method replaces all occurrences of the RE pattern in string with repl, unless count is provided, in which case
at most count occurrences are replaced. It returns the modified string.
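A minimal sketch of re.sub() follows; the phone-number string is a made-up example. In Python the optional limit on the number of replacements is passed as count.

```python
import re

phone = "2004-959-559 # This is a phone number"

# Remove the trailing comment (everything from # to the end).
no_comment = re.sub(r"#.*$", "", phone)
print(repr(no_comment))  # '2004-959-559 '

# Remove everything that is not a digit.
digits_only = re.sub(r"\D", "", no_comment)
print(digits_only)  # 2004959559

# count=1 limits the substitution to the first occurrence only.
first_only = re.sub(r"-", "", phone, count=1)
print(first_only)  # 2004959-559 # This is a phone number
```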
The findall Method
The re.findall() function returns a list of strings containing all matches of the specified pattern.
The function takes as input the following:
a character pattern
the string from which to search
Example:
The following example will return a list of all the instances of the substring "at" in the given string:
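A minimal sketch of such a call is given here; the sample sentence is an assumption made for illustration.

```python
import re

# Hypothetical sample sentence containing "at" four times.
text = "The cat sat on the mat at noon"

found = re.findall(r"at", text)
print(found)       # ['at', 'at', 'at', 'at']
print(len(found))  # 4
```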
The split() function
The re.split() function splits the string at every occurrence of the sub-string and returns a
list of strings which have been split.
Example
Suppose we wish to split a string wherever there is an occurrence of a
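A minimal sketch of re.split() is shown below, assuming the separator is the letter a; the input strings are made up for illustration.

```python
import re

# Split at every occurrence of the letter "a".
parts = re.split(r"a", "banana split")
print(parts)  # ['b', 'n', 'n', ' split']

# maxsplit limits the number of splits that are performed.
limited = re.split(r"a", "banana split", maxsplit=1)
print(limited)  # ['b', 'nana split']

# The separator can be any pattern, e.g. a comma or semicolon plus spaces.
fields = re.split(r"[,;]\s*", "a, b;c")
print(fields)  # ['a', 'b', 'c']
```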
The compile() method
We can compile a regular expression pattern into a pattern object, which can then be used for pattern
matching. This also lets us search with the same pattern again without rewriting it.
Syntax:
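The call has the form re.compile(pattern, flags=0). A minimal sketch of compiling a pattern once and reusing it follows; the sample strings are assumptions.

```python
import re

# Compile the pattern once; the resulting object has the usual methods.
number = re.compile(r"\d+")

all_numbers = number.findall("Order 12 shipped 3 items on day 45")
print(all_numbers)  # ['12', '3', '45']

nothing = number.search("no digits here")
print(nothing)  # None

masked = number.sub("#", "room 101")
print(masked)  # room #
```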
A database is a collection of organized information that can easily be used, managed, and updated.
Databases are classified according to their organizational approach.
Programming in Python is considerably simpler and more efficient compared to other languages,
and so is database programming in Python.
Python database code is portable, and the program is also portable, so both offer an advantage in
terms of portability.
Python supports SQL cursors
It also supports Relational Database systems
The API of Python for the database is compatible with other databases also
It is platform-independent
Database Programming in Python
The Python programming language has powerful features for database programming. Python supports
various databases like MySQL, Oracle, Sybase, PostgreSQL, etc.
Python also supports Data Definition Language (DDL), Data Manipulation Language (DML), and Data
Query Statements.
For database programming, the Python DB API is a widely used specification (PEP 249) that defines a standard
database application programming interface.
DB-API (SQL-API) for Python
Python provides the DB-API, which is independent of any database engine and enables you to write Python scripts to
access any database engine. The Python DB-API implementations for different databases are as follows –
You must download a separate DB API module for each database you need to access.
For example, if you need to access an Oracle database as well as a MySQL database, you must download both the
Oracle and the MySQL database modules.
The DB API provides a minimal standard for working with databases using Python structures and syntax
wherever possible. This API includes the following −
• Connection Objects
• Cursor Objects
• Standard Exceptions
• Some Other Module Contents.
Connection Objects
• Connection objects in DB-API of Python create a connection with the database which is further used for
different transactions.
• These connection objects are also used as representatives of the database session.
Cursor Objects
These objects represent a database cursor, which is used to manage the context of a fetch operation. Cursors
created from the same connection are not isolated, i.e., any changes made to the database by one cursor are
immediately visible to the other cursors. Cursors created from different connections may or may not be
isolated, depending on how the transaction support is implemented.
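The relationship between a connection and its cursors can be sketched as follows. This is only an illustration: it uses the standard library's sqlite3 module (which also follows the DB-API) as a stand-in so that it runs without a database server, and the employee table is hypothetical.

```python
import sqlite3

# Assumption: sqlite3 with an in-memory database stands in for a real server.
conn = sqlite3.connect(":memory:")

# Two cursors created from the same connection.
cur_a = conn.cursor()
cur_b = conn.cursor()

# Hypothetical table used only for this illustration.
cur_a.execute("CREATE TABLE employee (name TEXT, age INTEGER)")
cur_a.execute("INSERT INTO employee VALUES ('Mac', 20)")

# Nothing has been committed yet, but cur_b shares the connection,
# so it already sees the row inserted through cur_a.
cur_b.execute("SELECT name, age FROM employee")
rows = cur_b.fetchall()
print(rows)  # [('Mac', 20)]

conn.close()
```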
What is MySQLdb?
MySQLdb is an interface for connecting to a MySQL database server from Python
Before proceeding, make sure you have MySQLdb installed on your machine. Just type the following in
your Python script and execute it −
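A minimal sketch of such a check is given below. It wraps the import in try/except so the script reports the result instead of stopping; the variable name mysqldb_available is an assumption made for illustration.

```python
# If the import fails, MySQLdb is not installed on this machine.
try:
    import MySQLdb  # noqa: F401
    mysqldb_available = True
except ImportError:
    mysqldb_available = False

if mysqldb_available:
    print("MySQLdb is installed")
else:
    print("MySQLdb is not installed")
```

For Python 3, the MySQLdb module is typically provided by the mysqlclient package.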
Once a database connection is established, we are ready to create tables or insert records into the database
tables using the execute method of the created cursor.
INSERT Operation
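A minimal sketch of an INSERT through a cursor is shown below. As an assumption, it uses the standard library's sqlite3 module in place of MySQLdb so that it runs without a server; the employee table and its columns are hypothetical. sqlite3 uses ? as the parameter placeholder, whereas MySQLdb uses %s.

```python
import sqlite3

# Assumption: in-memory sqlite3 database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table for the illustration.
cursor.execute(
    "CREATE TABLE employee (first_name TEXT, last_name TEXT, age INTEGER, income REAL)"
)

try:
    # Values are passed separately so the driver can quote them safely.
    cursor.execute(
        "INSERT INTO employee (first_name, last_name, age, income) VALUES (?, ?, ?, ?)",
        ("Mac", "Mohan", 20, 2000.0),
    )
    conn.commit()    # make the insert permanent
except sqlite3.Error:
    conn.rollback()  # undo the insert if anything went wrong

inserted = cursor.rowcount
print(inserted)  # 1

total = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(total)  # 1

conn.close()
```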
READ Operation
A READ operation on any database means fetching some useful information from the database.
Once the database connection is established, you are ready to make a query into this database.
You can use either the fetchone() method to fetch a single record or the fetchall() method to fetch multiple records from a
database table.
fetchone() − It fetches the next row of a query result set. A result set is an object that is returned when a cursor
object is used to query a table.
fetchall() − It fetches all the rows in a result set. If some rows have already been extracted from the result set,
then it retrieves the remaining rows from the result set.
rowcount − This is a read-only attribute and returns the number of rows that were affected by an execute()
method.
Example
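A minimal sketch of fetchone(), fetchall() and rowcount follows. As an assumption, it uses the standard library's sqlite3 module as a stand-in for MySQLdb; the employee table and its rows are made up.

```python
import sqlite3

# Assumption: in-memory sqlite3 database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table and data.
cursor.execute("CREATE TABLE employee (name TEXT, income INTEGER)")
cursor.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Mac", 2000), ("Asha", 5000), ("Ravi", 7000)],
)

# rowcount reports how many rows the last execute affected.
inserted = cursor.rowcount
print(inserted)  # 3

cursor.execute(
    "SELECT name, income FROM employee WHERE income > ? ORDER BY income",
    (1000,),
)

# fetchone() returns the next row of the result set.
first = cursor.fetchone()
print(first)  # ('Mac', 2000)

# fetchall() returns whatever rows remain after the earlier fetch.
rest = cursor.fetchall()
print(rest)  # [('Asha', 5000), ('Ravi', 7000)]

conn.close()
```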
Transactions are a mechanism that ensures data consistency. Transactions have the following four
properties −
• Atomicity − Either a transaction completes in full or nothing happens at all.
• Consistency − A transaction must start in a consistent state and leave the system in a consistent state.
• Isolation − Intermediate results of a transaction are not visible outside the current transaction.
• Durability − Once a transaction has been committed, the effects are persistent, even after a system failure.
The Python DB API 2.0 provides two methods to either commit or rollback a transaction.
COMMIT Operation
Commit is the operation that gives a green signal to the database to finalize the
changes; after this operation, no change can be reverted.
ROLLBACK Operation
If you are not satisfied with one or more of the changes and you want to revert those changes
completely, then use the rollback() method.
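The two methods can be sketched as follows. As an assumption, the standard library's sqlite3 module is used as a stand-in for MySQLdb; the account table is hypothetical.

```python
import sqlite3

# Assumption: in-memory sqlite3 database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table and starting balance.
cursor.execute("CREATE TABLE account (owner TEXT, balance INTEGER)")
cursor.execute("INSERT INTO account VALUES ('Mac', 100)")
conn.commit()  # the insert is now permanent

# An unwanted change: withdraw more than the balance.
cursor.execute("UPDATE account SET balance = balance - 500 WHERE owner = 'Mac'")
conn.rollback()  # revert everything since the last commit

balance = cursor.execute("SELECT balance FROM account").fetchone()[0]
print(balance)  # 100

conn.close()
```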
Disconnecting Database
If the connection to a database is closed by the user with the close() method, any outstanding transactions are
rolled back by the DB. However, instead of depending on any of the DB's lower-level implementation details, your
application would be better off calling commit or rollback explicitly.
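The effect of closing a connection with an uncommitted change can be sketched as below. As an assumption, the standard library's sqlite3 module and a temporary database file stand in for a MySQL server; the notes table is hypothetical.

```python
import os
import sqlite3
import tempfile

# Assumption: a temporary sqlite3 database file stands in for a real server.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes VALUES ('saved')")
conn.commit()  # explicitly committed, so this row survives

conn.execute("INSERT INTO notes VALUES ('never committed')")
conn.close()   # closed without commit: the pending insert is discarded

# Reconnect and check what was actually stored.
conn = sqlite3.connect(path)
rows = conn.execute("SELECT body FROM notes").fetchall()
conn.close()
print(rows)  # [('saved',)]
```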
Handling Errors
There are many sources of errors. A few examples are a syntax error in an executed SQL statement, a connection
failure, or calling the fetch method for an already canceled or finished statement handle.
The DB API defines a number of errors that must exist in each database module. Your Python scripts should handle
these errors, but before using any of the exceptions listed below, make sure your MySQLdb has support for that
exception.
Sr.No. Exception & Description
1 Warning
Used for non-fatal issues. Must subclass Exception (StandardError in Python 2).
2 Error
Base class for errors. Must subclass Exception (StandardError in Python 2).
3 InterfaceError
Used for errors in the database module, not the database itself. Must subclass Error.
4 DatabaseError
Used for errors in the database. Must subclass Error.
5 DataError
Subclass of DatabaseError that refers to errors in the data.
6 OperationalError
Subclass of DatabaseError that refers to errors such as the loss of a connection to the database.
These errors are generally outside of the control of the Python scripter.
7 IntegrityError
Subclass of DatabaseError for situations that would damage the relational integrity, such as
uniqueness constraints or foreign keys.
8 InternalError
Subclass of DatabaseError that refers to errors internal to the database module, such as a
cursor no longer being active.
9 ProgrammingError
Subclass of DatabaseError that refers to errors such as a bad table name and other things that
can safely be blamed on you.
10 NotSupportedError
Subclass of DatabaseError that refers to trying to call unsupported functionality.
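Catching some of these exceptions can be sketched as follows. As an assumption, the standard library's sqlite3 module is used as a stand-in, since it defines the same DB-API exception names; the employee table and the list named caught are hypothetical.

```python
import sqlite3

# Assumption: in-memory sqlite3 database as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Hypothetical table with a primary key, so duplicates are rejected.
cursor.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
cursor.execute("INSERT INTO employee VALUES (1, 'Mac')")

caught = []

try:
    # Same primary key again: violates a uniqueness constraint.
    cursor.execute("INSERT INTO employee VALUES (1, 'Duplicate')")
except sqlite3.IntegrityError:
    caught.append("IntegrityError")

try:
    # In sqlite3, querying a missing table raises OperationalError.
    cursor.execute("SELECT * FROM no_such_table")
except sqlite3.OperationalError:
    caught.append("OperationalError")

print(caught)  # ['IntegrityError', 'OperationalError']

# The hierarchy matches the table: both derive from DatabaseError, then Error.
print(issubclass(sqlite3.IntegrityError, sqlite3.DatabaseError))  # True
print(issubclass(sqlite3.DatabaseError, sqlite3.Error))           # True

conn.close()
```

Which exception a given failure maps to can differ between database modules, so catching the common base class Error is a reasonable fallback.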
MySQLdb has MySQLdb.Error exception, a top