2 - Python Strings
Parham Kazemi
University of Isfahan, Computer Eng. Faculty
ACM Student Chapter
pkazemi3@gmail.com
CREATING STRINGS
• Create strings by enclosing characters in quotes. Python treats single quotes the
same as double quotes; triple quotes allow a string to span several lines.
• Python has no separate character type; a single character is simply a string of length one.
>>> s3 = '''this is
... a string'''
>>> s3
'this is\na string'
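A short sketch of the two points above; the variable names are only illustrative.

```python
# Single and double quotes build equal strings.
single = 'spam'
double = "spam"
print(single == double)                    # True

# Using the other quote style avoids escaping an embedded quote.
quote = "it's here"
print(quote)                               # it's here

# Indexing a string yields another str of length one, not a separate char type.
first = single[0]
print(type(first).__name__, len(first))    # str 1
```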
SLICING AND INDEXING
• ACCESSING STRINGS
• To access substrings, use the square brackets for slicing along with the index or indices
to obtain your substring.
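The indexing and slicing rules can be sketched as follows; the sample string and variable names are chosen only for illustration.

```python
s = 'Hello World!'

first = s[0]          # 'H': indices start at 0
last = s[-1]          # '!': negative indices count from the end
head = s[0:5]         # 'Hello': start is included, stop is excluded
tail = s[6:]          # 'World!': an omitted stop means "to the end"
every_other = s[::2]  # 'HloWrd': the third value is the step
backwards = s[::-1]   # '!dlroW olleH': a step of -1 reverses the string

print(first, last, head, tail, every_other, backwards)
```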
STRING FORMATTING
• FORMATTING STRINGS
• Use the .format(args) method on a string containing {arg} placeholders:
>>> s = 'Hello, my name is {0} and I live in {1}.'
>>> s = s.format('Parham', 'Isfahan')
>>> print(s)
Hello, my name is Parham and I live in Isfahan.
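Beyond numbered placeholders, .format also accepts named and automatically numbered fields, and an optional format spec after a colon. The sketch below shows these variants; the field names name and city are illustrative.

```python
# Positional fields, as in the example above.
by_position = 'Hello, my name is {0} and I live in {1}.'.format('Parham', 'Isfahan')

# Named fields are filled from keyword arguments.
by_name = 'Hello, my name is {name} and I live in {city}.'.format(name='Parham', city='Isfahan')

# Empty braces are numbered automatically, left to right.
auto = '{} + {} = {}'.format(2, 2, 4)

# A format spec after the colon controls presentation, here two decimal places.
rounded = 'pi is roughly {:.2f}'.format(3.14159)

print(by_position == by_name)   # True
print(auto)                     # 2 + 2 = 4
print(rounded)                  # pi is roughly 3.14
```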
• Calling a string's .count(s) method returns the number of non-overlapping occurrences of s
in the string.
>>> 'Hello World!'.count('l')
3
• The .replace(old, new) method returns a new string with every occurrence of old
replaced by new; the original string is left unchanged:
>>> s = 'Boy, that escalated quickly'
>>> print(s.replace('Boy', 'Well'))
Well, that escalated quickly
>>> print(s)
Boy, that escalated quickly
USEFUL FUNCTIONS AND METHODS
• Use the .find(s) method to get the index at which s first occurs. (Returns -1 if not found.)
>>> s = 'To be or not to be'
>>> s.find('and')
-1
>>> s.find('be')
3
• The .strip(s) method returns a copy of the string with leading and trailing whitespace removed
(or, if s is given, with any leading and trailing characters that appear in s removed).
>>> s = '\t This is a string. \t\t'
>>> print(s.strip())
This is a string.
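Two details are easy to miss: .find accepts an optional start index, and the argument to .strip is treated as a set of characters rather than as a prefix or suffix. A small sketch with illustrative sample strings:

```python
s = 'To be or not to be'

# The optional second argument makes the search start at that index.
first_be = s.find('be')        # 3
second_be = s.find('be', 4)    # 16

# strip('-=') removes any leading/trailing '-' or '=' characters, in any order.
title = '--==title==--'.strip('-=')

# The argument is a set of characters, so this strips more than the literal 'www.'.
host = 'www.example.com'.strip('w.com')

print(first_be, second_be)     # 3 16
print(title)                   # title
print(host)                    # example
```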
• The .split(s) method splits the string at each occurrence of the delimiter s (at runs of
whitespace if s is not provided) and returns a list of substrings.
>>> s = 'Now this, is a string.'
>>> s.split()
['Now', 'this,', 'is', 'a', 'string.']
>>> s.split(',')
['Now this', ' is a string.']
• The method str.join(seq) returns the concatenation of the strings in the sequence seq,
with the string str (the one the method is called on) placed between consecutive elements.
>>> s = '-'
>>> l = ['a', 'b', 'c']
>>> s.join(l)
'a-b-c'
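Since .split and .join are inverses of a sort, they are often used together. The following sketch uses made-up sample data to show a round trip, joining non-string items, and collapsing repeated whitespace.

```python
line = 'alice,bob,carol'

# Split on a delimiter, then join with a different separator.
names = line.split(',')
rejoined = ' | '.join(names)

# join only accepts strings, so other values must be converted first.
numbers = [1, 2, 3]
csv = ','.join(str(n) for n in numbers)

# split() with no argument discards all runs of whitespace,
# so joining with a single space normalises the spacing.
messy = '  Now   this,\tis a   string.  '
clean = ' '.join(messy.split())

print(names)      # ['alice', 'bob', 'carol']
print(rejoined)   # alice | bob | carol
print(csv)        # 1,2,3
print(clean)      # Now this, is a string.
```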
CODEFORCES 746B - DECODING
Polycarp is mad about coding, that is why he writes Sveta encoded messages. He calls
the median letter in a word the letter which is in the middle of the word. If the word's length is
even, the median letter is the left of the two middle letters. For example, the median letter
of contest is t and the median letter of info is n. If the word consists of a single letter, then
according to the above definition this letter is the median letter.
Polycarp encodes each word in the following way: he writes down the median letter of the
word, then deletes it and repeats the process until there are no letters left. For example, he
encodes the word volga as logva.
You are given an encoding s of some word; your task is to decode it.
Input
The first line contains a positive integer n (1 ≤ n ≤ 2000) — the length of the encoded word.
The second line contains the string s of length n consisting of lowercase English letters — the
encoding.
Output
Print the word that Polycarp encoded.
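One possible approach, offered as a sketch rather than as the reference solution: undo the encoding by processing s from its last letter to its first and inserting each letter back at the median position of the word rebuilt so far. The function names encode and decode are illustrative; encode is included only to check the round trip.

```python
def encode(word):
    """Repeatedly remove the median letter (the left one for even lengths)."""
    letters = list(word)
    out = []
    while letters:
        out.append(letters.pop((len(letters) - 1) // 2))
    return ''.join(out)


def decode(s):
    """Rebuild the word by re-inserting the letters in reverse order."""
    letters = []
    for ch in reversed(s):
        # ch was the median of a word one letter longer than `letters`,
        # so it belongs at index len(letters) // 2.
        letters.insert(len(letters) // 2, ch)
    return ''.join(letters)


print(encode('volga'))   # logva
print(decode('logva'))   # volga
```

For an actual submission, read n and s from standard input (the value of n is not needed beyond consuming its line) and print decode(s). With n at most 2000, the quadratic cost of list.insert is not a concern.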
REGULAR EXPRESSIONS
• REGEX USES:
• File Renaming
• Text Search
• Web directives
• Database queries
LITERAL CHARACTERS
• The most basic regular expression consists of a single literal character, such as a.
• Twelve characters (metacharacters) have special meanings in regular expressions:
\ ^ $ . | ? * + ( ) { [
• If you want to use any of these characters as a literal in a regex, you need to escape
them with a backslash: 2\+2=4
• The dot matches any single character, except line break characters.
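The effect of escaping and of the dot can be checked with Python's re module. This is a minimal sketch; the sample strings are illustrative.

```python
import re

# Unescaped, + is a quantifier: the pattern means "one or more 2s, then 2=4".
unescaped = re.search(r'2+2=4', '2+2=4')

# Escaped, + is an ordinary character.
escaped = re.search(r'2\+2=4', 'We know 2+2=4.')

# re.escape builds an escaped pattern from arbitrary literal text.
literal = re.escape('2+2=4')
via_escape = re.fullmatch(literal, '2+2=4')

# The dot matches any single character except a line break.
dots = re.findall(r'a.c', 'abc a\nc a-c')

print(unescaped)             # None
print(escaped.group())       # 2+2=4
print(via_escape.group())    # 2+2=4
print(dots)                  # ['abc', 'a-c']
```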
CHARACTER CLASSES OR CHARACTER SETS
SHORTHAND CHARACTER CLASSES
ALTERNATION
QUANTIFIERS
• The question mark makes the preceding token in the regular expression optional.
• colou?r matches “colour” or “color”.
• The asterisk or star (*) tells the engine to attempt to match the preceding token zero
or more times.
• The plus sign (+) tells the engine to attempt to match the preceding token once or
more.
• Use curly braces to specify an exact number or a range of repetitions.
• 1{3} matches “111”
• 1{2,4} matches “11”, “111”, “1111”
• 1{5,} matches “11111”, “111111”, …
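The quantifiers above can be tried out with re.fullmatch, which succeeds only if the whole string matches the pattern. The helper matches is a hypothetical convenience function, not part of the re module.

```python
import re

def matches(pattern, text):
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# ? makes the preceding token optional.
print(matches(r'colou?r', 'color'), matches(r'colou?r', 'colour'))   # True True

# * allows zero or more repetitions, + requires at least one.
print(matches(r'ab*', 'a'), matches(r'ab+', 'a'))                    # True False

# {n}, {n,m} and {n,} give an exact count, a range, and a minimum.
print(matches(r'1{3}', '111'), matches(r'1{3}', '11'))               # True False
print(matches(r'1{2,4}', '1111'), matches(r'1{2,4}', '11111'))       # True False
print(matches(r'1{5,}', '1111111'), matches(r'1{5,}', '1111'))       # True False
```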
• The repetition operators or quantifiers are greedy. They expand the match as far as they
can, and only give back if they must to satisfy the remainder of the regex.
• The regex <.+> matches all of <EM>first</EM> in "This is a <EM>first</EM> test.", not just
the opening tag <EM>, because .+ consumes as many characters as it can.
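Greedy matching is easy to observe in Python. The sketch below also shows two common ways to match only the first tag, a lazy quantifier (+?) and a negated character class; neither appears in the slides and both are added here purely for contrast.

```python
import re

text = 'This is a <EM>first</EM> test.'

# Greedy: .+ runs to the end of the string and backs off only to the last '>'.
greedy = re.search(r'<.+>', text).group()

# Lazy (an addition, not from the slides): +? stops at the first '>' it can.
lazy = re.search(r'<.+?>', text).group()

# Negated class (also an addition): "anything but '>'" cannot run past the tag.
negated = re.search(r'<[^>]+>', text).group()

print(greedy)    # <EM>first</EM>
print(lazy)      # <EM>
print(negated)   # <EM>
```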
GROUPING
• RAW STRINGS
• To keep Python from interpreting backslashes before the regex engine sees them, write
patterns as raw strings: r'expression'.
>>> s1 = '\n'
>>> s2 = r'\n'
>>> s1, s2
('\n', '\\n')
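Why this matters for regular expressions can be seen with \b, which is a backspace character in a normal string literal but a word boundary for the regex engine. A minimal sketch with an illustrative sample string:

```python
import re

# In a normal literal '\b' is one character (backspace); in a raw literal it is two.
print(len('\b'), len(r'\b'))                   # 1 2

text = 'a cat sat'

# Without the r prefix the engine receives backspace characters, so nothing matches.
plain = re.search('\bcat\b', text)

# With the r prefix the engine receives \b, the word-boundary token.
raw = re.search(r'\bcat\b', text)

print(plain)          # None
print(raw.group())    # cat
```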
• THE re MODULE
>>> import re
• Python offers two different primitive operations based on regular expressions:
• re.match checks for a match only at the beginning of the string.
• re.search checks for a match anywhere in the string.
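The difference between the two operations, sketched on an illustrative sample string:

```python
import re

text = 'To be or not to be'

# re.match only tries position 0, so 'be' is not found even though it occurs later.
at_start = re.match(r'be', text)

# re.search scans forward and reports the first occurrence.
anywhere = re.search(r'be', text)

# re.match succeeds when the pattern does fit at the very beginning.
prefix = re.match(r'To', text)

print(at_start)           # None
print(anywhere.span())    # (3, 5)
print(prefix.group())     # To
```

Both functions return None on failure and a match object on success, so the result should be checked before calling methods such as .group() on it.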
REGULAR EXPRESSIONS IN PYTHON
OPTION FLAGS
Modifier    Description
re.I        Performs case-insensitive matching.
re.L        Interprets words according to the current locale.
re.M        Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string).
re.S        Makes a period (dot) match any character, including a newline.
re.U        Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B.
re.X        Permits "cuter" regular expression syntax. It ignores whitespace (except inside a set [] or when escaped by a backslash) and treats unescaped # as a comment marker.
• USAGE
>>> re.findall(pattern, string, re.M | re.S)
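The following sketch shows what some of these flags change in practice. The sample text and patterns are illustrative.

```python
import re

text = 'first line\nSecond LINE\nthird line'

# re.I: letters match regardless of case.
any_case = re.findall(r'line', text, re.I)

# re.M: ^ matches at the start of every line, not only at the start of the string.
without_m = re.findall(r'^\w+', text)
with_m = re.findall(r'^\w+', text, re.M)

# re.S: the dot also matches newlines, so a match can span several lines.
without_s = re.search(r'first.+third', text)
with_s = re.search(r'first.+third', text, re.S)

# re.X: whitespace and # comments inside the pattern are ignored.
date = re.compile(r'''
    (\d{4})   # year
    -
    (\d{2})   # month
''', re.X)
year_month = date.search('released 2017-03').groups()

# Flags are combined with |.
combined = re.findall(r'^second.*$', text, re.I | re.M)

print(any_case)      # ['line', 'LINE', 'line']
print(without_m)     # ['first']
print(with_m)        # ['first', 'Second', 'third']
print(without_s)     # None
print(year_month)    # ('2017', '03')
print(combined)      # ['Second LINE']
```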
PYTHON REGEX EXAMPLE