Lect None On Chapter 2 - Part II - String and StringBuilder

Uploaded by

beshahashenafi32
Copyright
© © All Rights Reserved
Char, String, StringBuilder, and Regular Expressions
Overview
• The techniques in this section can be employed to
develop:
– text editors, word processors, page-layout software,
computerized typesetting systems and other kinds of text
processing software.

• Focus on the capabilities of:
– class String and type char - in the System namespace,
– class StringBuilder - in the System.Text namespace, and
– classes Regex and Match – in the System.Text.RegularExpressions namespace.
• Characters are the fundamental building blocks of C#
source code. They include:
– Normal characters and character constants (or character codes),
e.g. 122 corresponds to 'z' and 10 corresponds to '\n'.
– Character codes are established according to the Unicode character set.
• String is a series of characters treated as a single unit.
– Uppercase, lowercase letters, digits, and special characters (+, -, *, /,
$) and others.
– A string is an object of class String in the System
namespace.
– string literals (string constants) - sequences of characters in
double quotation marks. E.g. "Hello world"
• A string can also contain multiple backslash characters
(e.g. in the name of a file).
• The @ character can be used to turn off escape sequences
and interpret all the characters in a string literally (a verbatim string literal).
• E.g.
– string file = "C:\\MyFolder\\MySubFolder\\MyFile.txt";
• It can be altered to
– string file = @"C:\MyFolder\MySubFolder\MyFile.txt";
• C# provides the string keyword as an alias for class
String
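The two path literals above can be checked against each other with a minimal sketch. The class and method names (VerbatimStringDemo, EscapedPath, VerbatimPath) are illustrative and not from the slides.

```csharp
using System;

public static class VerbatimStringDemo
{
    // Regular literal: every backslash has to be escaped as \\.
    public static string EscapedPath() =>
        "C:\\MyFolder\\MySubFolder\\MyFile.txt";

    // Verbatim literal: the @ prefix disables escape-sequence processing.
    public static string VerbatimPath() =>
        @"C:\MyFolder\MySubFolder\MyFile.txt";

    public static void Main()
    {
        Console.WriteLine(EscapedPath());
        Console.WriteLine(VerbatimPath());
        // Both literals denote the same sequence of characters.
        Console.WriteLine(EscapedPath() == VerbatimPath()); // True
    }
}
```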
String constructors
• Class String provides eight constructors for
initializing strings in various ways.
• Line 25 - assigns to string3 a new string, using the
String constructor that takes a char array and two
int arguments.
– 2nd argument - specifies the starting index position.
– 3rd argument - specifies the number of characters
(count) to be copied.

• Line 26 - assigns to string4 a new string, using the
String constructor that takes as arguments a
character and an int specifying the number of
times to repeat that character in the string.
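The program that "Line 25" and "Line 26" refer to is not included in these notes. The sketch below is a stand-in under that assumption: it exercises the same two constructor forms, and the helper names (FromRange, Repeat) and the sample array are made up for illustration.

```csharp
using System;

public static class StringConstructorsDemo
{
    // Copies `count` characters of `source`, starting at `startIndex`.
    public static string FromRange(char[] source, int startIndex, int count) =>
        new string(source, startIndex, count);

    // Builds a string consisting of `c` repeated `count` times.
    public static string Repeat(char c, int count) => new string(c, count);

    public static void Main()
    {
        char[] characterArray = { 'b', 'i', 'r', 't', 'h', ' ', 'd', 'a', 'y' };
        string string2 = new string(characterArray);       // whole array: "birth day"
        string string3 = FromRange(characterArray, 6, 3);  // "day"
        string string4 = Repeat('C', 5);                   // "CCCCC"
        Console.WriteLine($"{string2} | {string3} | {string4}");
    }
}
```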
String Indexer, Length Property and CopyTo Method
• String indexer - facilitates the retrieval of any
character in the string, and
• String property Length - returns the length of
the string.
• String method CopyTo() - copies a specified
number of characters from a string into a char
array.
• The program determines - length of string, reverses order of
characters in the string, and copies a series of characters from
the string into a character array.
• Line 27 uses the Length property to determine the number of characters in
string string1. Strings always know their own size.
• Lines 33–34 append to output the characters of the string string1 in
reverse order. The string indexer treats a string as an array of chars:
it receives an integer argument as the position number and
returns the character at that position. As with arrays, the first element of a
string is considered to be at position 0.
• Line 37 uses CopyTo method to copy the characters of a string
(string1) into a character array (characterArray).
– 1st argument is the index from which the method begins copying
characters in the string, and
– 2nd argument is the character array into which the characters are
copied.
– 3rd argument is the index specifying the location at which the method
places the copied characters in the character array.
– The last argument is the number of characters that the method will
copy from the string.
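Since the demonstrated program itself is missing here, the following is a small sketch of the same three features: Length, the indexer and CopyTo with its four arguments. The helper names (Reverse, CopyPrefix) and the sample text are assumptions.

```csharp
using System;

public static class StringIndexerDemo
{
    // Walks the string backward with the indexer to build the reversed text.
    public static string Reverse(string text)
    {
        var reversed = new char[text.Length];
        for (int i = 0; i < text.Length; ++i)
        {
            reversed[i] = text[text.Length - 1 - i];
        }
        return new string(reversed);
    }

    // Copies the first `count` characters into a new char array.
    public static char[] CopyPrefix(string text, int count)
    {
        var destination = new char[count];
        // source index, destination array, destination index, number of characters
        text.CopyTo(0, destination, 0, count);
        return destination;
    }

    public static void Main()
    {
        string string1 = "hello there";
        Console.WriteLine(string1.Length);                     // 11
        Console.WriteLine(Reverse(string1));                   // ereht olleh
        Console.WriteLine(new string(CopyPrefix(string1, 5))); // hello
    }
}
```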
Comparing strings
• Computers can order characters alphabetically
– because the characters are represented internally
as Unicode numeric codes.
• String comparison - simply compares the
numeric codes of the characters in the strings.
• .NET provides several ways to compare strings.
– These are method - Equals(), CompareTo(), and
equality operator (==).
• Method Equals() - (inherited from class Object and overridden by String) - tests two
strings for equality (i.e., checks whether they contain identical contents).
– The method returns either true or false.
• Method Equals uses a lexicographical comparison: it compares the numeric
Unicode values that represent the characters in each string.
• Line 27 uses method Equals() to compare string1 and the literal string
"hello".
• Comparisons are case sensitive. Look at the following test
for string equality between string3 and string4 (line 39).
• Here, static method Equals (as opposed to the instance
method used on the previous slide) is used to compare the values
of two strings.
• Line 33 uses equality operator (==) to compare string string1 with the
literal string "hello" for equality. This also uses a lexicographical
comparison to compare two strings.
– Thus, the condition in the if structure evaluates to true, because the values of string1
and "hello" are equal.
• To compare the references of two strings, we must explicitly cast the
strings to type object and use the equality operator (==).
• Lines 46–54 use the String method CompareTo to compare strings.
– Method CompareTo returns 0 if the strings are equal, a negative value (typically -1) if the string
that invokes CompareTo is less than the string that is passed as an argument and
– a positive value (typically 1) if the string that invokes CompareTo is greater than the string
that is passed as an argument.
• Method CompareTo uses a lexicographical comparison. In .NET this comparison follows the
sort rules of the current culture, so test the sign of the result rather than the exact value.
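The comparison techniques described above can be tried side by side in a short sketch. The helper names are illustrative, and OrdinalSign uses string.CompareOrdinal (a raw character-code comparison) instead of CompareTo so that its result does not depend on the machine's culture settings.

```csharp
using System;

public static class StringComparisonDemo
{
    // Value comparison: true when both strings hold the same characters.
    public static bool SameContents(string a, string b) => a == b;

    // Reference comparison: cast to object so == compares references.
    public static bool SameReference(string a, string b) => (object)a == (object)b;

    // Normalizes an ordinal comparison to -1, 0 or 1.
    public static int OrdinalSign(string a, string b) =>
        Math.Sign(string.CompareOrdinal(a, b));

    public static void Main()
    {
        string string1 = "hello";
        // Built at run time, so it is a distinct object with the same contents.
        string string2 = new string(new[] { 'h', 'e', 'l', 'l', 'o' });

        Console.WriteLine(string1.Equals("hello"));         // True
        Console.WriteLine(string.Equals(string1, "Hello")); // False: case sensitive
        Console.WriteLine(SameContents(string1, string2));  // True
        Console.WriteLine(SameReference(string1, string2)); // False
        Console.WriteLine(OrdinalSign("hello", "help"));    // -1
        Console.WriteLine(string1.CompareTo(string2));      // 0
    }
}
```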
Method StartsWith() and EndsWith()
• C# also provides ways to test whether a string
instance begins or ends with a given string.
• Method StartsWith determines whether a
string instance starts with the string text
passed to it as an argument.
• Method EndsWith determines whether a
string instance ends with the string text
passed to it as an argument.
• See the demonstration on the next slide.
• Line 21 uses method StartsWith, which takes a string argument. The
condition in the if structure determines whether the string at index i of
the array starts with the characters "st". If so, the method returns true,
and strings[i] is appended to string output for display purposes.
• Line 30 uses method EndsWith, which also takes a string argument. The
condition in the if structure determines whether the string at index i of
the array ends with the characters "ed". If so, the method returns true,
and strings[i] is appended to string output for display purposes.
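A sketch of the filtering described above, assuming a four-element sample array. The helper names (WithPrefix, WithSuffix) are hypothetical, and StringComparison.Ordinal is passed explicitly so the result does not depend on culture settings.

```csharp
using System;
using System.Collections.Generic;

public static class StartsEndsDemo
{
    // Returns the items that begin with `prefix`.
    public static List<string> WithPrefix(string[] items, string prefix)
    {
        var result = new List<string>();
        foreach (var item in items)
        {
            if (item.StartsWith(prefix, StringComparison.Ordinal))
            {
                result.Add(item);
            }
        }
        return result;
    }

    // Returns the items that end with `suffix`.
    public static List<string> WithSuffix(string[] items, string suffix)
    {
        var result = new List<string>();
        foreach (var item in items)
        {
            if (item.EndsWith(suffix, StringComparison.Ordinal))
            {
                result.Add(item);
            }
        }
        return result;
    }

    public static void Main()
    {
        string[] strings = { "started", "starting", "ended", "ending" };
        Console.WriteLine(string.Join(", ", WithPrefix(strings, "st"))); // started, starting
        Console.WriteLine(string.Join(", ", WithSuffix(strings, "ed"))); // started, ended
    }
}
```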

• Reading assignment on String Method GetHashCode (pp 11)


Locating Characters and Substrings in Strings
• In many applications, it is necessary to search
for a character or set of characters in a string.
• The application in the next slide demonstrates
some of the many versions of String methods:
– IndexOf, IndexOfAny, LastIndexOf and
LastIndexOfAny, which search for a specified
character or substring in a string.
• Lines 20, 23 and 26 use method IndexOf to locate the first occurrence of a
character or substring in a string.
– If IndexOf finds a character, IndexOf returns the index of the specified character in the
string;
– otherwise, IndexOf returns –1.
• The expression on line 23 uses a version of method IndexOf that takes two
arguments—the character to search for and the starting index at which the search
of the string should begin.
– The method does not examine any characters that occur prior to the starting index (in
this case 1).
• The expression in line 26 uses another version of method IndexOf that takes
three arguments—the character to search for, the index at which to start searching
and the number of characters to search.
• Lines 30, 33 and 36 use method LastIndexOf to locate the last occurrence of a
character in a string.
• Method LastIndexOf performs the search from the end of the string toward the
beginning of the string.
– If method LastIndexOf finds the character, LastIndexOf returns the index of the specified
character in the string; otherwise, LastIndexOf returns –1.
• There are three versions of LastIndexOf .
• Line 30 uses LastIndexOf that takes as an argument the character for which to
search.
• Line 33 uses LastIndexOf that takes two arguments—the character for which to
search and the highest index from which to begin searching backward for the
character.
• Line 36 uses a third version of method LastIndexOf that takes three arguments— the
character for which to search, the starting index from which to start searching
backward and the number of characters (the portion of the string) to search.
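The overloads described above can be exercised with a short sketch. The sample string and the AllIndexesOf helper are assumptions added for illustration; the helper shows a common use of the two-argument IndexOf, restarting the search just past each hit.

```csharp
using System;
using System.Collections.Generic;

public static class IndexOfDemo
{
    // Collects every position of `target` in `text`.
    public static List<int> AllIndexesOf(string text, char target)
    {
        var positions = new List<int>();
        int index = text.IndexOf(target);
        while (index != -1)
        {
            positions.Add(index);
            index = text.IndexOf(target, index + 1); // resume after the last hit
        }
        return positions;
    }

    public static void Main()
    {
        string letters = "abcdefghijklmabcdefghijklm";
        Console.WriteLine(letters.IndexOf('c'));            // 2
        Console.WriteLine(letters.IndexOf('a', 1));         // 13
        Console.WriteLine(letters.IndexOf('$', 3, 5));      // -1
        Console.WriteLine(letters.LastIndexOf('c'));        // 15
        Console.WriteLine(letters.LastIndexOf('a', 5));     // 0
        Console.WriteLine(letters.LastIndexOf('$', 15, 5)); // -1
        Console.WriteLine(letters.IndexOf("def", StringComparison.Ordinal)); // 3
        Console.WriteLine(string.Join(", ", AllIndexesOf(letters, 'c')));    // 2, 15
    }
}
```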
Extracting Substrings from Strings
• Line 19 uses the Substring method that takes one int argument. The
argument specifies the starting index from which the method copies
characters in the original string.
– The substring returned contains a copy of the characters from the starting index to the
end of the string.

• Line 23 takes two int arguments. The first argument specifies the starting
index from which the method copies characters from the original string.
The second argument specifies the length of the substring to be copied.
The substring returned contains a copy of the specified characters from
the original string.
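As a minimal sketch of the two Substring forms (the sample string and the helper names FromIndex and Slice are illustrative, not taken from the slides):

```csharp
using System;

public static class SubstringDemo
{
    // One-argument form: everything from `startIndex` to the end.
    public static string FromIndex(string text, int startIndex) =>
        text.Substring(startIndex);

    // Two-argument form: `length` characters starting at `startIndex`.
    public static string Slice(string text, int startIndex, int length) =>
        text.Substring(startIndex, length);

    public static void Main()
    {
        string letters = "abcdefghijklmabcdefghijklm";
        Console.WriteLine(FromIndex(letters, 20)); // hijklm
        Console.WriteLine(Slice(letters, 0, 6));   // abcdef
        Console.WriteLine(letters.Length);         // 26: the original is unchanged
    }
}
```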
Concatenating string
• .NET provides many ways to concatenate strings.
• The + operator:
– E.g. string name = "muna"; name += "abay";
• The static method Concat of class String concatenates two
strings and returns a new string containing the combined
characters from both original strings.
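Both approaches can be compared in a short sketch. The helper names (WithPlus, WithConcat) are hypothetical; the point to observe is that neither approach alters the original strings.

```csharp
using System;

public static class ConcatDemo
{
    // Concatenation with the + operator.
    public static string WithPlus(string a, string b) => a + b;

    // Concatenation with the static String.Concat method.
    public static string WithConcat(string a, string b) => string.Concat(a, b);

    public static void Main()
    {
        string string1 = "Happy ";
        string string2 = "Birthday";
        Console.WriteLine(WithPlus(string1, string2));   // Happy Birthday
        Console.WriteLine(WithConcat(string1, string2)); // Happy Birthday
        Console.WriteLine(string1 + "|" + string2);      // Happy |Birthday (originals intact)

        string name = "muna";
        name += "abay";          // name now refers to a new string
        Console.WriteLine(name); // munaabay
    }
}
```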
Miscellaneous String Methods
• Class String provides several methods that return modified copies of strings.
• The following demonstrates the use of String methods:
– Replace(), ToLower(), ToUpper(), Trim() and ToString().
• Line 27 uses String method Replace() to return a new string, replacing every
occurrence in string1 of character 'e' with character 'E'.
• Method Replace takes two arguments: a character or string for which to search and
another character or string with which to replace all matching occurrences of the first
argument. The original string remains unchanged. If there are no occurrences
of the first argument in the string, the method returns the original string.
• String method ToUpper generates a new string (line 31) that replaces any
lowercase letters in string1 with their uppercase equivalent.
• The method returns a new string containing the converted string; the original
string remains unchanged. If there are no characters to convert to uppercase, the
method returns the original string.
• Line 32 uses String method ToLower to return a new string in which any uppercase
letters in string1 are replaced by their lowercase equivalents. The original string is
unchanged. As with ToUpper, if there are no characters to convert to lowercase,
method ToLower returns the original string.
• Line 36 uses String method Trim to remove all whitespace characters that
appear at the beginning and end of a string. Without otherwise altering
the original string, the method returns a new string that contains the
string, but omits leading or trailing whitespace characters. Another version
of method Trim takes a character array and returns a string with any of the
characters in the array argument removed from its beginning and end.
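The methods above can be tried on a few sample values. This is a sketch only: the sample strings and the helper names (CapitalizeE, TrimStars) are assumptions, since the original program is not part of these notes.

```csharp
using System;

public static class MiscStringDemo
{
    // Returns a copy with every 'e' replaced by 'E'.
    public static string CapitalizeE(string text) => text.Replace('e', 'E');

    // Removes '*' characters from both ends only; inner ones are kept.
    public static string TrimStars(string text) => text.Trim(new[] { '*' });

    public static void Main()
    {
        string string1 = "cheers!";
        string string2 = "GOOD BYE ";
        string string3 = "   spaces   ";

        Console.WriteLine(CapitalizeE(string1));       // chEErs!
        Console.WriteLine(string1.ToUpper());          // CHEERS!
        Console.WriteLine(string2.ToLower());          // good bye
        Console.WriteLine($"\"{string3.Trim()}\"");    // "spaces"
        Console.WriteLine(TrimStars("**a*b**"));       // a*b
        Console.WriteLine(string1);                    // cheers! (original unchanged)
    }
}
```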
Class StringBuilder – namespace System.Text

• String class has many capabilities for processing strings.
• However, a string’s contents can never change: strings are immutable.
– E.g. concatenation of strings (+=) creates a new string and assigns its reference to the
variable.
• class StringBuilder - used to create and manipulate dynamic string information,
i.e., mutable (changeable) strings.
• Every StringBuilder can store a certain number of characters that’s specified by
its capacity. Exceeding the capacity of a StringBuilder causes the capacity to
expand to accommodate the additional characters.
– E.g. concatenation methods such as Append and AppendFormat modify the
StringBuilder’s contents without creating any new string objects.
• StringBuilder is particularly useful for manipulating in place a large number of
strings, as it’s much more efficient than creating individual immutable strings.
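The contrast between immutable strings and a mutable StringBuilder can be sketched as follows. The BuildList helper and the sample values are illustrative assumptions.

```csharp
using System;
using System.Text;

public static class StringBuilderIntroDemo
{
    // Builds "0, 1, 2, ..." in one StringBuilder instead of
    // creating a new string on every loop iteration.
    public static string BuildList(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; ++i)
        {
            if (i > 0)
            {
                builder.Append(", ");
            }
            builder.Append(i);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        string original = "abc";
        string alias = original; // second reference to the same string
        original += "def";       // creates a new string; alias is unaffected
        Console.WriteLine(alias);    // abc
        Console.WriteLine(original); // abcdef

        Console.WriteLine(BuildList(4)); // 0, 1, 2, 3
    }
}
```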
StringBuilder Constructors

• Class StringBuilder provides six overloaded constructors.
– E.g.
– var buffer1 = new StringBuilder(); // default initial capacity
– var buffer2 = new StringBuilder(10); // initial capacity specified as an int
– var buffer3 = new StringBuilder("hello"); // initialized with string content
• Output of:
– Console.WriteLine($"buffer1 = \"{buffer1}\""); // buffer1 = ""
Length and Capacity Properties, EnsureCapacity Method and Indexer of Class StringBuilder

• Property - Length and Capacity
– Length - returns the number of characters currently in a StringBuilder,
and
– Capacity – returns the number of characters that a StringBuilder can
store without allocating more memory.
• Both properties can also be set, to increase or decrease the length or the
capacity of the StringBuilder.
• Method - EnsureCapacity
– allows you to reduce the number of times that a StringBuilder’s capacity
must be increased.
• The method ensures that the StringBuilder’s capacity is at least
the specified value.
var buffer = new StringBuilder("Hello, how are you?");
// use Length and Capacity properties
Console.WriteLine($"buffer = {buffer}" + $"\nLength = {buffer.Length}" + $"\nCapacity = {buffer.Capacity}");
buffer.EnsureCapacity(75); // ensure a capacity of at least 75
Console.WriteLine($"\nNew capacity = {buffer.Capacity}");
// truncate StringBuilder by setting Length property
buffer.Length = 10; // 10 is an illustrative value; the original statement is missing from the slide text
Console.Write($"New length = {buffer.Length}\n\nbuffer = ");
// use StringBuilder indexer
for (int i = 0; i < buffer.Length; ++i)
{
    Console.Write(buffer[i]);
}
Console.WriteLine();
Append and AppendFormat Methods of Class StringBuilder

• Class StringBuilder provides overloaded Append
methods that allow various types of values to be
added to the end of a StringBuilder.
• The Framework Class Library provides versions for
each simple type and for character arrays, strings
and objects. (Remember that method ToString
produces a string representation of any object.)
• Each method takes an argument, converts it to a
string and appends it to the StringBuilder.
object objectValue = "hello";
var stringValue = "good bye";
char[] characterArray = {'a', 'b', 'c', 'd', 'e', 'f'};
var booleanValue = true;
var characterValue = 'Z';
var integerValue = 7;
var longValue = 1000000L; // L suffix indicates a long literal
var floatValue = 2.5F; // F suffix indicates a float literal
var doubleValue = 33.333;
var buffer = new StringBuilder();
// use method Append to append values to buffer
buffer.Append(objectValue); buffer.Append(" ");
buffer.Append(stringValue); buffer.Append(" ");
buffer.Append(characterArray); buffer.Append(" ");
buffer.Append(characterArray, 0, 3); buffer.Append(" ");
buffer.Append(booleanValue); buffer.Append(" ");
buffer.Append(characterValue); buffer.Append(" ");
buffer.Append(integerValue); buffer.Append(" ");
buffer.Append(longValue); buffer.Append(" ");
buffer.Append(floatValue); buffer.Append(" ");
buffer.Append(doubleValue);
Console.WriteLine($"buffer = {buffer.ToString()}");
• Class StringBuilder also provides method
AppendFormat, which converts a string to a specified
format, then appends it to the StringBuilder.

– var buffer = new StringBuilder();
– // formatted string
– var string1 = "This {0} costs: {1:C}";
– // string1 argument array
– var objectArray = new object[2] {"car", 1234.56};
– // append to buffer formatted string with argument
– buffer.AppendFormat(string1, objectArray);
– Console.WriteLine(buffer.ToString());
Insert, Remove and Replace Methods of Class StringBuilder
• Class StringBuilder provides overloaded Insert methods
– to allow various types of data to be inserted at any position in a StringBuilder.
• The class provides versions for each simple type and for character arrays, strings and
objects.
• Each method takes its second argument, converts it to a string and inserts the string into
the StringBuilder in front of the character in the position specified by the first argument.
• The index specified by the first argument must be greater than or equal to 0 and less
than or equal to the StringBuilder’s length; otherwise, the program throws an
ArgumentOutOfRangeException.
• Class StringBuilder also provides method Remove for deleting any portion of a
StringBuilder.
• Method Remove takes two arguments—the index at which to begin deletion and the
number of characters to delete.
• The sum of the starting index and the number of characters to be deleted must never be
greater than the StringBuilder’s length; otherwise, the program throws an
ArgumentOutOfRangeException.
The Insert and Remove methods are demonstrated
object objectValue = "hello";
var stringValue = "good bye";
char[] characterArray = {'a', 'b', 'c', 'd', 'e', 'f'};
var booleanValue = true;
var characterValue = 'K';
var integerValue = 7;
var longValue = 1000000L; // L suffix indicates a long literal
var floatValue = 2.5F; // F suffix indicates a float literal
var doubleValue = 33.333;
var buffer = new StringBuilder();
// insert each value at the front of the buffer, then a two-space separator
// (two spaces are required for the Remove indices below to hit the characters named in the comments)
buffer.Insert(0, objectValue); buffer.Insert(0, "  ");
buffer.Insert(0, stringValue); buffer.Insert(0, "  ");
buffer.Insert(0, characterArray); buffer.Insert(0, "  ");
buffer.Insert(0, booleanValue); buffer.Insert(0, "  ");
buffer.Insert(0, characterValue); buffer.Insert(0, "  ");
buffer.Insert(0, integerValue); buffer.Insert(0, "  ");
buffer.Insert(0, longValue); buffer.Insert(0, "  ");
buffer.Insert(0, floatValue); buffer.Insert(0, "  ");
buffer.Insert(0, doubleValue); buffer.Insert(0, "  ");
Console.WriteLine($"buffer after Inserts: \n{buffer}\n");
buffer.Remove(10, 1); // delete 2 in 2.5
buffer.Remove(4, 4); // delete .333 in 33.333
Console.WriteLine($"buffer after Removes:\n{buffer}");
• Another useful method included with StringBuilder is
Replace, which searches for a specified string or character
and substitutes another string or character for all occurrences.
– var builder1 = new StringBuilder("Happy Birthday Jane");
– var builder2 = new StringBuilder("goodbye greg");
– Console.WriteLine($"Before replacements:\n{builder1}\n{builder2}");
– builder1.Replace("Jane", "Greg");
– builder2.Replace('g', 'G', 0, 5); // replace g with G only within the 5 characters starting at index 0
– Console.WriteLine($"\nAfter replacements:\n{builder1}\n{builder2}");
Char Methods
• All struct types derive from class ValueType, which derives
from object. Also, all struct types are implicitly sealed.
• In the struct System.Char—which is the struct for characters
and represented by C# keyword char—most methods are
static, take at least one character argument and perform
either a test or a manipulation on the character.
• We present several of these in the next example. Figure
16.15 demonstrates static methods that test characters to
determine whether they’re of a specific character type and
static methods that perform case conversions on characters.
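The figure referred to above is not reproduced in these notes. As a stand-in, here is a minimal sketch of the kind of static Char tests and case conversions it describes; the helper names (SwapCase, Classify) and the sample characters are assumptions.

```csharp
using System;

public static class CharMethodsDemo
{
    // Flips the case of a letter and leaves every other character untouched.
    public static char SwapCase(char c)
    {
        if (char.IsUpper(c)) return char.ToLower(c);
        if (char.IsLower(c)) return char.ToUpper(c);
        return c;
    }

    // Names the first category that the character belongs to.
    public static string Classify(char c)
    {
        if (char.IsDigit(c)) return "digit";
        if (char.IsLetter(c)) return "letter";
        if (char.IsWhiteSpace(c)) return "whitespace";
        if (char.IsPunctuation(c)) return "punctuation";
        if (char.IsSymbol(c)) return "symbol";
        return "other";
    }

    public static void Main()
    {
        foreach (char c in "aZ7 ;+")
        {
            Console.WriteLine($"'{c}': {Classify(c)}, swapped case: '{SwapCase(c)}'");
        }
        Console.WriteLine((int)'z'); // 122, the character code of 'z'
    }
}
```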
Regular Expressions and Class Regex
• Regular expressions are specially formatted strings:
– Used to find patterns in text, and
– During information validation (checking that data is in a particular format).
– E.g. the first three symbols of a student ID must be letters.
– Last name must start with a capital letter.
• Application of regular expressions – to facilitate the
construction of a compiler.
– Large and complex regular expressions – used to validate the syntax of a
program.
• In .NET, classes to recognize and manipulate regular expressions
are found in the System.Text.RegularExpressions namespace.
• Class Regex – represents an immutable regular
expression.
– Contains static methods - such as
• Match() that returns an object of class Match
(represents a single regular expression match).
• Matches() finds all matches of a regular expression in an
arbitrary string and returns a MatchCollection object
(set of Matches).
• Replace()
• Split()
• The table on the next slide lists some character classes that can be used with
regular expressions.
• A character class is an escape sequence that represents a group of
characters.
– A word character is any alphanumeric character or underscore.
– A whitespace character is a space, a tab, a carriage return, a newline or a form
feed.
– A digit is any numeric character.
• Regular expressions are not limited to these character classes, however.
• The expressions employ various operators and other forms of notation
to search for complex patterns.
• We discuss several of these techniques in the context of the next
example.
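The three character classes described above are written \w (word character), \s (whitespace) and \d (digit) in .NET patterns. The sketch below counts their matches in a sample sentence; the CountMatches helper and the sample text are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterClassDemo
{
    // Counts how many times `pattern` matches inside `text`.
    public static int CountMatches(string text, string pattern) =>
        Regex.Matches(text, pattern).Count;

    public static void Main()
    {
        string text = "Room 42, floor 7";
        Console.WriteLine(CountMatches(text, @"\d"));  // 3 single digits: 4, 2, 7
        Console.WriteLine(CountMatches(text, @"\d+")); // 2 runs of digits: 42, 7
        Console.WriteLine(CountMatches(text, @"\w+")); // 4 words: Room, 42, floor, 7
        Console.WriteLine(CountMatches(text, @"\s"));  // 3 whitespace characters
    }
}
```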
• The regular expression in line 19 (see also below) searches for
a string that starts with the letter "J", followed by any number
of characters, followed by a two-digit number (of which the
second digit cannot be 4), followed by a dash, another two-
digit number, a dash and another two-digit number.
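The pattern itself did not survive in these notes, so the sketch below is a reconstruction from the description only and may differ from the expression on the original slide. The class name, the Pattern constant and the sample sentences are assumptions; "a digit other than 4" is written here as the explicit class [0-35-9].

```csharp
using System;
using System.Text.RegularExpressions;

public static class BirthdayPatternDemo
{
    // 'J', any run of characters, a digit, a digit other than 4,
    // then a dash, two digits, a dash and two more digits.
    public const string Pattern = @"J.*\d[0-35-9]-\d\d-\d\d";

    public static bool IsMatch(string text) => Regex.IsMatch(text, Pattern);

    public static void Main()
    {
        string[] samples =
        {
            "Jane's Birthday is 05-12-75",
            "Dave's Birthday is 11-04-68",
            "Joe's Birthday is 12-17-77",
            "Jim's Birthday is 04-12-75"
        };

        foreach (var sample in samples)
        {
            Console.WriteLine($"{sample}: {IsMatch(sample)}");
        }
    }
}
```

Only the first and third samples match: the second does not start with "J", and in the fourth the second digit of the first number is 4.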
Method Replace() and Split() of Regex
• Regex class provides static and instance versions of methods
Replace and Split.
– Replace() – is useful to replace parts of a string with another, and
– Split() – is useful to split a string according to a regular expression.
Replace() method
• Method Replace replaces text in a string with new text wherever the
original string matches a regular expression. It has two versions: a static
method and an instance method.
• Static version of Replace()
• Takes three parameters—the string to modify, the string containing the
regular expression to match and the replacement string.
• Replace replaces every instance of "*" in testString1 with "^".
– Notice the regular expression (@"\*") precedes character * with a backslash, \.
• Normally, * is a quantifier indicating that a regular expression should
match any number of occurrences of a preceding pattern.
• Using Replace() instance method that uses the regular
expression passed to the constructor for testRegex1 to
perform the replacement operation. In this case, every match
for the regular expression "stars" in testString1 is replaced
with "carets".
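The static and instance forms described above can be sketched as follows. The value given to testString1 is an assumption (the slides do not show it), and the helper names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexReplaceDemo
{
    // Static version: input string, pattern, replacement.
    // The backslash makes * a literal character instead of a quantifier.
    public static string StarsToCarets(string input) =>
        Regex.Replace(input, @"\*", "^");

    // Instance version: the pattern is fixed when the Regex is constructed.
    public static string WordToCarets(string input)
    {
        var testRegex1 = new Regex("stars");
        return testRegex1.Replace(input, "carets");
    }

    public static void Main()
    {
        string testString1 = "This sentence ends in 5 stars *****";
        Console.WriteLine(StarsToCarets(testString1)); // This sentence ends in 5 stars ^^^^^
        Console.WriteLine(WordToCarets(testString1));  // This sentence ends in 5 carets *****
        Console.WriteLine(testString1);                // the original string is unchanged
    }
}
```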
• Use of instance method Replace() to
Split() method of Regex
• Method Split divides a string into several substrings. The original string is
broken in any location that matches a specified regular expression.
• Method Split returns an array containing the substrings between matches
for the regular expression.
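A minimal sketch of the static Split method, assuming a sample input of my own choosing and a pattern that matches a comma followed by optional whitespace; the helper name SplitOnCommas is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    // Breaks the input wherever a comma plus any following whitespace is found.
    public static string[] SplitOnCommas(string input) =>
        Regex.Split(input, @",\s*");

    public static void Main()
    {
        string testString2 = "1, 2, 3, 4, 5, 6, 7, 8";
        string[] parts = SplitOnCommas(testString2);
        Console.WriteLine(parts.Length);            // 8
        Console.WriteLine(string.Join("|", parts)); // 1|2|3|4|5|6|7|8
    }
}
```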
• We use the static version of method Split to separate a string of comma-
separated
