
Check for Almost Similar Strings in Python
Strings in Python are sequences of characters used to represent textual data, enclosed in quotes. Checking for almost similar strings involves comparing and measuring their similarity or dissimilarity, enabling tasks like spell checking and approximate string matching using techniques such as Levenshtein distance or fuzzy matching algorithms.
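In this article, we will learn how to write a Python program that checks whether two strings are almost similar, meaning that no character's frequency differs between the two strings by more than a given value k.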
Demonstration
Assume we have taken two input strings and a value k.
Input
Input string 1: aazmdaa
Input string 2: aqqaccd
k: 2
Output
Checking whether both strings are similar: True
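In this example, 'a' occurs 4 times in string 1 and 2 times in string 2. The difference, 4 - 2 = 2, does not exceed k. The same holds for every other character (z, m, q, and c differ by at most 2, and d occurs once in each string), hence the result is True.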
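The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name frequency_gaps is hypothetical and is not part of the programs shown later in this article.

```python
def frequency_gaps(first, second):
    """Return {char: absolute frequency difference} for every char in either string."""
    chars = set(first) | set(second)
    return {c: abs(first.count(c) - second.count(c)) for c in sorted(chars)}

gaps = frequency_gaps("aazmdaa", "aqqaccd")
print(gaps)                     # {'a': 2, 'c': 2, 'd': 0, 'm': 1, 'q': 2, 'z': 1}
print(max(gaps.values()) <= 2)  # True, so the strings are almost similar for k = 2
```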
Methods Used
The following are the various methods to accomplish this task:
Using for loop, ascii_lowercase, dictionary comprehension, and abs() functions
Using Counter() and max() functions
Using for loop, ascii_lowercase, dictionary comprehension, and abs() functions
In this method, we are going to learn how to use a simple for loop, ascii_lowercase, dictionary comprehension, and the abs() function to check for almost similar strings.
Dictionary Comprehension Syntax
{key_expression: value_expression for item in iterable}
Dictionary comprehension is a compact and concise method in Python to create dictionaries by iterating over an iterable and defining key-value pairs based on expressions, allowing for efficient and readable code.
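As a small illustration of the syntax, the sketch below builds the kind of dictionary this method relies on: one key per lowercase letter, each starting at 0. The variable name frequency is chosen here for illustration.

```python
from string import ascii_lowercase

# Every lowercase letter becomes a key with an initial count of 0.
frequency = {letter: 0 for letter in ascii_lowercase}

print(len(frequency))   # 26
print(frequency["a"])   # 0
```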
abs() Function Syntax
abs(number)
The abs() function in Python returns the absolute value of a number, which is the numerical value without considering its sign. It is useful for obtaining the magnitude or distance from zero of a given number.
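For this task, abs() matters because the order of subtraction should not affect the comparison. A short sketch using the counts of 'a' from the demonstration (the variable names are illustrative only):

```python
count_in_first = 4    # 'a' in "aazmdaa"
count_in_second = 2   # 'a' in "aqqaccd"

# The difference is 2 whichever string is subtracted from the other.
print(abs(count_in_first - count_in_second))  # 2
print(abs(count_in_second - count_in_first))  # 2
```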
Algorithm (Steps)
The following steps perform the desired task:
Use the import keyword to import ascii_lowercase from the string module.
Create a function findFrequency() that returns the frequency of characters of string by accepting input string as an argument
Take a dictionary and fill it with all lowercase alphabets as keys and values as 0.
Use the for loop to traverse through the input string.
Increment the frequency of the current character by 1.
Return the frequency of characters.
Create a variable to store the input string 1.
Create another variable to store the input string 2.
Print both the input strings.
Create another variable to store the input k value
Call the above findFrequency() function, passing input string 1 as an argument, to get the frequency of its characters.
Similarly, get the frequency of characters of input string 2.
Initialize the result as True.
Use the for loop to traverse through the lowercase alphabets.
Use the if conditional statement with the abs() function (returns the absolute value of a number) to check whether the absolute difference between the frequencies of the current character in both strings is greater than k.
Update the result as False if the condition is true.
Break the loop.
Print the result.
Example
The following program checks whether the given strings are almost similar using a for loop, ascii_lowercase, dictionary comprehension, and the abs() function
# importing ascii_lowercase from the string module
from string import ascii_lowercase

# creating a function that returns the frequency of characters
# of a string by accepting the input string as an argument
def findFrequency(inputString):
    # Take a dictionary and fill it with all lowercase alphabets as keys
    # with values as 0
    frequency = {c: 0 for c in ascii_lowercase}
    # Traversing the given string
    for c in inputString:
        # Incrementing the character frequency by 1
        frequency[c] += 1
    # returning the frequency of characters
    return frequency

# input string 1
inputString_1 = 'aazmdaa'
# input string 2
inputString_2 = "aqqaccd"
# printing the input strings
print("Input string 1: ", inputString_1)
print("Input string 2: ", inputString_2)
# input K value
K = 2
# getting the frequency of characters of input string 1
# by calling the above findFrequency() function
stringFrequency1 = findFrequency(inputString_1)
# getting the frequency of characters of input string 2
stringFrequency2 = findFrequency(inputString_2)
# Initializing the result as True
result = True
# traversing through all the lowercase characters
for c in ascii_lowercase:
    # checking whether the absolute difference of the frequency
    # of the current character in both strings is greater than K
    if abs(stringFrequency1[c] - stringFrequency2[c]) > K:
        # updating the result to False if the condition is true
        result = False
        # break the loop
        break
# printing the result
print("Checking whether both strings are similar: ", result)
Output
On executing, the above program will generate the following output
Input string 1:  aazmdaa
Input string 2:  aqqaccd
Checking whether both strings are similar:  True
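Note that findFrequency() only holds keys for the 26 lowercase letters, so a character such as 'A' or a space raises a KeyError. The sketch below is one possible variant, not the article's method: the names find_frequency_safe and within_k are hypothetical, and skipping characters outside a to z is an assumption that may not suit every use case.

```python
from string import ascii_lowercase

def find_frequency_safe(text):
    """Count lowercase letters only; other characters are skipped (an assumption)."""
    frequency = {c: 0 for c in ascii_lowercase}
    for c in text.lower():
        if c in frequency:
            frequency[c] += 1
    return frequency

def within_k(first, second, k):
    """True if no letter's frequency differs by more than k between the strings."""
    f1 = find_frequency_safe(first)
    f2 = find_frequency_safe(second)
    return all(abs(f1[c] - f2[c]) <= k for c in ascii_lowercase)

print(within_k("Aaz mdaa", "aqqaccd", 2))  # True
print(within_k("Aaz mdaa", "aqqaccd", 1))  # False
```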
Using Counter() and max() functions
In this method, we are going to use a combination of Counter and the max() function to check whether two strings are almost similar.
Counter(): a dict subclass in the collections module that counts hashable objects. When called with an iterable, it builds a mapping from each distinct element to the number of times it occurs.
counter_object = Counter(iterable)
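A short sketch of how Counter behaves on the demonstration strings may help here. The variable names are illustrative. The point to notice is that subtracting one Counter from another keeps only positive counts, which is why a comparison has to look at both directions.

```python
from collections import Counter

first = Counter("aazmdaa")
second = Counter("aqqaccd")

print(first["a"], second["a"])  # 4 2
print(first["x"])               # 0, a missing key does not raise KeyError

# Subtraction drops zero and negative results.
surplus_first = first - second    # characters more frequent in the first string
surplus_second = second - first   # characters more frequent in the second string
print(sorted(surplus_first.items()))   # [('a', 2), ('m', 1), ('z', 1)]
print(sorted(surplus_second.items()))  # [('c', 2), ('q', 2)]
```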
Algorithm (Steps)
The following steps perform the desired task:
Use the import keyword to import the Counter class from the collections module.
Create variables to store the two input strings and the input k value.
Use the lower() function(converts all uppercase characters in a string to lowercase characters) to convert the input string 1 into lowercase then use the Counter() function to get the frequency of characters of input string 1.
In the same way, get the frequency of characters of input string 2 by converting it into lowercase first.
Initialize the result as True.
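Use the if conditional statement to check whether the largest frequency difference in either direction is greater than k. Subtracting one Counter from another keeps only the positive differences, so the check is made in both directions.
The max() function (returns the highest-valued item in an iterable) picks the largest difference in each direction.
Update the result as False if the condition is true.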
Print the result.
Example
The following program checks whether the given strings are almost similar using the Counter() and max() functions
# importing Counter from the collections module
from collections import Counter

# input string 1
inputString_1 = 'aazmdaa'
# input string 2
inputString_2 = "aqqaccd"
# printing the input strings
print("Input string 1: ", inputString_1)
print("Input string 2: ", inputString_2)
# input K value
K = 2
# converting the input string 1 into lowercase and then
# getting the frequency of characters of input string 1
strFrequency_1 = Counter(inputString_1.lower())
# converting the input string 2 into lowercase and then
# getting the frequency of characters of input string 2
strFrequency_2 = Counter(inputString_2.lower())
# Initializing the result as True
result = True
# Checking whether the strings are similar or not
# default=0 avoids a ValueError when a difference is empty
# (for example, when both strings are identical)
if (max((strFrequency_1 - strFrequency_2).values(), default=0) > K
        or max((strFrequency_2 - strFrequency_1).values(), default=0) > K):
    # updating the result to False if the condition is true
    result = False
# printing the result
print("Checking whether both strings are similar: ", result)
Output
On executing, the above program will generate the following output
Input string 1:  aazmdaa
Input string 2:  aqqaccd
Checking whether both strings are similar:  True
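One edge case is worth keeping in mind with this approach: if subtracting one Counter from the other leaves nothing (for instance, when the strings are identical), calling max() on the empty result raises a ValueError unless a default is supplied. The sketch below wraps the same idea in a reusable function. The name almost_similar is hypothetical, and this is one possible arrangement rather than a definitive implementation.

```python
from collections import Counter

def almost_similar(first, second, k):
    """True if no character's frequency differs by more than k between the strings."""
    freq_first = Counter(first.lower())
    freq_second = Counter(second.lower())
    # Counter subtraction drops zero and negative counts, so check both directions.
    # default=0 covers the case where a difference is empty.
    largest_gap = max(
        max((freq_first - freq_second).values(), default=0),
        max((freq_second - freq_first).values(), default=0),
    )
    return largest_gap <= k

print(almost_similar("aazmdaa", "aqqaccd", 2))  # True
print(almost_similar("aazmdaa", "aqqaccd", 1))  # False
print(almost_similar("abc", "abc", 0))          # True
```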
Conclusion
In this article, we have learned two different methods to check whether two strings are almost similar. The first iterates through the lowercase alphabet and compares per-character counts stored in a dictionary; the second uses Counter() to compute the frequency of each character and max() to find the largest difference.