How to check a valid regex string using Python?
Last Updated :
31 Jul, 2023
A regex (regular expression) is a sequence of characters that defines a search pattern. The pattern can be used for searching, replacing and other text operations. Regex is used extensively in applications that need input validation, password validation, pattern recognition, and search-and-replace utilities (such as those found in word processors). Part of its appeal is that the core regex syntax is largely the same across programming languages and implementations, although each flavor has its own extensions. Learning it once therefore pays off across languages. In this article, we will write a program that checks whether a regex string is valid.
The method we will use relies on the try-except construct of Python, so we will review that briefly before moving on to the actual code.
Check a valid regex string Using Exception handling
A try-except block is used to catch and handle exceptions raised during the execution of a particular block of code (the construct exists in other programming languages under the name try-catch). The general syntax of a try-except block is as follows:
try:
    # Execute this code
    .
    .
except [Exception]:
    # Execute this code if an exception arises during the execution of the try block
    .
    .
In the above syntax, the code inside the try block is executed first. If an exception arises during its execution, the rest of the try block is skipped and the matching except block runs. If the try block finishes without raising an exception, the except block is skipped. A bare except clause catches every exception raised in the try block, including SystemExit and KeyboardInterrupt. To prevent that, it is generally good practice to name an exception after the except keyword, so that the except block runs only for that specific exception and other errors raised in the try block are not concealed. Multiple except clauses can also be attached to the same try block, which lets it trap several kinds of exceptions and handle each one specifically. The construct supports other keywords as well, such as finally and else, but they are not needed in the current context, so only the relevant parts are described here.
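As a minimal sketch of the points above, the following example attaches two specific except clauses to one try block. The function name safe_divide and its messages are hypothetical and chosen only for illustration.

```python
from typing import Optional


def safe_divide(numerator: str, denominator: str) -> Optional[float]:
    """Hypothetical helper: parse two strings as integers and divide them."""
    try:
        result = int(numerator) / int(denominator)
    except ValueError:
        # Raised by int() when a string is not a valid integer
        print("Both inputs must be integers")
        return None
    except ZeroDivisionError:
        # Raised by the division when the denominator is 0
        print("Cannot divide by zero")
        return None
    return result


print(safe_divide("10", "2"))
print(safe_divide("ten", "2"))
print(safe_divide("10", "0"))
```

Each except clause handles only the exception it names, so an unrelated error (for example a TypeError caused by passing None) would still surface in the traceback instead of being silently swallowed.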
Code:
In the following code, we specify re.error as the exception in the except clause of the try-except block. This exception is raised when an invalid regex pattern is encountered during compilation of the pattern. (Python 3.13 also exposes it under the name re.PatternError; re.error remains available as an alias.)
Python3
import re
# pattern is a string containing the regex pattern
pattern = r"[.*"
try:
    re.compile(pattern)
except re.error:
    print("Non valid regex pattern")
    exit()
Output:
Non valid regex pattern
Explanation:
First, we imported the re library to enable regex functionality in our code. Then we assigned a string containing the regex pattern to the variable pattern. The pattern provided is invalid because it contains an unclosed character class (in regex, square brackets `[ ]` are used to define a character class). We placed the re.compile() function (used to compile regex patterns) within the try block. Python first tries to compile the pattern; if an exception occurs during compilation, it checks whether the exception is re.error, and only then does the except block execute. Any other exception would be shown in the traceback and terminate the program. The except block contains a print statement that writes the user-defined message to stdout and then exits the program (via exit()). Since the pattern provided is invalid (as explained earlier), the except block was executed.
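The same check can be wrapped in a reusable function that returns a boolean instead of exiting the program. This is a minimal sketch; the name is_valid_regex is an assumption made for this example and is not part of the re module.

```python
import re


def is_valid_regex(pattern: str) -> bool:
    """Return True if `pattern` compiles as a Python regex, False otherwise."""
    try:
        re.compile(pattern)
    except re.error:
        # Compilation failed, so the pattern is not a valid regex
        return False
    return True


print(is_valid_regex(r"[.*"))   # unclosed character class
print(is_valid_regex(r"[.*]"))  # closed character class
```

Returning a boolean leaves the decision about what to do with an invalid pattern (print a message, ask again, exit) to the caller.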
Note:
The above code handles only the re.error exception. Other exceptions can also be raised while working with regex, such as RecursionError, and these need to be handled separately (either by adding a separate except clause for that exception or, in this case, by raising the maximum recursion depth with sys.setrecursionlimit()).
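One way to follow the note above is to add a second except clause for RecursionError. The sketch below also reports why the pattern was rejected, using the msg and pos attributes of re.error. The function name check_regex and the wording of the messages are assumptions made for this example.

```python
import re
from typing import Tuple


def check_regex(pattern: str) -> Tuple[bool, str]:
    """Hypothetical helper: return (is_valid, message) for a regex string."""
    try:
        re.compile(pattern)
    except re.error as exc:
        # exc.msg is the unformatted error message; exc.pos is the index
        # in the pattern where compilation failed (it may be None)
        return False, f"invalid pattern: {exc.msg} (position {exc.pos})"
    except RecursionError:
        # Handled separately, as suggested in the note above
        return False, "pattern exceeded the recursion limit while compiling"
    return True, "ok"


print(check_regex(r"[.*"))
print(check_regex(r"\d+"))
```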
Checking whether the input string matches the Regex pattern
In the following example, we will test whether an input string matches a given regex pattern. This assumes the regex pattern is valid (which can be ensured using the previous example). We will check whether the input string is an alphanumeric string, that is, one consisting only of letters and digits throughout its length. We will use the following pattern for the check (the code below omits the ^ and $ anchors because re.fullmatch() already requires the whole string to match):
^[A-Za-z0-9]+$
Regex does provide a special sequence (\w) for matching word characters, but we won't use it here because it also matches the underscore ( _ ), which isn't considered an alphanumeric character under most standards. In Python 3, for str patterns, \w additionally matches non-ASCII letters and digits unless the re.ASCII flag is set, so it is broader than A-Za-z0-9_.
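The difference between \w and the explicit character class can be seen with a short comparison. The sample strings are arbitrary and chosen only for illustration.

```python
import re

word_pat = re.compile(r"\w+")            # word characters, includes "_"
alnum_pat = re.compile(r"[A-Za-z0-9]+")  # ASCII letters and digits only

for sample in ("snake_case1", "café", "abc123"):
    print(
        sample,
        bool(word_pat.fullmatch(sample)),
        bool(alnum_pat.fullmatch(sample)),
    )
```

Only "abc123" is accepted by both patterns; the underscore and the non-ASCII letter are accepted by \w+ but rejected by the explicit class.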
Code:
Python3
import re
# Compile the pattern for an alphanumeric string
pat = re.compile(r"[A-Za-z0-9]+")
# Prompt the user for the input string
test = input("Enter the string: ")
# Check whether the whole string matches the compiled pattern
if re.fullmatch(pat, test):
    print(f"'{test}' is an alphanumeric string!")
else:
    print(f"'{test}' is NOT an alphanumeric string!")
Output:
> Enter the string: DeepLearnedSupaSampling01
'DeepLearnedSupaSampling01' is an alphanumeric string!
> Enter the string: afore 89df
'afore 89df' is NOT an alphanumeric string!
Explanation:
First, we compiled the regex pattern using re.compile(). The pattern contains a character set specifying that all letters (lowercase and uppercase) and digits (0-9) are allowed. The set is followed by the plus sign ( + ), which is a repetition quantifier: it makes the pattern match one or more repetitions of the preceding expression (here, the alphanumeric character set). Then we prompt the user for the input string and pass the compiled pattern and the input string to re.fullmatch(). This function checks whether the whole string matches the regular expression pattern. If it does, it returns a match object, which is truthy; otherwise it returns None, which is falsy. We use that return value in the if-else construct to display the success or failure message accordingly.
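To make the return values concrete, the sketch below compares re.fullmatch() with re.match() on the inputs used above, and shows one reason to prefer fullmatch() over the anchored pattern ^[A-Za-z0-9]+$: the $ anchor also matches just before a trailing newline.

```python
import re

pat = re.compile(r"[A-Za-z0-9]+")
anchored = re.compile(r"^[A-Za-z0-9]+$")

# fullmatch() returns a match object or None, never 1 or 0
print(pat.fullmatch("DeepLearnedSupaSampling01"))
print(pat.fullmatch("afore 89df"))

# match() only requires a match at the start of the string
print(pat.match("afore 89df"))

# "$" matches before a trailing newline, so the anchored pattern
# accepts "abc123\n" while fullmatch() rejects it
print(bool(anchored.match("abc123\n")))
print(bool(pat.fullmatch("abc123\n")))
```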
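Complexity:
Time complexity: O(n), where n is the length of the input string. For this pattern, re.fullmatch() scans the input once, consuming one character at a time until it reaches the end of the string or the first non-alphanumeric character. This bound applies to this simple pattern; patterns with nested or ambiguous repetition can take much longer because Python's regex engine backtracks.
Space complexity: O(1) auxiliary space, as the program uses a constant number of variables regardless of the input string size (the input string itself takes O(n)).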