
Regular Expression Back References in Python



Backreferences in regular expressions let us reuse the text matched by an earlier capturing group inside the same regex pattern. This is very useful when we want to match repeated patterns in strings.

What are Backreferences?

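A reference inside a regular expression to a previously captured group is called a backreference. When parentheses "()" are used in a regex pattern, a capturing group is formed. Each group is assigned a number in the order of its opening parenthesis, starting at 1 for the first group. We can refer to a captured group later in the same regex by writing a backslash followed by the group number, for example \1 or \2. A backreference matches the exact text the group captured, not the group's sub-pattern.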
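As a small illustration of group numbering, the sketch below uses two groups and refers back to both of them. The sample string is made up for this illustration.

```python
import re

# Two capturing groups: group 1 is the first "(", group 2 is the second.
# \2 then \1 require the same two characters again, in reverse order.
pattern = r'(\w)(\w)\2\1'

match = re.search(pattern, "an abba song")

print(match.group(0))  # whole match      -> abba
print(match.group(1))  # text of group 1  -> a
print(match.group(2))  # text of group 2  -> b
```

Here group(0) is the whole match, while group(1) and group(2) hold the text that each numbered group captured.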

Basic Syntax

Here is the basic syntax we can use to define a backreference -

  • (\w+): Captures a word (one or more word characters) as the first group.

  • \1: Matches the exact text that the first group captured.

So, backreferences simplify patterns that contain repetition. They can also be used to match structured text such as repeated phrases and identifiers, and they make it easy to find duplicates in a given string.
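The two pieces of syntax above can be combined in a minimal sketch like the following. The hyphenated sample strings are assumptions chosen for illustration. The pattern is written as a raw string (r'...') because in an ordinary string literal '\1' is an octal escape for the character with code 1, not a backreference.

```python
import re

# (\w+) captures a word as group 1; \1 must then match that same text again.
pattern = r'(\w+)-\1'

same = re.fullmatch(pattern, "bye-bye")       # same word on both sides
different = re.fullmatch(pattern, "bye-buy")  # second word differs

print(same is not None)       # True
print(different is not None)  # False

# Without the r prefix, '\1' is a single control character.
print('\1' == '\x01')         # True
```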

Example 1

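This program checks a given string for words that appear twice in a row. After the group (\w+) captures a word, \s+ matches the whitespace that follows and \1 matches the same word again. The \b anchors keep the match on whole words. Because the pattern contains one capturing group, re.findall() returns a list of the captured words, not the full "word word" matches.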

import re

# Pattern to find repeated words
pattern = r'\b(\w+)\s+\1\b'
text = "This is is just a test test string"

# Find all matches
matches = re.findall(pattern, text)

print("Repeated Words:", matches)

Here is the output of the above program -

Repeated Words: ['is', 'test']
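The same pattern can also be used to remove the duplicates. The sketch below is one possible follow-up, not part of the original program: it passes r'\1' as the replacement to re.sub(), so each "word word" pair is replaced by the single captured word.

```python
import re

# Same pattern as above: a word, whitespace, then the same word again
pattern = r'\b(\w+)\s+\1\b'
text = "This is is just a test test string"

# In the replacement string, \1 stands for the text captured by group 1
cleaned = re.sub(pattern, r'\1', text)

print(cleaned)  # This is just a test string
```

Note that this collapses only pairs. A word repeated three times in a row would still leave two copies after one pass.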

Example 2

In this example, we search for pairs of identical strings separated by a single space. The group ([a-zA-Z]+) captures a run of letters, and \1 requires the same letters to appear again right after the space. The regex returns each duplicated string once.

import re

# Pattern to find duplicate strings
pattern = r'([a-zA-Z]+) \1'
text = "hello hello world world example example"

# Find all matches
matches = re.findall(pattern, text)

print("Duplicated Strings:", matches)

Below is the result of the above program -

Duplicated Strings: ['hello', 'world', 'example']
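One caveat is that this pattern has no word boundaries, so the backreference can also match the beginning of a longer word. The sketch below compares it with a variant that adds \b on both sides. The sample sentence is an assumption chosen to show the difference.

```python
import re

text = "the theater is is open"

# Without boundaries, "the" + " " + "the" matches inside "the theater"
loose = re.findall(r'([a-zA-Z]+) \1', text)

# With \b on both sides, only whole repeated words are reported
strict = re.findall(r'\b([a-zA-Z]+) \1\b', text)

print("Loose: ", loose)   # Loose:  ['the', 'is']
print("Strict:", strict)  # Strict: ['is']
```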

Example 3

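This example looks for a hex color code that is immediately repeated. The regex captures six hexadecimal digits after a "#", then requires whitespace, another "#", and the same six digits via \1. It does not validate color codes in general: a code that appears only once is not reported, and the comparison is case-sensitive.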

import re

# Pattern to find a hex color code that is immediately repeated
pattern = r'#([0-9A-Fa-f]{6})\s+#\1'
text = "#AFAFAF #AFAFAF this is not a color #123456 #123456"

# Find all matches
matches = re.findall(pattern, text)

print("Valid Hex Colors:", matches)

This will produce the following result -

Valid Hex Colors: ['AFAFAF', '123456']
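Python also supports named groups, which can make a pattern like this easier to read. The sketch below rewrites the pattern with (?P<color>...) and the named backreference (?P=color). The group name "color" and the mixed-case sample string are assumptions added for illustration.

```python
import re

# (?P<color>...) names the group; (?P=color) refers back to it by name
pattern = r'#(?P<color>[0-9A-Fa-f]{6})\s+#(?P=color)'
text = "#AFAFAF #AFAFAF this is not a color #123456 #123456"

matches = [m.group('color') for m in re.finditer(pattern, text)]
print(matches)  # ['AFAFAF', '123456']

# A backreference compares text exactly, so case differences block the match
mixed = "#afafaf #AFAFAF"
print(re.search(pattern, mixed) is not None)                 # False
print(re.search(pattern, mixed, re.IGNORECASE) is not None)  # True
```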

Example 4

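This program tries to use a backreference inside a lookahead to search for palindromic patterns. The regex captures a run of word characters and uses the lookahead (?=\1) to check whether the same run immediately repeats. That test detects adjacent repetition such as "abab", not mirrored text such as "madam", so none of the palindromes in the sample string match and the result is an empty list.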

import re

# Pattern that matches a sequence immediately followed by a copy of itself
# (this detects adjacent repetition, not true palindromes)
pattern = r'(\w+)(?=\1)'
text = "madam racecar level deified not a palindrome"

# Find all matches
matches = re.findall(pattern, text)

print("Palindromic Patterns:", matches)

This will lead to the following outcome -

Palindromic Patterns: []
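
The pattern above tests for immediate repetition, which is why it finds nothing in this text. As a hedged alternative, backreferences can mirror a fixed number of captured characters: the sketch below recognises only palindromic words of exactly 4 or 5 letters, and compares the result with a plain string-reversal check. Both the pattern and the reversal check are illustrative additions, not part of the original program.

```python
import re

text = "madam racecar level deified not a palindrome"

# Capture two characters, allow one optional middle character,
# then require the captured characters again in reverse order.
pattern = r'\b(\w)(\w)\w?\2\1\b'
short_palindromes = [m.group(0) for m in re.finditer(pattern, text)]
print(short_palindromes)  # ['madam', 'level']

# A general check without a regex: compare each word with its reverse
all_palindromes = [w for w in text.split() if len(w) > 1 and w == w[::-1]]
print(all_palindromes)    # ['madam', 'racecar', 'level', 'deified']
```

Each extra pair of mirrored characters needs another group and another backreference, so for words of arbitrary length the reversal check is usually the simpler choice.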
Updated on: 2025-04-24T18:07:20+05:30
