Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Oracle REGEXP_SUBSTR Function

Summary: in this tutorial, you’ll learn how to use the Oracle REGEXP_SUBSTR() function to extract a substring from a string using a regular expression.

Introduction to the Oracle REGEXP_SUBSTR function #

The REGEXP_SUBSTR() function allows you to extract a substring that matches a regular expression pattern from a string.

Here’s the syntax of the REGEXP_SUBSTR() function:

REGEXP_SUBSTR(source_string, pattern [, position [, occurrence [, return_option [, match_parameter]]]])Code language: SQL (Structured Query Language) (sql)

The REGEXP_SUBSTR() function accepts six arguments:

  • source_string is a string to be searched for.
  • pattern is the regular expression pattern that is used to search for in the source string.
  • start_position is a positive integer that indicates the starting position in the source string where the search begins. The start_position argument is optional. Its default value is 1. Therefore, if you don’t specify it explicitly, the REGEXP_SUBSTR() function will start searching at the beginning of the source string.
  • occurrence is a positive integer that specifies which occurrence of the search pattern that the REGEXP_SUBSTR() function should search for. The occurrence argument is also optional, and it defaults to 1, meaning that the REGEXP_SUBSTR() function should search for the first occurrence of the pattern in the source string.
  • match_parameter is a literal string that determines the default matching behavior for the REGEXP_SUBSTR() function.

You can use one or more following values for the match_parameter argument:

  • ‘i’ indicates case-insensitive matching
  • ‘c’ indicates case-sensitive matching.
  • ‘n’ allows the period (.) character to match the newline character. If you don’t explicitly specify this parameter, the REGEXP_SUBSTR() function will not use the period to match the newline character.
  • ‘m’ treats the source string as a multiline string.

Since the match_parameter argument is optional, if you omit it, the REGEXP_SUBSTR() function will behave as follows:

  • Case sensitivity matching is determined by NLS_SORT parameter.
  • The period (.) does not match the newline character.
  • The source string is treated as a single line.

The subexpr is a positive integer whose value is from 0 to 9 that indicates which sub-expression in the regular expression is the target.

The REGEXP_SUBSTR() function returns a portion of the source string that matches the regular expression.

The data type of the result string can be either VARCHAR2 or CLOB, and its character set is the same as the source string.

Oracle REGEXP_SUBSTR function examples #

Let’s take some examples of using the the REGEXP_SUBSTR function.

Basic REGEXP_SUBSTR function example #

The following example uses the REGEXP_SUBSTR function to extract a number from a string:

SELECT
  regexp_substr ( 'Oracle 23ai', '\d+' ) version
FROM
  dual;Code language: JavaScript (javascript)

Output:

oracle regexp_substr

Matching the second occurrence #

The following example uses the regexp_substr function to return the second number (2024) in a string using the occurrence.

SELECT
  regexp_substr ( 'Oracle 23ai in 2024', '\d+', 1, 2 ) year
FROM
  dual;Code language: JavaScript (javascript)

Output:

YEAR
----
2024

In this example, the \d+ regular expression matches one or more digits. It will return two matches:

  • 23
  • 2024

Since we instruct the function to return the second occurrence, it returns 2024.

Using the regexp_substr function with table data #

We’ll use the products table from the sample database:

products table

Suppose you want to get the maximum RAM that a motherboard can support, you can extract this information from the description column using the following regular expression:

'\d+(GB|TB)'Code language: SQL (Structured Query Language) (sql)

This regular expression match one or more numbers (\d+) followed by either GB or TB string (GB|TB) such as 64GB or 1TB.

The following query uses the regexp_substr function to extract the memory from the description of the products table:

SELECT
  product_id, 
  product_name, 
  description, 
  REGEXP_SUBSTR( description,  '\d+(GB|TB)' ) max_ram
FROM
  products
WHERE
  category_id = 4;Code language: SQL (Structured Query Language) (sql)

Try it

Output:

Oracle REGEXP_SUBSTR Function - extract RAM example

Summary #

  • Use the REGEXP_SUBSTR() to extract a substring that matches a regular expression from a string.
Was this tutorial helpful?