DEV Community
If you’ve ever worked with strings in JavaScript—maybe trying to check if an email is valid in a form or clean up some messy input—you’ve probably run into something called regular expressions, or regex.
At first glance, regex looked like a bunch of gibberish to me. What the heck is /\d{3}-?\d{3}-?\d{4}/ supposed to mean? But later I realised it's not as scary as it seems. In fact, it's just a way to describe patterns in text. Let me break it down for you in plain English.
Creating a regular expression
In JavaScript, a regular expression is an object, constructed with either the RegExp constructor or with forward slash (/) characters enclosing a pattern as a value (literal notation).
let pattern1 = new RegExp("hello");
let pattern2 = /hello/;
Both expressions do the same thing—they look for the word “hello” in a string. The second one is shorter and more commonly used.
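One case where the constructor form is the better fit is when the pattern text is only known at runtime. The sketch below assumes a hypothetical `searchTerm` variable standing in for user input. It also shows a common stumbling block: inside a string, a backslash has to be doubled, so the digit shorthand \d (covered later in this post) is written "\\d".

```javascript
// The pattern text comes from a variable, so the constructor form is needed.
const searchTerm = "hello"; // hypothetical value, e.g. typed by a user
const fromVariable = new RegExp(searchTerm);
console.log(fromVariable.test("say hello")); // true

// Inside a string literal, a backslash must be doubled to reach the regex engine.
const digitLiteral = /\d/;
const digitFromString = new RegExp("\\d");
console.log(digitLiteral.source === digitFromString.source); // true
```

Note that this sketch assumes the search term contains no characters with a special meaning in regex; real user input would need escaping first.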
Basic Matching
Like other objects, regular expressions have methods. The most common one is test(), which accepts a string and returns a Boolean telling us whether the pattern is found anywhere in that string.
console.log(/cat/.test("black cat")); // true
console.log(/dog/.test("black cat")); // false
Match Sets of Characters
SETS OF CHARACTERS
Placing a set of characters between square brackets matches that part of the regular expression to any of the characters within the brackets.
console.log(/[abcdefghijklmnopqrstuvwxyz]/.test("year 2021")); // true
RANGES OF CHARACTERS
The above expression matches any string that contains at least one lowercase English letter. We can make the expression shorter by using a hyphen (-). A hyphen between two characters inside square brackets represents a range of characters.
/[a-z]/.test("hello123") // true
/[0-9]/.test("hello123") // true
For a range of characters indicated with a hyphen, the ordering of the characters is determined by their Unicode number. For example, the characters a-z (codes 97-122) are next to each other in the Unicode ordering, so the range [a-z] includes every character between them and matches all lowercase Latin letters.
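A quick way to see the Unicode ordering at work is to check the character codes directly. This is a small sketch using the standard charCodeAt() string method; the variable name `lowercaseLatin` is only for illustration.

```javascript
const lowercaseLatin = /[a-z]/;

// The range endpoints are the codes of "a" and "z".
console.log("a".charCodeAt(0), "z".charCodeAt(0)); // 97 122

// Characters outside codes 97-122 are not part of the range.
console.log(lowercaseLatin.test("HELLO"));  // false ("A" to "Z" are codes 65 to 90)
console.log(lowercaseLatin.test("\u00e9")); // false ("é" is code 233)

// Several ranges can sit in one set.
const anyLatinLetter = /[a-zA-Z]/;
console.log(anyLatinLetter.test("HELLO")); // true
```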
CHARACTER GROUPS SHORTHAND
In regular expressions, character sets/groups have a built-in shorthand for writing them. Digits ([0-9]) can be represented as \d. Here are some common character sets and what they mean:
| Shorthand | Meaning |
|---|---|
| \d | Any digit (0-9) |
| \w | Any word character (a-z, A-Z, 0-9, _) |
| \s | Any whitespace character |
| \D | Anything not a digit |
| \W | Anything not a word character |
| \S | Anything not a whitespace character |
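Here is a short sketch that exercises a few of these shorthands with test(). The variable names are made up for the example.

```javascript
const wordChar = /\w/;
const nonWordChar = /\W/;
const whitespace = /\s/;
const nonDigit = /\D/;

console.log(wordChar.test("_"));              // true  (underscore counts as a word character)
console.log(nonWordChar.test("hello_world")); // false (every character is a word character)
console.log(nonWordChar.test("hello world")); // true  (the space is not a word character)
console.log(whitespace.test("tab\there"));    // true  (a tab is whitespace)
console.log(nonDigit.test("2021"));           // false (the string has only digits)
```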
If we want to match a phone number with format XXX-XXX-XXXX, here’s how we can do it:
let phoneNum = /\d\d\d-\d\d\d-\d\d\d\d/
console.log(phoneNum.test("202-588-6500")); // true
console.log(phoneNum.test("67-500-647")); // false
EXCLUDE CHARACTERS
A caret (^) placed right after the opening square bracket inverts the set. That is, it matches any character except the ones listed in the set.
console.log(/[^\d]/.test("ujdhf345kd")); // true
console.log(/[^\d]/.test("3453")); // false
SPECIAL CHARACTERS
Characters like plus signs (+) and question marks (?) have special meanings in regular expressions and need to be preceded by a backslash if we want to indicate the character itself.
let helloQuestion = /hello\?/
Shorthand codes such as \d can also be used within square brackets to indicate a set of characters. For example, [\d] represents any digit. When special characters like the plus (+) and the question mark (?) are used between square brackets, they lose their special meaning. So, [+?] matches any plus or question mark.
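To make the difference concrete, here is a small sketch comparing an escaped plus, an unescaped plus, and a plus inside square brackets. The variable names are only for illustration.

```javascript
// Escaped: the plus sign is matched literally.
const literalPlus = /1\+1/;
console.log(literalPlus.test("1+1=2")); // true
console.log(literalPlus.test("11"));    // false

// Unescaped: + means "one or more of the preceding character".
const repeatedOne = /1+1/;
console.log(repeatedOne.test("11"));  // true  (one "1", then another "1")
console.log(repeatedOne.test("1+1")); // false (no two "1"s in a row)

// Inside square brackets, + and ? are ordinary characters.
const plusOrQuestion = /[+?]/;
console.log(plusOrQuestion.test("why?")); // true
console.log(plusOrQuestion.test("why"));  // false
```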
REPEATED PATTERNS
When we want to match things that repeat (like digits in a phone number), we use special symbols:
+ means "one or more times."
console.log(/\d+/.test("123")); // true (because 1, 2, 3 are digits)
console.log(/\d+/.test("abc")); // false (no digits)
It matches if at least one digit is there.
* means "zero or more times."
console.log(/a*/.test("aaa")); // true (matches all a's)
console.log(/a*/.test("")); // true (zero a's is also allowed!)
console.log(/a*/.test("bbb")); // true (even though there's no 'a', it matches zero a's)
So, it doesn't require the pattern to be present. It's okay if it's there many times, or not at all.
We can say how many times something should appear using curly braces {}.
• {3} means exactly 3 times
• {2,4} means between 2 and 4 times (written without a space after the comma)
• {2,} means 2 or more times
console.log(/\d{3}/.test("123")); // true (exactly 3 digits)
console.log(/\d{3}/.test("12")); // false (only 2 digits)
console.log(/\d{2,4}/.test("1234")); // true (4 digits is allowed)
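One thing to keep in mind is that test() looks for the pattern anywhere in the string, so /\d{3}/ also succeeds on a longer run of digits. The sketch below uses the ^ and $ anchors (explained in the "MATCHING WITHIN BOUNDARIES" section) to make the counts apply to the whole string. Variable names are made up for the example.

```javascript
// Unanchored: three digits somewhere in the string is enough.
const threeDigitsAnywhere = /\d{3}/;
console.log(threeDigitsAnywhere.test("12345")); // true ("123" is found inside)

// Anchored: the whole string must be exactly three digits.
const exactlyThree = /^\d{3}$/;
console.log(exactlyThree.test("123"));   // true
console.log(exactlyThree.test("12345")); // false

// Anchored range: the whole string must be 2 to 4 digits.
const twoToFour = /^\d{2,4}$/;
console.log(twoToFour.test("1"));     // false
console.log(twoToFour.test("1234"));  // true
console.log(twoToFour.test("12345")); // false

// Anchored open-ended count: 2 or more digits.
const twoOrMore = /^\d{2,}$/;
console.log(twoOrMore.test("12345")); // true
```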
OPTIONAL CHARACTERS
To make a part of a pattern optional, we use the question mark (?). It allows the preceding character to occur zero or one time.
For example, phone numbers are usually valid even when they are not hyphenated, so we can make the hyphens optional.
let phoneNum = /\d{3}-?\d{3}-?\d{4}/
console.log(phoneNum.test("202-588-6500")); // true
console.log(phoneNum.test("2025886500")); // true
In the above example, the pattern matches even when the hyphen character (-) is omitted.
GROUP CHARACTERS
We use parentheses to group parts of a pattern, so that symbols like +, *, or {} apply to the entire group, not just a single character. When a part of a regular expression is surrounded by parentheses, it is treated as a single element by any operator that follows it.
let laugh = /(ha)+/;
console.log(laugh.test("hahaha")); // true (group "ha" repeated)
console.log(laugh.test("haa")); // true ("ha" appears once at the start, and + only needs one)
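Because test() succeeds as soon as the pattern is found anywhere in the string, the effect of grouping is easier to see when the pattern has to cover the whole string. The sketch below uses the ^ and $ anchors (explained in the "MATCHING WITHIN BOUNDARIES" section) to compare a grouped and an ungrouped pattern. The variable names are only for illustration.

```javascript
// Without a group, + applies only to the "a".
const stretchedA = /^ha+$/;
console.log(stretchedA.test("haaa"));   // true  ("h" followed by several "a"s)
console.log(stretchedA.test("hahaha")); // false

// With a group, + applies to the whole "ha".
const repeatedHa = /^(ha)+$/;
console.log(repeatedHa.test("hahaha")); // true  ("ha" three times)
console.log(repeatedHa.test("haa"));    // false (the extra "a" is not a full "ha")
```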
CASE SENSITIVITY
We can add the letter i after the regex to make the pattern case-insensitive.
let greet = /hello/i;
console.log(greet.test("HELLO")); // true
console.log(greet.test("Hello")); // true
MATCHING WITHIN BOUNDARIES
To make a pattern span the entire string, we use:
• ^ → beginning of string
• $ → end of string
We can use both to make sure the whole string matches the pattern, not just part of it.
let onlyNumbers = /^\d+$/;
console.log(onlyNumbers.test("12345")); // true (only digits)
console.log(onlyNumbers.test("12a45")); // false (has a letter)
console.log(onlyNumbers.test(" 12345")); // false (starts with space)
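The same idea applies to the phone number pattern from earlier. As a rough sketch, the unanchored version accepts any string that merely contains a phone number, while the anchored version only accepts a string that is a phone number. The names `loosePhone` and `strictPhone` are made up for this example.

```javascript
const loosePhone = /\d{3}-?\d{3}-?\d{4}/;
const strictPhone = /^\d{3}-?\d{3}-?\d{4}$/;

// Extra text around the number slips through the unanchored pattern.
console.log(loosePhone.test("call 202-588-6500 now"));  // true
console.log(strictPhone.test("call 202-588-6500 now")); // false

// The anchored pattern still accepts both valid formats.
console.log(strictPhone.test("202-588-6500")); // true
console.log(strictPhone.test("2025886500"));   // true
```

For form validation, the anchored version is usually what we want.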
WORD BOUNDARIES
The marker \b refers to a word boundary: any place in the string that has a word character on one side and a non-word character, or the start or end of the string, on the other side. \b is like an invisible wall between words. It matches a position rather than a character, so it lets us check whether something sits at the start or end of a word.
console.log(/\bcat\b/.test("black cat")); // true (exact word "cat")
console.log(/\bcat\b/.test("category")); // false (not a whole word)
console.log(/\bcat/.test("catfish")); // true (starts with "cat")
ALTERNATIVES WITH THE OR OPERATOR
We use the pipe character (|) to indicate a choice between the pattern to its left and the pattern to its right. For example, we can match the word “watch” on its own or in its plural (ending with “es”), past tense (ending with “ed”), or personal noun (ending with “er”) form. The question mark after the group makes the ending optional.
let word = /\bwatch(es|ed|er)?\b/;
console.log(word.test("watch")); // true
console.log(word.test("watched")); // true
console.log(word.test("watching")); // false
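One thing worth knowing about the pipe is that it splits everything around it unless parentheses limit its reach. The sketch below, with made-up variable names, shows how anchors behave with and without a group.

```javascript
// Without a group, this reads as: (^cat) OR (dog$).
const looseChoice = /^cat|dog$/;
console.log(looseChoice.test("catalog")); // true (starts with "cat")
console.log(looseChoice.test("hotdog"));  // true (ends with "dog")

// With a group, both anchors apply to either alternative.
const strictChoice = /^(cat|dog)$/;
console.log(strictChoice.test("catalog")); // false
console.log(strictChoice.test("dog"));     // true
```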
Other methods for matching
exec()
We already know that the test() method only tells us whether something matches a pattern, so its result is always true or false. If we want to see what was actually matched and where it was found in the string, we use the exec() method.
let execMatch = /\d+/.exec("abc 123");
console.log(execMatch); // Array [ "123" ]
console.log(execMatch.index); // 4
let execMatch2 = /\d+/.exec("abc");
console.log(execMatch2); // null
Here’s what’s happening:
- exec() finds "123" in the string "abc 123".
- It gives us an array where the first item is the matched text.
- It also adds a property called index that shows where in the string the match started (position 4 in this case).
- If there’s no match, exec() returns null.
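When the pattern contains groups in parentheses, the array returned by exec() also holds the text matched by each group, after the full match. Here is a small sketch with a made-up pattern and input string.

```javascript
const shortNumber = /(\d{3})-(\d{4})/;
const found = shortNumber.exec("call 555-1234");

console.log(found[0]);    // "555-1234" (the full match)
console.log(found[1]);    // "555"      (first group)
console.log(found[2]);    // "1234"     (second group)
console.log(found.index); // 5          (where the full match starts)
```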
match()
The match() method is called on a string and takes the pattern as its argument. Without the g flag, it returns the same kind of result as exec().
console.log("abc 123".match(/\d+/)); // [ "123" ]
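The behaviour of match() changes when the pattern has the g (global) flag: instead of one match with an index property, it returns an array of all matched strings. A short sketch, using a made-up input string:

```javascript
const text = "a1 b22 c333";

// Without g: the first match only, with an index property.
const firstOnly = text.match(/\d+/);
console.log(firstOnly[0]);    // "1"
console.log(firstOnly.index); // 1

// With g: every match, as plain strings.
const allMatches = text.match(/\d+/g);
console.log(allMatches); // [ "1", "22", "333" ]

// No match returns null in both cases.
const noMatches = "abc".match(/\d+/g);
console.log(noMatches); // null
```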
Match and Replace
Sometimes, we want to replace part of a string with something else — for example, changing "a" to "e" in "haha". JavaScript gives us a method called .replace() for this.
console.log("haha".replace("a", "e")); // heha
Here, only the first "a" is replaced with "e".
Using Regex with replace():
We can also use regular expressions as the first argument of replace(). This is powerful because it lets us replace patterns, not just exact text.
console.log("hahehahehe".replace(/a/, "e"));
// hehehahehe (only replaces the first "a")
Replace All Matches with /g
Want to replace every match, not just the first? Add the g (global) flag:
console.log("hahehahehe".replace(/a/g, "e"));
// hehehehehe (all "a"s are now "e")
If we just want to replace all exact matches (not patterns), we can also use .replaceAll():
console.log("hahehahehe".replaceAll("a", "e"));
// hehehehehe
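Two details about replaceAll() are easy to miss. A string argument is treated as literal text, so characters like + need no escaping, and a regex argument must carry the g flag or the call throws a TypeError. The sketch below illustrates both; it assumes a runtime that supports replaceAll() (for example a recent version of Node.js or a modern browser).

```javascript
// A string pattern is matched literally, so "+" needs no backslash.
const sum = "1+1+1";
const withDashes = sum.replaceAll("+", "-");
console.log(withDashes); // "1-1-1"

// A regex pattern works too, as long as it has the g flag.
const viaRegex = "hahehahehe".replaceAll(/a/g, "e");
console.log(viaRegex); // "hehehehehe"

// A regex without the g flag is rejected.
let errorName = null;
try {
  "haha".replaceAll(/a/, "e");
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // "TypeError"
```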
Using a Function in replace()
Instead of a string, we can also pass a function as the second argument. This lets us do something dynamic with each match.
Example: Convert some specific words to uppercase:
let phrase = "unicef is a humanitarian ngo.";
let result = phrase.replace(/\b(unicef|ngo)\b/g, word => word.toUpperCase());
console.log(result);
// UNICEF is a humanitarian NGO.
The regex looks for whole words unicef or ngo.
The function takes each matched word and returns it in uppercase.
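The function receives more than the matched text. If the pattern contains groups, the text of each group is passed as an extra argument after the full match. Here is a rough sketch with made-up inputs: one call reorders the parts of a date, the other capitalises the first letter of every word.

```javascript
// Groups arrive as extra arguments: (fullMatch, group1, group2, group3).
const isoDate = "2021-03-15";
const reordered = isoDate.replace(
  /(\d{4})-(\d{2})-(\d{2})/,
  (fullMatch, year, month, day) => `${day}/${month}/${year}`
);
console.log(reordered); // "15/03/2021"

// \b\w matches the first character of each word.
const titled = "hello big world".replace(/\b\w/g, (letter) => letter.toUpperCase());
console.log(titled); // "Hello Big World"
```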