Patterns and flags
Regular expressions are a powerful way to search and replace in text.
In JavaScript, they are available via the RegExp object and are also integrated into the methods of strings.
Regular Expressions
A regular expression (also "regexp", or just "reg") consists of a pattern and optional flags.
There are two syntaxes to create a regular expression object.
The "long" syntax:
regexp = new RegExp("pattern", "flags");
...And the "short" one, using slashes "/":
regexp = /pattern/; // no flags
regexp = /pattern/gmi; // with flags g,m and i (to be covered soon)
Slashes pattern:/.../ tell JavaScript that we are creating a regular expression. They play the same role as quotes for strings.
In both cases regexp becomes an object of the built-in RegExp class.
The main difference between these two syntaxes is that slashes pattern:/.../ do not allow expressions to be inserted (the way template strings do with ${...}). They are fully static.
Slashes are used when we know the regular expression at the time of writing the code, and that's the most common situation. new RegExp, in turn, is used when we need to create a regexp "on the fly" from a dynamically generated string. For instance:
let tag = prompt("What tag do you want to find?", "h2");
let regexp = new RegExp(`<${tag}>`); // same as /<h2>/ if answered "h2" in the prompt above
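The two syntaxes can be compared side by side in a minimal sketch. The names literal, constructed, tag and dynamic are illustrative only, tag is hard-coded instead of coming from prompt, and console.log is used instead of alert so that the snippet also runs outside a browser:

```javascript
// Both syntaxes produce an object of the built-in RegExp class.
const literal = /<h2>/i;
const constructed = new RegExp("<h2>", "i");

console.log(literal instanceof RegExp);              // true
console.log(constructed instanceof RegExp);          // true
console.log(literal.source === constructed.source);  // true
console.log(literal.flags === constructed.flags);    // true

// Only new RegExp can be built from a value known at run time.
const tag = "h2"; // stands in for user input (assumption)
const dynamic = new RegExp(`<${tag}>`);

console.log(dynamic.test("<h2>Title</h2>")); // true
console.log(dynamic.test("<h3>Title</h3>")); // false
```

If the inserted string may contain characters that are special in a pattern, it would need escaping first, which this sketch does not attempt.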
Flags
Regular expressions may have flags that affect the search.
This chapter covers 6 of them:
pattern:i
- With this flag the search is case-insensitive: no difference between A and a (see the example below).
pattern:g
- With this flag the search looks for all matches, without it -- only the first one.
pattern:m
- Multiline mode (covered in the chapter info:regexp-multiline-mode).
pattern:s
- Enables "dotall" mode, which allows a dot pattern:. to match the newline character \n (covered in the chapter info:regexp-character-classes).
pattern:u
- Enables full Unicode support. The flag enables correct processing of surrogate pairs. More about that in the chapter info:regexp-unicode.
pattern:y
- "Sticky" mode: searching at the exact position in the text (covered in the chapter info:regexp-sticky).
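As a rough illustration of what each flag changes, here is a small sketch. The sample string and variable names are made up for the demonstration, and the later chapters remain the place for the details of each mode:

```javascript
const text = "Hello\nhello"; // two lines; the second "hello" starts at position 6

// i: case-insensitive search
const withI = /hello/i.test("HELLO");           // true

// g: all matches instead of only the first one
const withG = text.match(/hello/gi);            // ["Hello", "hello"]

// m: ^ matches at the start of every line, not only at the start of the text
const withoutM = text.match(/^hello/gi);        // ["Hello"]
const withM = text.match(/^hello/gim);          // ["Hello", "hello"]

// s: the dot also matches the newline character
const withoutS = /Hello.hello/.test(text);      // false
const withS = /Hello.hello/s.test(text);        // true

// u: a surrogate pair is treated as one character
const withoutU = "😄".match(/./)[0].length;     // 1 (half of the pair)
const withU = "😄".match(/./u)[0].length;       // 2 (the whole pair)

// y: the match must start exactly at lastIndex
const sticky = /hello/y;
sticky.lastIndex = 6;
const withY = sticky.test(text);                // true
const stickyAtZero = /hello/y.test(text);       // false ("Hello" at 0 differs in case)

console.log(withI, withG, withoutM, withM, withoutS, withS, withoutU, withU, withY, stickyAtZero);
```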
From here on the color scheme is:
- regexp -- `pattern:red`
- string (where we search) -- `subject:blue`
- result -- `match:green`
Searching: str.match
As mentioned previously, regular expressions are integrated with string methods.
The method str.match(regexp) finds all matches of regexp in the string str.
It has 3 working modes:
- If the regular expression has flag pattern:g, it returns an array of all matches:
let str = "We will, we will rock you";
alert( str.match(/we/gi) ); // We,we (an array of 2 substrings that match)
Please note that both match:We and match:we are found, because flag pattern:i makes the regular expression case-insensitive.
- If there's no such flag, it returns only the first match in the form of an array, with the full match at index 0 and some additional details in properties:
let str = "We will, we will rock you";
let result = str.match(/we/i); // without flag g
alert( result[0] ); // We (1st match)
alert( result.length ); // 1
// Details:
alert( result.index ); // 0 (position of the match)
alert( result.input ); // We will, we will rock you (source string)
The array may have other indexes besides 0 if a part of the regular expression is enclosed in parentheses. We'll cover that in the chapter info:regexp-groups.
- And, finally, if there are no matches, null is returned (it doesn't matter if there's flag pattern:g or not).
That's a very important nuance. If there are no matches, we get not an empty array, but null. Forgetting about that may lead to errors, e.g.:
let matches = "JavaScript".match(/HTML/); // = null
if (!matches.length) { // Error: Cannot read property 'length' of null
  alert("Error in the line above");
}
If we'd like the result to be always an array, we can write it this way:
let matches = "JavaScript".match(/HTML/) || [];
if (!matches.length) {
  alert("No matches"); // now it works
}
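The three modes and the null fallback can be wrapped in one small helper. This is only a sketch: findMatches is a hypothetical name, not a built-in method, and console.log replaces alert so the code also runs outside a browser:

```javascript
// Hypothetical helper: always returns an array, never null.
function findMatches(str, regexp) {
  return str.match(regexp) || [];
}

const str = "We will, we will rock you";

// Mode 1: flag g gives an array of all matches.
console.log(findMatches(str, /we/gi));        // [ 'We', 'we' ]

// Mode 2: without g, only the first match, plus details in properties.
const first = findMatches(str, /we/i);
console.log(first[0]);                        // We
console.log(first.index);                     // 0

// Mode 3: no matches. str.match returns null, the helper returns [].
console.log(findMatches(str, /HTML/).length); // 0
```

Note that the fallback array has no index or input properties, so code that needs those details still has to check whether a match was found.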
Replacing: str.replace
The method str.replace(regexp, replacement) replaces matches found using regexp in string str with replacement (all matches if there's flag pattern:g, otherwise only the first one).
For instance:
// no flag g
alert( "We will, we will".replace(/we/i, "I") ); // I will, we will
// with flag g
alert( "We will, we will".replace(/we/ig, "I") ); // I will, I will
The second argument is the replacement string. We can use special character combinations in it to insert fragments of the match:
Symbols | Action in the replacement string |
---|---|
$& | inserts the whole match |
$` | inserts a part of the string before the match |
$' | inserts a part of the string after the match |
$n | if n is a 1-2 digit number, then it inserts the contents of n-th parentheses, more about it in the chapter info:regexp-groups |
$<name> | inserts the contents of the parentheses with the given name, more about it in the chapter info:regexp-groups |
$$ | inserts character $ |
An example with pattern:$&:
alert( "I love HTML".replace(/HTML/, "$& and JavaScript") ); // I love HTML and JavaScript
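The remaining combinations from the table can be tried in the same way. The following is a small sketch with made-up sample strings. The parentheses used for $n and $<name> are explained in the chapter info:regexp-groups, so here they only serve to show what gets inserted:

```javascript
// $& : the whole match
const whole = "I love HTML".replace(/HTML/, "$& and JavaScript");
console.log(whole);    // I love HTML and JavaScript

// $` : the part of the string before the match
const before = "a-b-c".replace(/b/, "[$`]");
console.log(before);   // a-[a-]-c

// $' : the part of the string after the match
const after = "a-b-c".replace(/b/, "[$']");
console.log(after);    // a-[-c]-c

// $n : contents of the n-th parentheses
const swapped = "John Smith".replace(/(\w+) (\w+)/, "$2, $1");
console.log(swapped);  // Smith, John

// $<name> : contents of the parentheses with the given name
const named = "John Smith".replace(/(?<first>\w+) (?<last>\w+)/, "$<last>, $<first>");
console.log(named);    // Smith, John

// $$ : a literal dollar sign, here followed by $& for the match itself
const dollar = "price: 10".replace(/\d+/, "$$$&");
console.log(dollar);   // price: $10
```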
Testing: regexp.test
The method regexp.test(str) looks for at least one match: if found, it returns true, otherwise false.
let str = "I love JavaScript";
let regexp = /LOVE/i;
alert( regexp.test(str) ); // true
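Because regexp.test gives a plain true or false, it fits naturally into conditions and filters. A minimal sketch, where the lines array and the variable names are invented for the example:

```javascript
const lines = ["I love JavaScript", "HTML and CSS", "LOVELY day"];
const loveRegexp = /love/i; // no flag g: each call searches from the start

// Keep only the lines that contain at least one match.
const matching = lines.filter(line => loveRegexp.test(line));

console.log(matching); // [ 'I love JavaScript', 'LOVELY day' ]
```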
Later in this chapter we'll study more regular expressions, come across many other examples and also meet other methods.
Full information about the methods is given in the article info:regexp-methods.
Summary
- A regular expression consists of a pattern and optional flags: pattern:g, pattern:i, pattern:m, pattern:u, pattern:s, pattern:y.
- Without flags and special symbols that we'll study later, the search by a regexp is the same as a substring search.
- The method str.match(regexp) looks for matches: all of them if there's pattern:g flag, otherwise only the first one.
- The method str.replace(regexp, replacement) replaces matches found using regexp with replacement: all of them if there's pattern:g flag, otherwise only the first one.
- The method regexp.test(str) returns true if there's at least one match, otherwise false.