Ilya Kantor 2019-09-06 16:48:59 +03:00
parent 681cae4b6a
commit 4232a53219
10 changed files with 315 additions and 342 deletions


@ -403,7 +403,7 @@ alert(item1); // Cake
alert(item2); // Donut alert(item2); // Donut
``` ```
The whole `options` object except `extra` that was not mentioned, is assigned to corresponding variables: All properties of the `options` object except `extra`, which is absent from the left part, are assigned to the corresponding variables:
![](destructuring-complex.svg) ![](destructuring-complex.svg)


@ -524,7 +524,7 @@ In the example below a non-method syntax is used for comparison. `[[HomeObject]]
```js run ```js run
let animal = { let animal = {
eat: function() { // should be the short syntax: eat() {...} eat: function() { // intentionally writing like this instead of eat() {...
// ... // ...
} }
}; };


@ -106,22 +106,27 @@ The best recipe is to be careful when using these events. If we want to track us
By default many elements do not support focusing. By default many elements do not support focusing.
The list varies between browsers, but one thing is always correct: `focus/blur` support is guaranteed for elements that a visitor can interact with: `<button>`, `<input>`, `<select>`, `<a>` and so on. The list varies a bit between browsers, but one thing is always correct: `focus/blur` support is guaranteed for elements that a visitor can interact with: `<button>`, `<input>`, `<select>`, `<a>` and so on.
On the other hand, elements that exist to format something like `<div>`, `<span>`, `<table>` -- are unfocusable by default. The method `elem.focus()` doesn't work on them, and `focus/blur` events are never triggered. On the other hand, elements that exist to format something, such as `<div>`, `<span>`, `<table>` -- are unfocusable by default. The method `elem.focus()` doesn't work on them, and `focus/blur` events are never triggered.
This can be changed using HTML-attribute `tabindex`. This can be changed using HTML-attribute `tabindex`.
The purpose of this attribute is to specify the order number of the element when `key:Tab` is used to switch between them. Any element becomes focusable if it has `tabindex`. The value of the attribute is the order number of the element when `key:Tab` (or something like that) is used to switch between them.
That is: if we have two elements, the first has `tabindex="1"`, and the second has `tabindex="2"`, then pressing `key:Tab` while in the first element -- moves us to the second one. That is: if we have two elements, the first has `tabindex="1"`, and the second has `tabindex="2"`, then pressing `key:Tab` while in the first element -- moves the focus into the second one.
The switch order is: elements with `tabindex` from `1` and above go first (in the `tabindex` order), and then elements without `tabindex` (e.g. a regular `<input>`).
Elements with matching `tabindex` are switched in the document source order (the default order).
There are two special values: There are two special values:
- `tabindex="0"` makes the element the last one. - `tabindex="0"` puts an element among those without `tabindex`. That is, when we switch elements, elements with `tabindex=0` go after elements with `tabindex ≥ 1`.
- `tabindex="-1"` means that `key:Tab` should ignore that element.
**Any element supports focusing if it has `tabindex`.** Usually it's used to make an element focusable, but keep the default switching order. To make an element a part of the form on par with `<input>`.
- `tabindex="-1"` allows only programmatic focusing on an element. The `key:Tab` key ignores such elements, but method `elem.focus()` works.
For instance, here's a list. Click the first item and press `key:Tab`: For instance, here's a list. Click the first item and press `key:Tab`:
@ -140,9 +145,9 @@ Click the first item and press Tab. Keep track of the order. Please note that ma
</style> </style>
``` ```
The order is like this: `1 - 2 - 0` (zero is always the last). Normally, `<li>` does not support focusing, but `tabindex` fully enables it, along with events and styling with `:focus`. The order is like this: `1 - 2 - 0`. Normally, `<li>` does not support focusing, but `tabindex` fully enables it, along with events and styling with `:focus`.
```smart header="`elem.tabIndex` works too" ```smart header="The property `elem.tabIndex` works too"
We can add `tabindex` from JavaScript by using the `elem.tabIndex` property. That has the same effect. We can add `tabindex` from JavaScript by using the `elem.tabIndex` property. That has the same effect.
``` ```
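The `key:Tab` switching order described above can be modelled without a browser. The following is only a sketch: the function `tabOrder`, the element descriptions and the use of `null` for "no `tabindex` attribute on a natively focusable element" are assumptions made for illustration, not a browser API.

```js
// Hypothetical model: an element is described only by an id and its tabindex.
// tabindex: null stands for "no attribute on a natively focusable element".
function tabOrder(elements) {
  const indexed = elements.map((el, sourceIndex) => ({ ...el, sourceIndex }));

  // tabindex >= 1 goes first: sorted by tabindex, ties resolved by document order
  const positive = indexed
    .filter(el => el.tabindex !== null && el.tabindex >= 1)
    .sort((a, b) => a.tabindex - b.tabindex || a.sourceIndex - b.sourceIndex);

  // no tabindex or tabindex = 0: document order, after the positive ones
  const regular = indexed.filter(el => el.tabindex === null || el.tabindex === 0);

  // tabindex = -1 is skipped by Tab (elem.focus() would still work on it)
  return [...positive, ...regular].map(el => el.id);
}

const order = tabOrder([
  { id: 'one', tabindex: 1 },
  { id: 'zero', tabindex: 0 },
  { id: 'two', tabindex: 2 },
  { id: 'minus-one', tabindex: -1 },
  { id: 'input', tabindex: null },
]);

console.log(order); // one, two, zero, input
```

In a real page the same values can be assigned through the `elem.tabIndex` property.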


@ -81,7 +81,7 @@ It has 3 working modes:
```js run ```js run
let str = "We will, we will rock you"; let str = "We will, we will rock you";
alert( str.match(/we/gi) ); // We,we (an array of 2 matches) alert( str.match(/we/gi) ); // We,we (an array of 2 substrings that match)
``` ```
Please note that both `match:We` and `match:we` are found, because flag `pattern:i` makes the regular expression case-insensitive. Please note that both `match:We` and `match:we` are found, because flag `pattern:i` makes the regular expression case-insensitive.
@ -159,9 +159,9 @@ The method `regexp.test(str)` looks for at least one match, if found, returns `t
```js run ```js run
let str = "I love JavaScript"; let str = "I love JavaScript";
let reg = /LOVE/i; let regexp = /LOVE/i;
alert( reg.test(str) ); // true alert( regexp.test(str) ); // true
``` ```
Further in this chapter we'll study more regular expressions, come across many other examples and also meet other methods. Further in this chapter we'll study more regular expressions, come across many other examples and also meet other methods.


@ -13,9 +13,9 @@ For instance, the let's find the first digit in the phone number:
```js run ```js run
let str = "+7(903)-123-45-67"; let str = "+7(903)-123-45-67";
let reg = /\d/; let regexp = /\d/;
alert( str.match(reg) ); // 7 alert( str.match(regexp) ); // 7
``` ```
Without the flag `pattern:g`, the regular expression only looks for the first match, that is the first digit `pattern:\d`. Without the flag `pattern:g`, the regular expression only looks for the first match, that is the first digit `pattern:\d`.
@ -25,12 +25,12 @@ Let's add the `pattern:g` flag to find all digits:
```js run ```js run
let str = "+7(903)-123-45-67"; let str = "+7(903)-123-45-67";
let reg = /\d/g; let regexp = /\d/g;
alert( str.match(reg) ); // array of matches: 7,9,0,3,1,2,3,4,5,6,7 alert( str.match(regexp) ); // array of matches: 7,9,0,3,1,2,3,4,5,6,7
// let's make the digits-only phone number of them: // let's make the digits-only phone number of them:
alert( str.match(reg).join('') ); // 79031234567 alert( str.match(regexp).join('') ); // 79031234567
``` ```
That was a character class for digits. There are other character classes as well. That was a character class for digits. There are other character classes as well.
@ -54,9 +54,9 @@ For instance, `pattern:CSS\d` matches a string `match:CSS` with a digit after it
```js run ```js run
let str = "Is there CSS4?"; let str = "Is there CSS4?";
let reg = /CSS\d/ let regexp = /CSS\d/
alert( str.match(reg) ); // CSS4 alert( str.match(regexp) ); // CSS4
``` ```
Also we can use many character classes: Also we can use many character classes:
@ -113,11 +113,11 @@ alert( "Z".match(/./) ); // Z
Or in the middle of a regexp: Or in the middle of a regexp:
```js run ```js run
let reg = /CS.4/; let regexp = /CS.4/;
alert( "CSS4".match(reg) ); // CSS4 alert( "CSS4".match(regexp) ); // CSS4
alert( "CS-4".match(reg) ); // CS-4 alert( "CS-4".match(regexp) ); // CS-4
alert( "CS 4".match(reg) ); // CS 4 (space is also a character) alert( "CS 4".match(regexp) ); // CS 4 (space is also a character)
``` ```
Please note that a dot means "any character", but not the "absence of a character". There must be a character to match it: Please note that a dot means "any character", but not the "absence of a character". There must be a character to match it:


@ -118,9 +118,9 @@ For instance, let's look for hexadecimal numbers, written as `xFF`, where `F` is
A hex digit can be denoted as `pattern:\p{Hex_Digit}`: A hex digit can be denoted as `pattern:\p{Hex_Digit}`:
```js run ```js run
let reg = /x\p{Hex_Digit}\p{Hex_Digit}/u; let regexp = /x\p{Hex_Digit}\p{Hex_Digit}/u;
alert("number: xAF".match(reg)); // xAF alert("number: xAF".match(regexp)); // xAF
``` ```
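The flag `pattern:u` in the example above is essential. As a small illustration (the variable names are made up for this sketch): without `pattern:u`, engines treat `\p` as an escaped letter `p`, so the same pattern silently looks for different text.

```js
const withU = /x\p{Hex_Digit}\p{Hex_Digit}/u;
const withoutU = /x\p{Hex_Digit}\p{Hex_Digit}/; // no "u": \p{...} is not a property class

const found = "number: xAF".match(withU);
const notFound = "number: xAF".match(withoutU);

console.log(found[0]);  // xAF
console.log(notFound);  // null

// Without "u" the pattern matches the literal characters instead:
console.log(withoutU.test("xp{Hex_Digit}p{Hex_Digit}")); // true
```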
### Example: Chinese hieroglyphs ### Example: Chinese hieroglyphs


@ -56,9 +56,9 @@ If we are creating a regular expression with `new RegExp`, then we don't have to
For instance, consider this: For instance, consider this:
```js run ```js run
let reg = new RegExp("\d\.\d"); let regexp = new RegExp("\d\.\d");
alert( "Chapter 5.1".match(reg) ); // null alert( "Chapter 5.1".match(regexp) ); // null
``` ```
A similar search in one of the previous examples worked with `pattern:/\d\.\d/`, but `new RegExp("\d\.\d")` doesn't work, why? A similar search in one of the previous examples worked with `pattern:/\d\.\d/`, but `new RegExp("\d\.\d")` doesn't work, why?
@ -87,9 +87,9 @@ let regStr = "\\d\\.\\d";
*/!* */!*
alert(regStr); // \d\.\d (correct now) alert(regStr); // \d\.\d (correct now)
let reg = new RegExp(regStr); let regexp = new RegExp(regStr);
alert( "Chapter 5.1".match(reg) ); // 5.1 alert( "Chapter 5.1".match(regexp) ); // 5.1
``` ```
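An alternative to doubling backslashes, not used in the text above, is the built-in `String.raw` template tag, which keeps backslashes as they are written. A minimal sketch comparing the three spellings (variable names are illustrative):

```js
const plain = "\d\.\d";            // the string "eats" the backslashes
const doubled = "\\d\\.\\d";       // backslashes escaped by hand
const raw = String.raw`\d\.\d`;    // backslashes kept as written

console.log(plain);   // d.d
console.log(doubled); // \d\.\d
console.log(raw);     // \d\.\d

const fromPlain = "Chapter 5.1".match(new RegExp(plain));
const fromRaw = "Chapter 5.1".match(new RegExp(raw));

console.log(fromPlain);  // null
console.log(fromRaw[0]); // 5.1
```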
## Summary ## Summary


@ -200,7 +200,8 @@ let results = '<h1> <h2>'.matchAll(/<(.*?)>/gi);
// results - is not an array, but an iterable object // results - is not an array, but an iterable object
alert(results); // [object RegExp String Iterator] alert(results); // [object RegExp String Iterator]
alert(results[0]); // undefined
alert(results[0]); // undefined (*)
results = Array.from(results); // let's turn it into array results = Array.from(results); // let's turn it into array
@ -208,7 +209,7 @@ alert(results[0]); // <h1>,h1 (1st tag)
alert(results[1]); // <h2>,h2 (2nd tag) alert(results[1]); // <h2>,h2 (2nd tag)
``` ```
As we can see, the first difference is very important. We can't get the match as `results[0]`, because that object isn't a pseudoarray. We can turn it into a real `Array` using `Array.from`. There are more details about pseudoarrays and iterables in the article <info:iterable>. As we can see, the first difference is very important, as demonstrated in the line `(*)`. We can't get the match as `results[0]`, because that object isn't a pseudoarray. We can turn it into a real `Array` using `Array.from`. There are more details about pseudoarrays and iterables in the article <info:iterable>.
There's no need for `Array.from` if we're looping over results: There's no need for `Array.from` if we're looping over results:
@ -228,6 +229,19 @@ for(let result of results) {
let [tag1, tag2] = '<h1> <h2>'.matchAll(/<(.*?)>/gi); let [tag1, tag2] = '<h1> <h2>'.matchAll(/<(.*?)>/gi);
``` ```
Every match, returned by `matchAll`, has the same format as returned by `match` without flag `pattern:g`: it's an array with additional properties `index` (match index in the string) and `input` (source string):
```js run
let results = '<h1> <h2>'.matchAll(/<(.*?)>/gi);
let [tag1, tag2] = results;
alert( tag1[0] ); // <h1>
alert( tag1[1] ); // h1
alert( tag1.index ); // 0
alert( tag1.input ); // <h1> <h2>
```
```smart header="Why is a result of `matchAll` an iterable object, not an array?" ```smart header="Why is a result of `matchAll` an iterable object, not an array?"
Why is the method designed like that? The reason is simple - for the optimization. Why is the method designed like that? The reason is simple - for the optimization.


@ -1,73 +1,127 @@
# Sticky flag "y", searching at position # Sticky flag "y", searching at position
The flag `pattern:y` allows us to perform the search at the given position in the source string.
To grasp the use case of `pattern:y` flag, and see how great it is, let's explore a practical use case. To grasp the use case of `pattern:y` flag, and see how great it is, let's explore a practical use case.
One of common tasks for regexps is "parsing": when we get a text and analyze it for logical components, build a structure. One of common tasks for regexps is "lexical analysis": we get a text, e.g. in a programming language, and analyze it for structural elements.
For instance, there are HTML parsers for browser pages, that turn text into a structured document. There are parsers for programming languages, like JavaScript, etc. For instance, HTML has tags and attributes, JavaScript code has functions, variables, and so on.
Writing parsers is a special area, with its own tools and algorithms, so we don't go deep in there, but there's a very common question in them, and, generally, for text analysis: "What kind of entity is at the given position?". Writing lexical analyzers is a special area, with its own tools and algorithms, so we don't go deep in there, but there's a common task: to read something at the given position.
For instance, for a programming language variants can be like: E.g. we have a code string `subject:let varName = "value"`, and we need to read the variable name from it, that starts at position `4`.
- Is it a "name" `pattern:\w+`?
- Or is it a number `pattern:\d+`?
- Or an operator `pattern:[+-/*]`?
- (a syntax error if it's not anything in the expected list)
So, we should try to match a couple of regular expressions, and make a decision what's at the given position. We'll look for variable name using regexp `pattern:\w+`. Actually, JavaScript variable names need a bit more complex regexp for accurate matching, but here it doesn't matter.
In JavaScript, how can we perform a search starting from a given position? Regular calls start searching from the text start. A call to `str.match(/\w+/)` will find only the first word in the line. Or all words with the flag `pattern:g`. But we need only one word at position `4`.
We'd like to avoid creating substrings, as this slows down the execution considerably. To search from the given position, we can use method `regexp.exec(str)`.
One option is to use `regexp.exec` with `regexp.lastIndex` property, but that's not what we need, as this would search the text starting from `lastIndex`, while we only need to test the match *exactly* at the given position. If the `regexp` doesn't have flags `pattern:g` or `pattern:y`, then this method looks for the first match in the string `str`, exactly like `str.match(regexp)`. Such simple no-flags case doesn't interest us here.
Here's a (failing) attempt to use `lastIndex`: If there's flag `pattern:g`, then it performs the search in the string `str`, starting from position stored in its `regexp.lastIndex` property. And, if it finds a match, then sets `regexp.lastIndex` to the index immediately after the match.
When a regexp is created, its `lastIndex` is `0`.
So, successive calls to `regexp.exec(str)` return matches one after another.
An example (with flag `pattern:g`):
```js run ```js run
let str = "(text before) function ..."; let str = 'let varName';
// attempting to find function at position 5: let regexp = /\w+/g;
let regexp = /function/g; // must use "g" flag, otherwise lastIndex is ignored alert(regexp.lastIndex); // 0 (initially lastIndex=0)
regexp.lastIndex = 5
alert (regexp.exec(str)); // function let word1 = regexp.exec(str);
alert(word1[0]); // let (1st word)
alert(regexp.lastIndex); // 3 (position after the match)
let word2 = regexp.exec(str);
alert(word2[0]); // varName (2nd word)
alert(regexp.lastIndex); // 11 (position after the match)
let word3 = regexp.exec(str);
alert(word3); // null (no more matches)
alert(regexp.lastIndex); // 0 (resets at search end)
``` ```
The match is found, because `regexp.exec` starts to search from the given position and goes on by the text, successfully matching "function" later. Every match is returned as an array with groups and additional properties.
We could work around that by checking if "`regexp.exec(str).index` property is `5`, and if not, ignore the match. But the main problem here is performance. The regexp engine does a lot of unnecessary work by scanning at further positions. The delays are clearly noticeable if the text is long, because there are many such searches in a parser. We can get all matches in the loop:
## The "y" flag
So we've come to the problem: how to search for a match exactly at the given position.
That's what `pattern:y` flag does. It makes the regexp search only at the `lastIndex` position.
Here's an example
```js run ```js run
let str = "(text before) function ..."; let str = 'let varName';
let regexp = /\w+/g;
*!* let result;
let regexp = /function/y;
regexp.lastIndex = 5;
*/!*
alert (regexp.exec(str)); // null (no match, unlike "g" flag!) while (result = regexp.exec(str)) {
alert( `Found ${result[0]} at position ${result.index}` );
*!* // Found let at position 0, then
regexp.lastIndex = 14; // Found varName at position 4
*/!* }
alert (regexp.exec(str)); // function (match!)
``` ```
As we can see, now the regexp is only matched at the given position. Such use of `regexp.exec` is an alternative to method `str.matchAll`.
So what `pattern:y` does is truly unique, and very important for writing parsers. Unlike other methods, we can set our own `lastIndex`, to start the search from the given position.
The `pattern:y` flag allows to test a regular expression exactly at the given position and when we understand what's there, we can move on -- step by step examining the text. For instance, let's find a word, starting from position `4`:
Without the flag the regexp engine always searches till the end of the text, that takes time, especially if the text is large. So our parser would be very slow. The `pattern:y` flag is exactly the right thing here. ```js run
let str = 'let varName = "value"';
let regexp = /\w+/g; // without flag "g", property lastIndex is ignored
*!*
regexp.lastIndex = 4;
*/!*
let word = regexp.exec(str);
alert(word); // varName
```
We performed a search of `pattern:\w+`, starting from position `regexp.lastIndex = 4`.
Please note: the search starts at position `lastIndex` and then goes further. If there's no word at position `lastIndex`, but it's somewhere after it, then it will be found:
```js run
let str = 'let varName = "value"';
let regexp = /\w+/g;
*!*
regexp.lastIndex = 3;
*/!*
let word = regexp.exec(str);
alert(word[0]); // varName
alert(word.index); // 4
```
...So, with flag `pattern:g` property `lastIndex` sets the starting position for the search.
**Flag `pattern:y` makes `regexp.exec` look exactly at position `lastIndex`, not before, not after it.**
Here's the same search with flag `pattern:y`:
```js run
let str = 'let varName = "value"';
let regexp = /\w+/y;
regexp.lastIndex = 3;
alert( regexp.exec(str) ); // null (there's a space at position 3, not a word)
regexp.lastIndex = 4;
alert( regexp.exec(str) ); // varName (word at position 4)
```
As we can see, regexp `pattern:/\w+/y` doesn't match at position `3` (unlike the flag `pattern:g`), but matches at position `4`.
Imagine, we have a long text, and there are no matches in it, at all. Then searching with flag `pattern:g` will go till the end of the text, and this will take significantly more time than the search with flag `pattern:y`.
In tasks like lexical analysis, there are usually many searches at an exact position. Using flag `pattern:y` is the key to good performance.
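To show how searches at an exact position combine into a lexical analyzer, here is a minimal sketch. The token types, their patterns and the function `tokenize` are invented for this illustration and are far simpler than a real JavaScript lexer: at every position each sticky regexp is tried in turn, and the position then moves past the match.

```js
// Hypothetical token definitions; every regexp has flag "y"
const tokenTypes = [
  { type: 'space',  regexp: /\s+/y },
  { type: 'name',   regexp: /[a-z_$][\w$]*/iy },
  { type: 'number', regexp: /\d+/y },
  { type: 'string', regexp: /"[^"]*"/y },
  { type: 'punct',  regexp: /[=;+\-*\/]/y },
];

function tokenize(str) {
  const tokens = [];
  let pos = 0;

  while (pos < str.length) {
    let found = null;

    for (const { type, regexp } of tokenTypes) {
      regexp.lastIndex = pos;           // look exactly at pos, not further
      const match = regexp.exec(str);
      if (match) {
        found = { type, value: match[0], index: pos };
        break;
      }
    }

    if (!found) {
      throw new SyntaxError(`Unexpected character at position ${pos}`);
    }

    if (found.type !== 'space') tokens.push(found);
    pos += found.value.length;          // every pattern matches at least 1 character
  }

  return tokens;
}

const tokens = tokenize('let varName = "value"');

for (const token of tokens) {
  console.log(`${token.type} ${token.value} at ${token.index}`);
}
// name let at 0
// name varName at 4
// punct = at 12
// string "value" at 14
```

A failed attempt stops immediately at `pos`, so an unexpected character is reported right away instead of triggering a scan of the rest of the text.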


@ -1,207 +1,98 @@
# Methods of RegExp and String # Methods of RegExp and String
There are two sets of methods to deal with regular expressions. In this article we'll cover various methods that work with regexps in-depth.
1. First, regular expressions are objects of the built-in [RegExp](mdn:js/RegExp) class, which provides many methods. ## str.match(regexp)
2. Additionally, there are methods in regular strings that can work with regexps.
The method `str.match(regexp)` finds matches for `regexp` in the string `str`.
## Recipes It has 3 modes:
Which method to use depends on what we'd like to do. 1. If the `regexp` doesn't have flag `pattern:g`, then it returns the first match as an array with capturing groups and properties `index` (position of the match), `input` (input string, equals `str`):
Methods become much easier to understand if we separate them by their use in real-life tasks. ```js run
let str = "I love JavaScript";
So, here are general recipes, the details to follow: let result = str.match(/Java(Script)/);
**To search for all matches:** alert( result[0] ); // JavaScript (full match)
alert( result[1] ); // Script (first capturing group)
alert( result.length ); // 2
Use regexp `pattern:g` flag and: // Additional information:
- Get a flat array of matches -- `str.match(reg)` alert( result.index ); // 7 (match position)
- Get an array of matches with details -- `str.matchAll(reg)`. alert( result.input ); // I love JavaScript (source string)
```
**To search for the first match only:** 2. If the `regexp` has flag `pattern:g`, then it returns an array of all matches as strings, without capturing groups and other details.
- Get the full first match -- `str.match(reg)` (without `pattern:g` flag). ```js run
- Get the string position of the first match -- `str.search(reg)`. let str = "I love JavaScript";
- Check if there's a match -- `regexp.test(str)`.
- Find the match from the given position -- `regexp.exec(str)` (set `regexp.lastIndex` to position).
**To replace all matches:** let result = str.match(/Java(Script)/g);
- Replace with another string or a function result -- `str.replace(reg, str|func)`
**To split the string by a separator:** alert( result[0] ); // JavaScript
- `str.split(str|reg)` alert( result.length ); // 1
```
Now you can continue reading this chapter to get the details about every method... But if you're reading for the first time, then you probably want to know more about regexps. So you can move to the next chapter, and then return here if something about a method is unclear. 3. If there are no matches, no matter if there's flag `pattern:g` or not, `null` is returned.
## str.search(reg) That's an important nuance. If there are no matches, we don't get an empty array, but `null`. It's easy to make a mistake forgetting about it, e.g.:
We've seen this method already. It returns the position of the first match or `-1` if none found: ```js run
let str = "I love JavaScript";
```js run let result = str.match(/HTML/);
let str = "A drop of ink may make a million think";
alert( str.search( *!*/a/i*/!* ) ); // 0 (first match at zero position) alert(result); // null
``` alert(result.length); // Error: Cannot read property 'length' of null
```
**The important limitation: `search` only finds the first match.** If we want the result to be an array, we can write like this:
We can't find next matches using `search`, there's just no syntax for that. But there are other methods that can. ```js
let result = str.match(regexp) || [];
## str.match(reg), no "g" flag ```
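The three modes of `str.match` and the `|| []` fallback can be put side by side. This is a small sketch; the helper name `matchOrEmpty` is made up for the illustration.

```js
const str = "I love JavaScript";

// 1. No flag "g": the first match with groups, index and input
const first = str.match(/Java(Script)/);
console.log(first[0], first[1], first.index); // JavaScript Script 7

// 2. Flag "g": a plain array of matched strings, no groups
const all = str.match(/Java(Script)/g);
console.log(all.length, all[0]); // 1 JavaScript

// 3. No matches: null in both modes
const none = str.match(/HTML/);
const noneGlobal = str.match(/HTML/g);
console.log(none, noneGlobal); // null null

// Hypothetical helper: always returns an array, so .length is safe
function matchOrEmpty(str, regexp) {
  return str.match(regexp) || [];
}

console.log(matchOrEmpty(str, /HTML/g).length); // 0
```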
The behavior of `str.match` varies depending on whether `reg` has `pattern:g` flag or not.
First, if there's no `pattern:g` flag, then `str.match(reg)` looks for the first match only.
The result is an array with that match and additional properties:
- `index` -- the position of the match inside the string,
- `input` -- the subject string.
For instance:
```js run
let str = "Fame is the thirst of youth";
let result = str.match( *!*/fame/i*/!* );
alert( result[0] ); // Fame (the match)
alert( result.index ); // 0 (at the zero position)
alert( result.input ); // "Fame is the thirst of youth" (the string)
```
A match result may have more than one element.
**If a part of the pattern is delimited by parentheses `(...)`, then it becomes a separate element in the array.**
If parentheses have a name, designated by `(?<name>...)` at their start, then `result.groups[name]` has the content. We'll see that later in the chapter [about groups](info:regexp-groups).
For instance:
```js run
let str = "JavaScript is a programming language";
let result = str.match( *!*/JAVA(SCRIPT)/i*/!* );
alert( result[0] ); // JavaScript (the whole match)
alert( result[1] ); // script (the part of the match that corresponds to the parentheses)
alert( result.index ); // 0
alert( result.input ); // JavaScript is a programming language
```
Due to the `pattern:i` flag the search is case-insensitive, so it finds `match:JavaScript`. The part of the match that corresponds to `pattern:SCRIPT` becomes a separate array item.
So, this method is used to find one full match with all details.
## str.match(reg) with "g" flag
When there's a `"g"` flag, then `str.match` returns an array of all matches. There are no additional properties in that array, and parentheses do not create any elements.
For instance:
```js run
let str = "HO-Ho-ho!";
let result = str.match( *!*/ho/ig*/!* );
alert( result ); // HO, Ho, ho (array of 3 matches, case-insensitive)
```
Parentheses do not change anything, here we go:
```js run
let str = "HO-Ho-ho!";
let result = str.match( *!*/h(o)/ig*/!* );
alert( result ); // HO, Ho, ho
```
**So, with `pattern:g` flag `str.match` returns a simple array of all matches, without details.**
If we want to get information about match positions and contents of parentheses then we should use `matchAll` method that we'll cover below.
````warn header="If there are no matches, `str.match` returns `null`"
Please note, that's important. If there are no matches, the result is not an empty array, but `null`.
Keep that in mind to evade pitfalls like this:
```js run
let str = "Hey-hey-hey!";
alert( str.match(/Z/g).length ); // Error: Cannot read property 'length' of null
```
Here `str.match(/Z/g)` is `null`, it has no `length` property.
````
## str.matchAll(regexp) ## str.matchAll(regexp)
The method `str.matchAll(regexp)` is used to find all matches with all details. [recent browser="new"]
For instance: The method `str.matchAll(regexp)` is a "newer, improved" variant of `str.match`.
It's used mainly to search for all matches with all groups.
There are 3 differences from `match`:
1. It returns an iterable object with matches instead of an array. We can make a regular array from it using `Array.from`.
2. Every match is returned as an array with capturing groups (the same format as `str.match` without flag `pattern:g`).
3. If there are no results, it returns not `null`, but an empty iterable object.
Usage example:
```js run ```js run
let str = "Javascript or JavaScript? Should we uppercase 'S'?"; let str = '<h1>Hello, world!</h1>';
let reg = /<(.*?)>/g;
let result = str.matchAll( *!*/java(script)/ig*/!* ); let matchAll = str.matchAll(reg);
let [match1, match2] = result; alert(matchAll); // [object RegExp String Iterator], not array, but an iterable
alert( match1[0] ); // Javascript (the whole match) matchAll = Array.from(matchAll); // array now
alert( match1[1] ); // script (the part of the match that corresponds to the parentheses)
alert( match1.index ); // 0
alert( match1.input ); // = str (the whole original string)
alert( match2[0] ); // JavaScript (the whole match) let firstMatch = matchAll[0];
alert( match2[1] ); // Script (the part of the match that corresponds to the parentheses) alert( firstMatch[0] ); // <h1>
alert( match2.index ); // 14 alert( firstMatch[1] ); // h1
alert( match2.input ); // = str (the whole original string) alert( firstMatch.index ); // 0
alert( firstMatch.input ); // <h1>Hello, world!</h1>
``` ```
````warn header="`matchAll` returns an iterable, not array" If we use `for..of` to loop over `matchAll` matches, then we don't need `Array.from` any more.
For instance, if we try to get the first match by index, it won't work:
```js run
let str = "Javascript or JavaScript??";
let result = str.matchAll( /javascript/ig );
*!*
alert(result[0]); // undefined (?! there must be a match)
*/!*
```
The reason is that the iterator is not an array. We need to run `Array.from(result)` on it, or use `for..of` loop to get matches.
In practice, if we need all matches, then `for..of` works, so it's not a problem.
And, to get only few matches, we can use destructuring:
```js run
let str = "Javascript or JavaScript??";
*!*
let [firstMatch] = str.matchAll( /javascript/ig );
*/!*
alert(firstMatch); // Javascript
```
````
```warn header="`matchAll` is supernew, may need a polyfill"
The method may not work in old browsers. A polyfill might be needed (this site uses core-js).
Or you could make a loop with `regexp.exec`, explained below.
```
## str.split(regexp|substr, limit) ## str.split(regexp|substr, limit)
Splits the string using the regexp (or a substring) as a delimiter. Splits the string using the regexp (or a substring) as a delimiter.
We already used `split` with strings, like this: We can use `split` with strings, like this:
```js run ```js run
alert('12-34-56'.split('-')) // array of [12, 34, 56] alert('12-34-56'.split('-')) // array of [12, 34, 56]
@ -210,9 +101,23 @@ alert('12-34-56'.split('-')) // array of [12, 34, 56]
But we can split by a regular expression, the same way: But we can split by a regular expression, the same way:
```js run ```js run
alert('12-34-56'.split(/-/)) // array of [12, 34, 56] alert('12, 34, 56'.split(/,\s*/)) // array of [12, 34, 56]
``` ```
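The heading also lists an optional second argument, `limit`, which caps the number of returned items. A brief sketch of it, together with one more detail of regexp delimiters that is not covered above: if the regexp has capturing groups, their contents are included in the result.

```js
const limited = '12, 34, 56'.split(/,\s*/, 2);
console.log(limited); // 12,34 (only the first 2 items)

// Capturing parentheses in the delimiter keep the delimiter in the result
const withDelimiters = '12-34-56'.split(/(-)/);
console.log(withDelimiters); // 12,-,34,-,56
```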
## str.search(regexp)
The method `str.search(regexp)` returns the position of the first match or `-1` if none found:
```js run
let str = "A drop of ink may make a million think";
alert( str.search( /ink/i ) ); // 10 (first match position)
```
**The important limitation: `search` only finds the first match.**
If we need positions of further matches, we should use other means, such as finding them all with `str.matchAll(regexp)`.
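For instance, a short sketch that collects the positions of all matches, taking `index` from every `matchAll` result (the variable names are illustrative):

```js
const str = "A drop of ink may make a million think";

const firstPosition = str.search(/ink/i);

// matchAll needs flag "g"; every match carries its position in .index
const positions = Array.from(str.matchAll(/ink/gi), match => match.index);

console.log(firstPosition); // 10
console.log(positions);     // 10,35
```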
## str.replace(str|reg, str|func) ## str.replace(str|reg, str|func)
This is a generic method for searching and replacing, one of the most useful ones. The swiss army knife for searching and replacing. This is a generic method for searching and replacing, one of the most useful ones. The swiss army knife for searching and replacing.
@ -226,11 +131,11 @@ alert('12-34-56'.replace("-", ":")) // 12:34-56
There's a pitfall though. There's a pitfall though.
**When the first argument of `replace` is a string, it only looks for the first match.** **When the first argument of `replace` is a string, it only replaces the first match.**
You can see that in the example above: only the first `"-"` is replaced by `":"`. You can see that in the example above: only the first `"-"` is replaced by `":"`.
To find all dashes, we need to use not the string `"-"`, but a regexp `pattern:/-/g`, with an obligatory `pattern:g` flag: To find all hyphens, we need to use not the string `"-"`, but a regexp `pattern:/-/g`, with the obligatory `pattern:g` flag:
```js run ```js run
// replace all dashes by a colon // replace all dashes by a colon
@ -239,31 +144,15 @@ alert( '12-34-56'.replace( *!*/-/g*/!*, ":" ) ) // 12:34:56
The second argument is a replacement string. We can use special characters in it: The second argument is a replacement string. We can use special characters in it:
| Symbol | Inserts | | Symbols | Action in the replacement string |
|--------|--------| |--------|--------|
|`$$`|`"$"` | |`$&`|inserts the whole match|
|`$&`|the whole match| |<code>$&#096;</code>|inserts a part of the string before the match|
|<code>$&#096;</code>|a part of the string before the match| |`$'`|inserts a part of the string after the match|
|`$'`|a part of the string after the match| |`$n`|if `n` is a 1-2 digit number, inserts the contents of n-th capturing group, for details see [](info:regexp-groups)|
|`$n`|if `n` is a 1-2 digit number, then it means the contents of n-th parentheses counting from left to right, otherwise it means a parentheses with the given name | |`$<name>`|inserts the contents of the parentheses with the given `name`, for details see [](info:regexp-groups)|
|`$$`|inserts character `$` |
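Each row of the table can be tried on a short string. The sample strings and variable names below are chosen only for this sketch:

```js
const wholeMatch = "John Doe".replace(/John/, "Mr.$&");
console.log(wholeMatch); // Mr.John Doe

const before = "a-b".replace("-", "$`"); // the part before the match is "a"
console.log(before); // aab

const after = "a-b".replace("-", "$'");  // the part after the match is "b"
console.log(after); // abb

const numbered = "John Smith".replace(/(john) (smith)/i, "$2, $1");
console.log(numbered); // Smith, John

const named = "2019-09".replace(/(?<year>\d+)-(?<month>\d+)/, "$<month>/$<year>");
console.log(named); // 09/2019

const dollar = "price: 10".replace(/\d+/, "$$$&"); // "$$" gives "$", then "$&"
console.log(dollar); // price: $10
```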
For instance if we use `$&` in the replacement string, that means "put the whole match here".
Let's use it to prepend all entries of `"John"` with `"Mr."`:
```js run
let str = "John Doe, John Smith and John Bull";
// for each John - replace it with Mr. and then John
alert(str.replace(/John/g, 'Mr.$&')); // Mr.John Doe, Mr.John Smith and Mr.John Bull
```
Quite often we'd like to reuse parts of the source string, recombine them in the replacement or wrap into something.
To do so, we should:
1. First, mark the parts by parentheses in regexp.
2. Use `$1`, `$2` (and so on) in the replacement string to get the content matched by 1st, 2nd and so on parentheses.
For instance: For instance:
@ -276,126 +165,137 @@ alert(str.replace(/(john) (smith)/i, '$2, $1')) // Smith, John
**For situations that require "smart" replacements, the second argument can be a function.**
It will be called for each match, and its result will be inserted as a replacement. It will be called for each match, and the returned value will be inserted as a replacement.
For instance: The function is called with arguments `func(match, p1, p2, ..., pn, offset, input, groups)`:
```js run 1. `match` -- the match,
let i = 0; 2. `p1, p2, ..., pn` -- contents of capturing groups (if there are any),
// replace each "ho" by the result of the function
alert("HO-Ho-ho".replace(/ho/gi, function() {
return ++i;
})); // 1-2-3
```
In the example above the function just returns the next number every time, but usually the result is based on the match.
The function is called with arguments `func(str, p1, p2, ..., pn, offset, input, groups)`:
1. `str` -- the match,
2. `p1, p2, ..., pn` -- contents of parentheses (if there are any),
3. `offset` -- position of the match, 3. `offset` -- position of the match,
4. `input` -- the source string, 4. `input` -- the source string,
5. `groups` -- an object with named groups (see chapter [](info:regexp-groups)). 5. `groups` -- an object with named groups.
If there are no parentheses in the regexp, then there are only 3 arguments: `func(str, offset, input)`. If there are no parentheses in the regexp, then there are only 3 arguments: `func(str, offset, input)`.
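One way to check the argument list is to record what the function actually receives. The sketch below uses a hypothetical helper `record` and made-up sample strings. It assumes an engine that follows the current specification, where the `groups` argument is passed only if the regexp has named groups:

```js
// Collect the arguments passed to the replacement function
const calls = [];
const record = (...args) => {
  calls.push(args);
  return args[0]; // return the match itself, so the string stays unchanged
};

"John Smith".replace(/(\w+) (\w+)/, record); // two groups, one match
"Ho-ho".replace(/ho/gi, record);             // no groups, two matches

console.log(calls[0]); // [ 'John Smith', 'John', 'Smith', 0, 'John Smith' ]
console.log(calls[1]); // [ 'Ho', 0, 'Ho-ho' ]
console.log(calls[2]); // [ 'ho', 3, 'Ho-ho' ]
```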
Let's use it to show full information about matches: For example, let's uppercase all matches:
```js run ```js run
// show and replace all matches let str = "html and css";
function replacer(str, offset, input) {
alert(`Found ${str} at position ${offset} in string ${input}`);
return str.toLowerCase();
}
let result = "HO-Ho-ho".replace(/ho/gi, replacer); let result = str.replace(/html|css/gi, str => str.toUpperCase());
alert( 'Result: ' + result ); // Result: ho-ho-ho
// shows each match: alert(result); // HTML and CSS
// Found HO at position 0 in string HO-Ho-ho
// Found Ho at position 3 in string HO-Ho-ho
// Found ho at position 6 in string HO-Ho-ho
``` ```
In the example below there are two parentheses, so `replacer` is called with 5 arguments: `str` is the full match, then parentheses, and then `offset` and `input`: Replace each match by its position in the string:
```js run ```js run
function replacer(str, name, surname, offset, input) { alert("Ho-Ho-ho".replace(/ho/gi, (match, offset) => offset)); // 0-3-6
// name is the first parentheses, surname is the second one ```
return surname + ", " + name;
}
In the example below there are two parentheses, so the replacement function is called with 5 arguments: the first is the full match, then 2 parentheses, and after it (not used in the example) the match position and the source string:
```js run
let str = "John Smith"; let str = "John Smith";
alert(str.replace(/(John) (Smith)/, replacer)) // Smith, John let result = str.replace(/(\w+) (\w+)/, (match, name, surname) => `${surname}, ${name}`);
alert(result); // Smith, John
```
If there are many groups, it's convenient to use rest parameters to access them:
```js run
let str = "John Smith";
let result = str.replace(/(\w+) (\w+)/, (...match) => `${match[2]}, ${match[1]}`);
alert(result); // Smith, John
```
Or, if we're using named groups, then the `groups` object is always the last argument, so we can obtain it like this:
```js run
let str = "John Smith";
let result = str.replace(/(?<name>\w+) (?<surname>\w+)/, (...match) => {
let groups = match.pop();
return `${groups.surname}, ${groups.name}`;
});
alert(result); // Smith, John
``` ```
Using a function gives us the ultimate replacement power, because it gets all the information about the match, has access to outer variables and can do everything.
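As an illustration of "access to outer variables", here is a minimal sketch of filling placeholders from an object. The `{key}` placeholder syntax and the names `values` and `template` are assumptions made up for this example:

```js
// Outer variable that the replacement function reads from
const values = { name: "John", city: "Paris" };

const template = "Hello, {name} from {city}! Your code is {code}.";

// For every {key}: insert values[key] if it exists, otherwise keep the placeholder
const filled = template.replace(/\{(\w+)\}/g, (match, key) =>
  Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
);

console.log(filled); // Hello, John from Paris! Your code is {code}.
```

A plain replacement string could not do this, because the inserted text depends on a lookup that happens separately for each match.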
## regexp.exec(str) ## regexp.exec(str)
We've already seen these searching methods: The `regexp.exec(str)` method returns a match for `regexp` in the string `str`. Unlike previous methods, it's called on a regexp, not on a string.
- `search` -- looks for the position of the match, It behaves differently depending on whether the regexp has flag `pattern:g`.
- `match` -- if there's no `pattern:g` flag, returns the first match with parentheses and all details,
- `match` -- if there's a `pattern:g` flag -- returns all matches, without details or parentheses,
- `matchAll` -- returns all matches with details.
The `regexp.exec` method is the most flexible searching method of all. Unlike previous methods, `exec` should be called on a regexp, rather than on a string. If there's no `pattern:g`, then `regexp.exec(str)` returns the first match exactly as `str.match(regexp)`. This behavior doesn't bring anything new.
It behaves differently depending on whether the regexp has the `pattern:g` flag. But if there's flag `pattern:g`, then:
- A call to `regexp.exec(str)` returns the first match and saves the position immediately after it in the property `regexp.lastIndex`.
- The next such call starts the search from position `regexp.lastIndex`, returns the next match and saves the position after it in `regexp.lastIndex`.
- ...And so on.
- If there are no matches, `regexp.exec` returns `null` and resets `regexp.lastIndex` to `0`.
If there's no `pattern:g`, then `regexp.exec(str)` returns the first match, exactly as `str.match(regexp)`. Such behavior does not give us anything new. So, repeated calls return all matches one after another, using the `regexp.lastIndex` property to keep track of the current search position.
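The way `regexp.lastIndex` moves between calls can be observed directly. A minimal sketch, with a made-up string and variable names:

```js
const vowel = /o/g; // flag "g" makes exec use and update lastIndex
const word = "foo";

const steps = [];
let hit;

// keep calling exec until it returns null
while ((hit = vowel.exec(word)) !== null) {
  steps.push({ index: hit.index, lastIndex: vowel.lastIndex });
}

console.log(steps);
// [ { index: 1, lastIndex: 2 }, { index: 2, lastIndex: 3 } ]

console.log(vowel.lastIndex); // 0 (reset after the failed search)
```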
But if there's `pattern:g`, then: In the past, before the method `str.matchAll` was added to JavaScript, calls to `regexp.exec` were used in a loop to get all matches with groups:
- `regexp.exec(str)` returns the first match and *remembers* the position after it in `regexp.lastIndex` property.
- The next call starts to search from `regexp.lastIndex` and returns the next match.
- If there are no more matches then `regexp.exec` returns `null` and `regexp.lastIndex` is set to `0`.
We could use it to get all matches with their positions and parentheses groups in a loop, instead of `matchAll`:
```js run ```js run
let str = 'A lot about JavaScript at https://javascript.info'; let str = 'More about JavaScript at https://javascript.info';
let regexp = /javascript/ig; let regexp = /javascript/ig;
let result; let result;
while (result = regexp.exec(str)) { while (result = regexp.exec(str)) {
alert( `Found ${result[0]} at ${result.index}` ); alert( `Found ${result[0]} at position ${result.index}` );
// shows: Found JavaScript at 12, then: // Found JavaScript at position 11, then
// shows: Found javascript at 34 // Found javascript at position 33
} }
``` ```
Surely, `matchAll` does the same, at least for modern browsers. But what `matchAll` can't do -- is to search from a given position. This works now as well, although for newer browsers `str.matchAll` is usually more convenient.
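For comparison, the same search can be sketched with `str.matchAll`, assuming an environment that supports it. The variable names are made up for this example:

```js
const text = 'More about JavaScript at https://javascript.info';

const positions = [];

// matchAll requires flag "g" and yields match objects with index and groups
for (const match of text.matchAll(/javascript/ig)) {
  positions.push(`Found ${match[0]} at position ${match.index}`);
}

console.log(positions);
// [ 'Found JavaScript at position 11', 'Found javascript at position 33' ]
```

There is no `lastIndex` bookkeeping visible in this version, which is the main reason it reads more simply than the `exec` loop.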
Let's search from position `13`. What we need is to assign `regexp.lastIndex=13` and call `regexp.exec`: **We can use `regexp.exec` to search from a given position by manually setting `lastIndex`.**
For instance:
```js run ```js run
let str = "A lot about JavaScript at https://javascript.info"; let str = 'Hello, world!';
let regexp = /javascript/ig; let regexp = /\w+/g; // without flag "g", lastIndex property is ignored
*!* regexp.lastIndex = 5; // search from 5th position (from the comma)
regexp.lastIndex = 13;
*/!*
let result; alert( regexp.exec(str) ); // world
while (result = regexp.exec(str)) {
alert( `Found ${result[0]} at ${result.index}` );
// shows: Found javascript at 34
}
``` ```
Now, starting from the given position `13`, there's only one match. If the regexp has flag `pattern:y`, then the search will be performed exactly at the position `regexp.lastIndex`, not any further.
Let's replace flag `pattern:g` with `pattern:y` in the example above. There will be no matches, as there's no word at position `5`:
```js run
let str = 'Hello, world!';
let regexp = /\w+/y;
regexp.lastIndex = 5; // search exactly at position 5
alert( regexp.exec(str) ); // null
```
That's convenient for situations when we need to "read" something from the string by a regexp at the exact position, not somewhere further.
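A minimal sketch of such "reading at an exact position". The helper name `readWordAt` is hypothetical, introduced only for this example:

```js
// Returns the word that starts exactly at `position`, or null if there is none
function readWordAt(source, position) {
  const wordAt = /\w+/y;        // flag "y": match only at lastIndex, not further
  wordAt.lastIndex = position;
  const match = wordAt.exec(source);
  return match ? match[0] : null;
}

const greeting = 'Hello, world!';

console.log(readWordAt(greeting, 0)); // Hello
console.log(readWordAt(greeting, 5)); // null (position 5 is the comma)
console.log(readWordAt(greeting, 7)); // world
```

With flag `pattern:g` instead of `pattern:y`, the call for position `5` would skip ahead and return `world`, so the caller could not tell whether a word really starts at that position.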
## regexp.test(str) ## regexp.test(str)
The method `regexp.test(str)` looks for a match and returns `true` or `false` depending on whether it finds one. The method `regexp.test(str)` looks for a match and returns `true` or `false` depending on whether it exists.
For instance: For instance:
@ -416,7 +316,7 @@ alert( *!*/love/i*/!*.test(str) ); // false
alert( str.search(*!*/love/i*/!*) != -1 ); // false alert( str.search(*!*/love/i*/!*) != -1 ); // false
``` ```
If the regexp has `'g'` flag, then `regexp.test` advances `regexp.lastIndex` property, just like `regexp.exec`. If the regexp has flag `pattern:g`, then `regexp.test` looks from `regexp.lastIndex` property and updates it, just like `regexp.exec`.
So we can use it to search from a given position: