reg->regexp

Ilya Kantor 2019-09-06 16:50:41 +03:00
parent 4232a53219
commit 32e20fc97c
35 changed files with 132 additions and 132 deletions

View file

@ -1,8 +1,8 @@
Answer: `pattern:\d\d[-:]\d\d`. Answer: `pattern:\d\d[-:]\d\d`.
```js run ```js run
let reg = /\d\d[-:]\d\d/g; let regexp = /\d\d[-:]\d\d/g;
alert( "Breakfast at 09:00. Dinner at 21-30".match(reg) ); // 09:00, 21-30 alert( "Breakfast at 09:00. Dinner at 21-30".match(regexp) ); // 09:00, 21-30
``` ```
Please note that the dash `pattern:'-'` has a special meaning in square brackets, but only between other characters, not when it's at the beginning or at the end, so we don't need to escape it. Please note that the dash `pattern:'-'` has a special meaning in square brackets, but only between other characters, not when it's at the beginning or at the end, so we don't need to escape it.
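A small sketch of the note above, using illustrative variable names that are not part of the original task: a dash at the edge of a set is a literal character, escaping it changes nothing, and a dash between two characters forms a range.

```js
// Dash at the start of the set: a literal "-"
let edgeDash = /[-:]/g;
// Escaped dash: same meaning, the backslash is optional here
let escapedDash = /[\-:]/g;
// Dash between two characters: a range from "a" to "c"
let range = /[a-c]/g;

let edgeResult = "09:00 21-30".match(edgeDash);
let escapedResult = "09:00 21-30".match(escapedDash);
let rangeResult = "a-c b".match(range);

console.log(edgeResult);    // [ ':', '-' ]
console.log(escapedResult); // [ ':', '-' ]
console.log(rangeResult);   // [ 'a', 'c', 'b' ] (the dash itself is not matched)
```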

View file

@ -5,8 +5,8 @@ The time can be in the format `hours:minutes` or `hours-minutes`. Both hours and
Write a regexp to find time: Write a regexp to find time:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
alert( "Breakfast at 09:00. Dinner at 21-30".match(reg) ); // 09:00, 21-30 alert( "Breakfast at 09:00. Dinner at 21-30".match(regexp) ); // 09:00, 21-30
``` ```
P.S. In this task we assume that the time is always correct, so there's no need to filter out bad strings like "45:67". Later we'll deal with that too. P.S. In this task we assume that the time is always correct, so there's no need to filter out bad strings like "45:67". Later we'll deal with that too.

View file

@ -130,18 +130,18 @@ In the example below the regexp `pattern:[-().^+]` looks for one of the characte
```js run ```js run
// No need to escape // No need to escape
let reg = /[-().^+]/g; let regexp = /[-().^+]/g;
alert( "1 + 2 - 3".match(reg) ); // Matches +, - alert( "1 + 2 - 3".match(regexp) ); // Matches +, -
``` ```
...But if you decide to escape them "just in case", then there would be no harm: ...But if you decide to escape them "just in case", then there would be no harm:
```js run ```js run
// Escaped everything // Escaped everything
let reg = /[\-\(\)\.\^\+]/g; let regexp = /[\-\(\)\.\^\+]/g;
alert( "1 + 2 - 3".match(reg) ); // also works: +, - alert( "1 + 2 - 3".match(regexp) ); // also works: +, -
``` ```
## Ranges and flag "u" ## Ranges and flag "u"

View file

@ -2,8 +2,8 @@
Solution: Solution:
```js run ```js run
let reg = /\.{3,}/g; let regexp = /\.{3,}/g;
alert( "Hello!... How goes?.....".match(reg) ); // ..., ..... alert( "Hello!... How goes?.....".match(regexp) ); // ..., .....
``` ```
Please note that the dot is a special character, so we have to escape it and insert as `\.`. Please note that the dot is a special character, so we have to escape it and insert as `\.`.
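To see why the escape matters, here is a minimal comparison (the variable names are illustrative only). Without the backslash, the dot means "any character except a newline", so the pattern swallows the whole line instead of finding the ellipses.

```js
let phrase = "Hello!... How goes?.....";

// Unescaped dot: 3 or more of ANY character, so one greedy match
let unescaped = phrase.match(/.{3,}/g);
// Escaped dot: 3 or more literal dots
let escaped = phrase.match(/\.{3,}/g);

console.log(unescaped); // a single match containing the whole string
console.log(escaped);   // [ '...', '.....' ]
```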

View file

@ -9,6 +9,6 @@ Create a regexp to find ellipsis: 3 (or more?) dots in a row.
Check it: Check it:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
alert( "Hello!... How goes?.....".match(reg) ); // ..., ..... alert( "Hello!... How goes?.....".match(regexp) ); // ..., .....
``` ```

View file

@ -7,11 +7,11 @@ Then we can look for 6 of them using the quantifier `pattern:{6}`.
As a result, we have the regexp: `pattern:/#[a-f0-9]{6}/gi`. As a result, we have the regexp: `pattern:/#[a-f0-9]{6}/gi`.
```js run ```js run
let reg = /#[a-f0-9]{6}/gi; let regexp = /#[a-f0-9]{6}/gi;
let str = "color:#121212; background-color:#AA00ef bad-colors:f#fddee #fd2" let str = "color:#121212; background-color:#AA00ef bad-colors:f#fddee #fd2"
alert( str.match(reg) ); // #121212,#AA00ef alert( str.match(regexp) ); // #121212,#AA00ef
``` ```
The problem is that it finds the color in longer sequences: The problem is that it finds the color in longer sequences:

View file

@ -5,11 +5,11 @@ Create a regexp to search HTML-colors written as `#ABCDEF`: first `#` and then 6
An example of use: An example of use:
```js ```js
let reg = /...your regexp.../ let regexp = /...your regexp.../
let str = "color:#121212; background-color:#AA00ef bad-colors:f#fddee #fd2 #12345678"; let str = "color:#121212; background-color:#AA00ef bad-colors:f#fddee #fd2 #12345678";
alert( str.match(reg) ) // #121212,#AA00ef alert( str.match(regexp) ) // #121212,#AA00ef
``` ```
P.S. In this task we do not need other color formats like `#123` or `rgb(1,2,3)` etc. P.S. In this task we do not need other color formats like `#123` or `rgb(1,2,3)` etc.

View file

@ -5,11 +5,11 @@ An acceptable variant is `pattern:<!--.*?-->` -- the lazy quantifier makes the d
Otherwise multiline comments won't be found: Otherwise multiline comments won't be found:
```js run ```js run
let reg = /<!--.*?-->/gs; let regexp = /<!--.*?-->/gs;
let str = `... <!-- My -- comment let str = `... <!-- My -- comment
test --> .. <!----> .. test --> .. <!----> ..
`; `;
alert( str.match(reg) ); // '<!-- My -- comment \n test -->', '<!---->' alert( str.match(regexp) ); // '<!-- My -- comment \n test -->', '<!---->'
``` ```

View file

@ -3,11 +3,11 @@
Find all HTML comments in the text: Find all HTML comments in the text:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
let str = `... <!-- My -- comment let str = `... <!-- My -- comment
test --> .. <!----> .. test --> .. <!----> ..
`; `;
alert( str.match(reg) ); // '<!-- My -- comment \n test -->', '<!---->' alert( str.match(regexp) ); // '<!-- My -- comment \n test -->', '<!---->'
``` ```

View file

@ -2,9 +2,9 @@
The solution is `pattern:<[^<>]+>`. The solution is `pattern:<[^<>]+>`.
```js run ```js run
let reg = /<[^<>]+>/g; let regexp = /<[^<>]+>/g;
let str = '<> <a href="/"> <input type="radio" checked> <b>'; let str = '<> <a href="/"> <input type="radio" checked> <b>';
alert( str.match(reg) ); // '<a href="/">', '<input type="radio" checked>', '<b>' alert( str.match(regexp) ); // '<a href="/">', '<input type="radio" checked>', '<b>'
``` ```

View file

@ -5,11 +5,11 @@ Create a regular expression to find all (opening and closing) HTML tags with the
An example of use: An example of use:
```js run ```js run
let reg = /your regexp/g; let regexp = /your regexp/g;
let str = '<> <a href="/"> <input type="radio" checked> <b>'; let str = '<> <a href="/"> <input type="radio" checked> <b>';
alert( str.match(reg) ); // '<a href="/">', '<input type="radio" checked>', '<b>' alert( str.match(regexp) ); // '<a href="/">', '<input type="radio" checked>', '<b>'
``` ```
Here we assume that tag attributes may not contain `<` and `>` (inside quotes too), which simplifies things a bit. Here we assume that tag attributes may not contain `<` and `>` (inside quotes too), which simplifies things a bit.

View file

@ -17,11 +17,11 @@ A regular expression like `pattern:/".+"/g` (a quote, then something, then the o
Let's try it: Let's try it:
```js run ```js run
let reg = /".+"/g; let regexp = /".+"/g;
let str = 'a "witch" and her "broom" is one'; let str = 'a "witch" and her "broom" is one';
alert( str.match(reg) ); // "witch" and her "broom" alert( str.match(regexp) ); // "witch" and her "broom"
``` ```
...We can see that it doesn't work as intended! ...We can see that it doesn't work as intended!
@ -105,11 +105,11 @@ To make things clear: usually a question mark `pattern:?` is a quantifier by its
The regexp `pattern:/".+?"/g` works as intended: it finds `match:"witch"` and `match:"broom"`: The regexp `pattern:/".+?"/g` works as intended: it finds `match:"witch"` and `match:"broom"`:
```js run ```js run
let reg = /".+?"/g; let regexp = /".+?"/g;
let str = 'a "witch" and her "broom" is one'; let str = 'a "witch" and her "broom" is one';
alert( str.match(reg) ); // "witch", "broom" alert( str.match(regexp) ); // "witch", "broom"
``` ```
To clearly understand the change, let's trace the search step by step. To clearly understand the change, let's trace the search step by step.
@ -175,11 +175,11 @@ With regexps, there's often more than one way to do the same thing.
In our case we can find quoted strings without lazy mode using the regexp `pattern:"[^"]+"`: In our case we can find quoted strings without lazy mode using the regexp `pattern:"[^"]+"`:
```js run ```js run
let reg = /"[^"]+"/g; let regexp = /"[^"]+"/g;
let str = 'a "witch" and her "broom" is one'; let str = 'a "witch" and her "broom" is one';
alert( str.match(reg) ); // "witch", "broom" alert( str.match(regexp) ); // "witch", "broom"
``` ```
The regexp `pattern:"[^"]+"` gives correct results, because it looks for a quote `pattern:'"'` followed by one or more non-quotes `pattern:[^"]`, and then the closing quote. The regexp `pattern:"[^"]+"` gives correct results, because it looks for a quote `pattern:'"'` followed by one or more non-quotes `pattern:[^"]`, and then the closing quote.
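The three variants discussed so far can be compared side by side. This is only a sketch with made-up variable names. The last two lines show one difference that follows from how `.` works: without the `s` flag the dot does not match a newline, while `[^"]` does.

```js
let quoted = 'a "witch" and her "broom" is one';

let greedy = quoted.match(/".+"/g);      // takes as much as possible
let lazy = quoted.match(/".+?"/g);       // takes as little as possible
let negated = quoted.match(/"[^"]+"/g);  // cannot cross a quote at all

console.log(greedy);  // [ '"witch" and her "broom"' ]
console.log(lazy);    // [ '"witch"', '"broom"' ]
console.log(negated); // [ '"witch"', '"broom"' ]

// The lazy and the negated-set patterns are not interchangeable:
let multiline = 'say "two\nlines" now';
let lazyMultiline = multiline.match(/".+?"/g);       // null, the dot stops at \n
let negatedMultiline = multiline.match(/"[^"]+"/g);  // one match spanning the newline
```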
@ -201,20 +201,20 @@ The first idea might be: `pattern:/<a href=".*" class="doc">/g`.
Let's check it: Let's check it:
```js run ```js run
let str = '...<a href="link" class="doc">...'; let str = '...<a href="link" class="doc">...';
let reg = /<a href=".*" class="doc">/g; let regexp = /<a href=".*" class="doc">/g;
// Works! // Works!
alert( str.match(reg) ); // <a href="link" class="doc"> alert( str.match(regexp) ); // <a href="link" class="doc">
``` ```
It worked. But let's see what happens if there are many links in the text? It worked. But let's see what happens if there are many links in the text?
```js run ```js run
let str = '...<a href="link1" class="doc">... <a href="link2" class="doc">...'; let str = '...<a href="link1" class="doc">... <a href="link2" class="doc">...';
let reg = /<a href=".*" class="doc">/g; let regexp = /<a href=".*" class="doc">/g;
// Whoops! Two links in one match! // Whoops! Two links in one match!
alert( str.match(reg) ); // <a href="link1" class="doc">... <a href="link2" class="doc"> alert( str.match(regexp) ); // <a href="link1" class="doc">... <a href="link2" class="doc">
``` ```
Now the result is wrong for the same reason as our "witches" example. The quantifier `pattern:.*` took too many characters. Now the result is wrong for the same reason as our "witches" example. The quantifier `pattern:.*` took too many characters.
@ -230,10 +230,10 @@ Let's modify the pattern by making the quantifier `pattern:.*?` lazy:
```js run ```js run
let str = '...<a href="link1" class="doc">... <a href="link2" class="doc">...'; let str = '...<a href="link1" class="doc">... <a href="link2" class="doc">...';
let reg = /<a href=".*?" class="doc">/g; let regexp = /<a href=".*?" class="doc">/g;
// Works! // Works!
alert( str.match(reg) ); // <a href="link1" class="doc">, <a href="link2" class="doc"> alert( str.match(regexp) ); // <a href="link1" class="doc">, <a href="link2" class="doc">
``` ```
Now it seems to work, there are two matches: Now it seems to work, there are two matches:
@ -247,10 +247,10 @@ Now it seems to work, there are two matches:
```js run ```js run
let str = '...<a href="link1" class="wrong">... <p style="" class="doc">...'; let str = '...<a href="link1" class="wrong">... <p style="" class="doc">...';
let reg = /<a href=".*?" class="doc">/g; let regexp = /<a href=".*?" class="doc">/g;
// Wrong match! // Wrong match!
alert( str.match(reg) ); // <a href="link1" class="wrong">... <p style="" class="doc"> alert( str.match(regexp) ); // <a href="link1" class="wrong">... <p style="" class="doc">
``` ```
Now it fails. The match includes not just a link, but also a lot of text after it, including `<p...>`. Now it fails. The match includes not just a link, but also a lot of text after it, including `<p...>`.
@ -281,11 +281,11 @@ A working example:
```js run ```js run
let str1 = '...<a href="link1" class="wrong">... <p style="" class="doc">...'; let str1 = '...<a href="link1" class="wrong">... <p style="" class="doc">...';
let str2 = '...<a href="link1" class="doc">... <a href="link2" class="doc">...'; let str2 = '...<a href="link1" class="doc">... <a href="link2" class="doc">...';
let reg = /<a href="[^"]*" class="doc">/g; let regexp = /<a href="[^"]*" class="doc">/g;
// Works! // Works!
alert( str1.match(reg) ); // null, no matches, that's correct alert( str1.match(regexp) ); // null, no matches, that's correct
alert( str2.match(reg) ); // <a href="link1" class="doc">, <a href="link2" class="doc"> alert( str2.match(regexp) ); // <a href="link1" class="doc">, <a href="link2" class="doc">
``` ```
## Summary ## Summary

View file

@ -9,13 +9,13 @@ Now let's show that the match should capture all the text: start at the beginnin
Finally: Finally:
```js run ```js run
let reg = /^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$/i; let regexp = /^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$/i;
alert( reg.test('01:32:54:67:89:AB') ); // true alert( regexp.test('01:32:54:67:89:AB') ); // true
alert( reg.test('0132546789AB') ); // false (no colons) alert( regexp.test('0132546789AB') ); // false (no colons)
alert( reg.test('01:32:54:67:89') ); // false (5 numbers, need 6) alert( regexp.test('01:32:54:67:89') ); // false (5 numbers, need 6)
alert( reg.test('01:32:54:67:89:ZZ') ) // false (ZZ at the end) alert( regexp.test('01:32:54:67:89:ZZ') ) // false (ZZ at the end)
``` ```

View file

@ -8,13 +8,13 @@ Write a regexp that checks whether a string is MAC-address.
Usage: Usage:
```js ```js
let reg = /your regexp/; let regexp = /your regexp/;
alert( reg.test('01:32:54:67:89:AB') ); // true alert( regexp.test('01:32:54:67:89:AB') ); // true
alert( reg.test('0132546789AB') ); // false (no colons) alert( regexp.test('0132546789AB') ); // false (no colons)
alert( reg.test('01:32:54:67:89') ); // false (5 numbers, must be 6) alert( regexp.test('01:32:54:67:89') ); // false (5 numbers, must be 6)
alert( reg.test('01:32:54:67:89:ZZ') ) // false (ZZ at the end) alert( regexp.test('01:32:54:67:89:ZZ') ) // false (ZZ at the end)
``` ```

View file

@ -9,19 +9,19 @@ Here the pattern `pattern:[a-f0-9]{3}` is enclosed in parentheses to apply the q
In action: In action:
```js run ```js run
let reg = /#([a-f0-9]{3}){1,2}/gi; let regexp = /#([a-f0-9]{3}){1,2}/gi;
let str = "color: #3f3; background-color: #AA00ef; and: #abcd"; let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
alert( str.match(reg) ); // #3f3 #AA00ef #abc alert( str.match(regexp) ); // #3f3 #AA00ef #abc
``` ```
There's a minor problem here: the pattern found `match:#abc` in `subject:#abcd`. To prevent that we can add `pattern:\b` to the end: There's a minor problem here: the pattern found `match:#abc` in `subject:#abcd`. To prevent that we can add `pattern:\b` to the end:
```js run ```js run
let reg = /#([a-f0-9]{3}){1,2}\b/gi; let regexp = /#([a-f0-9]{3}){1,2}\b/gi;
let str = "color: #3f3; background-color: #AA00ef; and: #abcd"; let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
alert( str.match(reg) ); // #3f3 #AA00ef alert( str.match(regexp) ); // #3f3 #AA00ef
``` ```
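As an additional check of the `pattern:\b` fix, the sketch below runs both patterns on a string with 3, 4, 6 and 7 hex digits. The sample string and the variable names are assumptions made for illustration.

```js
let withoutBoundary = /#([a-f0-9]{3}){1,2}/gi;
let withBoundary = /#([a-f0-9]{3}){1,2}\b/gi;

let sample = "#abc #abcd #abcdef #abcdefa";

let loose = sample.match(withoutBoundary);
let strict = sample.match(withBoundary);

console.log(loose);  // [ '#abc', '#abc', '#abcdef', '#abcdef' ]
console.log(strict); // [ '#abc', '#abcdef' ]
```

With `pattern:\b`, a candidate is rejected whenever the character right after it is another word character, so `#abcd` and `#abcdefa` produce no match at all.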

View file

@ -4,11 +4,11 @@ Write a RegExp that matches colors in the format `#abc` or `#abcdef`. That is: `
Usage example: Usage example:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
let str = "color: #3f3; background-color: #AA00ef; and: #abcd"; let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
alert( str.match(reg) ); // #3f3 #AA00ef alert( str.match(regexp) ); // #3f3 #AA00ef
``` ```
P.S. This should be exactly 3 or 6 hex digits. Values with 4 digits, such as `#abcd`, should not match. P.S. This should be exactly 3 or 6 hex digits. Values with 4 digits, such as `#abcd`, should not match.

View file

@ -3,9 +3,9 @@ A positive number with an optional decimal part is (per previous task): `pattern
Let's add the optional `pattern:-` in the beginning: Let's add the optional `pattern:-` in the beginning:
```js run ```js run
let reg = /-?\d+(\.\d+)?/g; let regexp = /-?\d+(\.\d+)?/g;
let str = "-1.5 0 2 -123.4."; let str = "-1.5 0 2 -123.4.";
alert( str.match(reg) ); // -1.5, 0, 2, -123.4 alert( str.match(regexp) ); // -1.5, 0, 2, -123.4
``` ```

View file

@ -5,9 +5,9 @@ Write a regexp that looks for all decimal numbers including integer ones, with t
An example of use: An example of use:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
let str = "-1.5 0 2 -123.4."; let str = "-1.5 0 2 -123.4.";
alert( str.match(reg) ); // -1.5, 0, 2, -123.4 alert( str.match(regexp) ); // -1.5, 0, 2, -123.4
``` ```

View file

@ -18,9 +18,9 @@ To make each of these parts a separate element of the result array, let's enclos
In action: In action:
```js run ```js run
let reg = /(-?\d+(\.\d+)?)\s*([-+*\/])\s*(-?\d+(\.\d+)?)/; let regexp = /(-?\d+(\.\d+)?)\s*([-+*\/])\s*(-?\d+(\.\d+)?)/;
alert( "1.2 + 12".match(reg) ); alert( "1.2 + 12".match(regexp) );
``` ```
The result includes: The result includes:
@ -42,9 +42,9 @@ The final solution:
```js run ```js run
function parse(expr) { function parse(expr) {
let reg = /(-?\d+(?:\.\d+)?)\s*([-+*\/])\s*(-?\d+(?:\.\d+)?)/; let regexp = /(-?\d+(?:\.\d+)?)\s*([-+*\/])\s*(-?\d+(?:\.\d+)?)/;
let result = expr.match(reg); let result = expr.match(regexp);
if (!result) return []; if (!result) return [];
result.shift(); result.shift();

View file

@ -56,9 +56,9 @@ The email format is: `name@domain`. Any word can be the name, hyphens and dots a
The pattern: The pattern:
```js run ```js run
let reg = /[-.\w]+@([\w-]+\.)+[\w-]+/g; let regexp = /[-.\w]+@([\w-]+\.)+[\w-]+/g;
alert("my@mail.com @ his@site.com.uk".match(reg)); // my@mail.com, his@site.com.uk alert("my@mail.com @ his@site.com.uk".match(regexp)); // my@mail.com, his@site.com.uk
``` ```
That regexp is not perfect, but it mostly works and helps to catch accidental typos. The only truly reliable check for an email address is to send a message to it. That regexp is not perfect, but it mostly works and helps to catch accidental typos. The only truly reliable check for an email address is to send a message to it.
@ -110,9 +110,9 @@ In action:
```js run ```js run
let str = '<span class="my">'; let str = '<span class="my">';
let reg = /<(([a-z]+)\s*([^>]*))>/; let regexp = /<(([a-z]+)\s*([^>]*))>/;
let result = str.match(reg); let result = str.match(regexp);
alert(result[0]); // <span class="my"> alert(result[0]); // <span class="my">
alert(result[1]); // span class="my" alert(result[1]); // span class="my"
alert(result[2]); // span alert(result[2]); // span
@ -336,10 +336,10 @@ let str = "Gogogo John!";
*!* *!*
// ?: excludes 'go' from capturing // ?: excludes 'go' from capturing
let reg = /(?:go)+ (\w+)/i; let regexp = /(?:go)+ (\w+)/i;
*/!* */!*
let result = str.match(reg); let result = str.match(regexp);
alert( result[0] ); // Gogogo John (full match) alert( result[0] ); // Gogogo John (full match)
alert( result[1] ); // John alert( result[1] ); // John

View file

@ -17,10 +17,10 @@ We can put both kinds of quotes in the square brackets: `pattern:['"](.*?)['"]`,
```js run ```js run
let str = `He said: "She's the one!".`; let str = `He said: "She's the one!".`;
let reg = /['"](.*?)['"]/g; let regexp = /['"](.*?)['"]/g;
// The result is not what we'd like to have // The result is not what we'd like to have
alert( str.match(reg) ); // "She' alert( str.match(regexp) ); // "She'
``` ```
As we can see, the pattern found an opening quote `match:"`, then the text is consumed till the other quote `match:'`, that closes the match. As we can see, the pattern found an opening quote `match:"`, then the text is consumed till the other quote `match:'`, that closes the match.
@ -33,10 +33,10 @@ Here's the correct code:
let str = `He said: "She's the one!".`; let str = `He said: "She's the one!".`;
*!* *!*
let reg = /(['"])(.*?)\1/g; let regexp = /(['"])(.*?)\1/g;
*/!* */!*
alert( str.match(reg) ); // "She's the one!" alert( str.match(regexp) ); // "She's the one!"
``` ```
Now it works! The regular expression engine finds the first quote `pattern:(['"])` and memorizes its content. That's the first capturing group. Now it works! The regular expression engine finds the first quote `pattern:(['"])` and memorizes its content. That's the first capturing group.
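A short sketch of the same backreference on a string with both kinds of quotes (the sample strings are invented for illustration). The closing quote must be the same character that the first group captured, so mismatched quotes give no match.

```js
let backref = /(['"])(.*?)\1/g;

let mixedText = `He said: "She's the one!" and 'bye'.`;
let found = mixedText.match(backref);

console.log(found); // [ '"She\'s the one!"', "'bye'" ]

// Opening and closing quotes differ, so \1 never finds its pair
let mismatched = `"mixed'`.match(backref);
console.log(mismatched); // null
```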
@ -65,8 +65,8 @@ In the example below the group with quotes is named `pattern:?<quote>`, so the b
let str = `He said: "She's the one!".`; let str = `He said: "She's the one!".`;
*!* *!*
let reg = /(?<quote>['"])(.*?)\k<quote>/g; let regexp = /(?<quote>['"])(.*?)\k<quote>/g;
*/!* */!*
alert( str.match(reg) ); // "She's the one!" alert( str.match(regexp) ); // "She's the one!"
``` ```

View file

@ -4,11 +4,11 @@ The first idea can be to list the languages with `|` in-between.
But that doesn't work right: But that doesn't work right:
```js run ```js run
let reg = /Java|JavaScript|PHP|C|C\+\+/g; let regexp = /Java|JavaScript|PHP|C|C\+\+/g;
let str = "Java, JavaScript, PHP, C, C++"; let str = "Java, JavaScript, PHP, C, C++";
alert( str.match(reg) ); // Java,Java,PHP,C,C alert( str.match(regexp) ); // Java,Java,PHP,C,C
``` ```
The regular expression engine looks for alternations one-by-one. That is: first it checks if we have `match:Java`, otherwise -- looks for `match:JavaScript` and so on. The regular expression engine looks for alternations one-by-one. That is: first it checks if we have `match:Java`, otherwise -- looks for `match:JavaScript` and so on.
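One way to observe the effect of ordering is to list the longer alternatives first. This is a sketch for illustration and not necessarily the solution the text goes on to present.

```js
let langs = "Java, JavaScript, PHP, C, C++";

// Shorter alternatives first: "Java" wins before "JavaScript" is tried
let shortFirst = langs.match(/Java|JavaScript|PHP|C|C\+\+/g);
// Longer alternatives first: the full names get a chance to match
let longFirst = langs.match(/JavaScript|Java|PHP|C\+\+|C/g);

console.log(shortFirst); // [ 'Java', 'Java', 'PHP', 'C', 'C' ]
console.log(longFirst);  // [ 'Java', 'JavaScript', 'PHP', 'C', 'C++' ]
```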
@ -25,9 +25,9 @@ There are two solutions for that problem:
In action: In action:
```js run ```js run
let reg = /Java(Script)?|C(\+\+)?|PHP/g; let regexp = /Java(Script)?|C(\+\+)?|PHP/g;
let str = "Java, JavaScript, PHP, C, C++"; let str = "Java, JavaScript, PHP, C, C++";
alert( str.match(reg) ); // Java,JavaScript,PHP,C,C++ alert( str.match(regexp) ); // Java,JavaScript,PHP,C,C++
``` ```

View file

@ -5,7 +5,7 @@ There are many programming languages, for instance Java, JavaScript, PHP, C, C++
Create a regexp that finds them in the string `subject:Java JavaScript PHP C++ C`: Create a regexp that finds them in the string `subject:Java JavaScript PHP C++ C`:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
alert("Java JavaScript PHP C++ C".match(reg)); // Java JavaScript PHP C++ C alert("Java JavaScript PHP C++ C".match(regexp)); // Java JavaScript PHP C++ C
``` ```

View file

@ -8,7 +8,7 @@ The full pattern: `pattern:\[(b|url|quote)\].*?\[/\1\]`.
In action: In action:
```js run ```js run
let reg = /\[(b|url|quote)\].*?\[\/\1\]/gs; let regexp = /\[(b|url|quote)\].*?\[\/\1\]/gs;
let str = ` let str = `
[b]hello![/b] [b]hello![/b]
@ -17,7 +17,7 @@ let str = `
[/quote] [/quote]
`; `;
alert( str.match(reg) ); // [b]hello![/b],[quote][url]http://google.com[/url][/quote] alert( str.match(regexp) ); // [b]hello![/b],[quote][url]http://google.com[/url][/quote]
``` ```
Please note that we had to escape a slash for the closing tag `pattern:[/\1]`, because normally the slash closes the pattern. Please note that we had to escape a slash for the closing tag `pattern:[/\1]`, because normally the slash closes the pattern.

View file

@ -32,17 +32,17 @@ Create a regexp to find all BB-tags with their contents.
For instance: For instance:
```js ```js
let reg = /your regexp/flags; let regexp = /your regexp/flags;
let str = "..[url]http://google.com[/url].."; let str = "..[url]http://google.com[/url]..";
alert( str.match(reg) ); // [url]http://google.com[/url] alert( str.match(regexp) ); // [url]http://google.com[/url]
``` ```
If tags are nested, then we need the outer tag (if we want we can continue the search in its content): If tags are nested, then we need the outer tag (if we want we can continue the search in its content):
```js ```js
let reg = /your regexp/flags; let regexp = /your regexp/flags;
let str = "..[url][b]http://google.com[/b][/url].."; let str = "..[url][b]http://google.com[/b][/url]..";
alert( str.match(reg) ); // [url][b]http://google.com[/b][/url] alert( str.match(regexp) ); // [url][b]http://google.com[/b][/url]
``` ```

View file

@ -10,8 +10,8 @@ Step by step:
In action: In action:
```js run ```js run
let reg = /"(\\.|[^"\\])*"/g; let regexp = /"(\\.|[^"\\])*"/g;
let str = ' .. "test me" .. "Say \\"Hello\\"!" .. "\\\\ \\"" .. '; let str = ' .. "test me" .. "Say \\"Hello\\"!" .. "\\\\ \\"" .. ';
alert( str.match(reg) ); // "test me","Say \"Hello\"!","\\ \"" alert( str.match(regexp) ); // "test me","Say \"Hello\"!","\\ \""
``` ```

View file

@ -10,7 +10,7 @@ In the regexp language: `pattern:<style(>|\s.*?>)`.
In action: In action:
```js run ```js run
let reg = /<style(>|\s.*?>)/g; let regexp = /<style(>|\s.*?>)/g;
alert( '<style> <styler> <style test="...">'.match(reg) ); // <style>, <style test="..."> alert( '<style> <styler> <style test="...">'.match(regexp) ); // <style>, <style test="...">
``` ```

View file

@ -7,7 +7,7 @@ Write a regexp to find the tag `<style...>`. It should match the full tag: it ma
For instance: For instance:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
alert( '<style> <styler> <style test="...">'.match(reg) ); // <style>, <style test="..."> alert( '<style> <styler> <style test="...">'.match(regexp) ); // <style>, <style test="...">
``` ```

View file

@ -11,11 +11,11 @@ The corresponding regexp: `pattern:html|php|java(script)?`.
A usage example: A usage example:
```js run ```js run
let reg = /html|php|css|java(script)?/gi; let regexp = /html|php|css|java(script)?/gi;
let str = "First HTML appeared, then CSS, then JavaScript"; let str = "First HTML appeared, then CSS, then JavaScript";
alert( str.match(reg) ); // 'HTML', 'CSS', 'JavaScript' alert( str.match(regexp) ); // 'HTML', 'CSS', 'JavaScript'
``` ```
We already saw a similar thing -- square brackets. They allow choosing between multiple characters, for instance `pattern:gr[ae]y` matches `match:gray` or `match:grey`. We already saw a similar thing -- square brackets. They allow choosing between multiple characters, for instance `pattern:gr[ae]y` matches `match:gray` or `match:grey`.
@ -64,7 +64,7 @@ But that's wrong, the alternation should only be used in the "hours" part of the
The final solution: The final solution:
```js run ```js run
let reg = /([01]\d|2[0-3]):[0-5]\d/g; let regexp = /([01]\d|2[0-3]):[0-5]\d/g;
alert("00:00 10:10 23:59 25:99 1:2".match(reg)); // 00:00,10:10,23:59 alert("00:00 10:10 23:59 25:99 1:2".match(regexp)); // 00:00,10:10,23:59
``` ```

View file

@ -6,11 +6,11 @@ We can exclude negatives by prepending it with the negative lookbehind: `pattern:
Although, if we try it now, we may notice one more "extra" result: Although, if we try it now, we may notice one more "extra" result:
```js run ```js run
let reg = /(?<!-)\d+/g; let regexp = /(?<!-)\d+/g;
let str = "0 12 -5 123 -18"; let str = "0 12 -5 123 -18";
console.log( str.match(reg) ); // 0, 12, 123, *!*8*/!* console.log( str.match(regexp) ); // 0, 12, 123, *!*8*/!*
``` ```
As you can see, it matches `match:8`, from `subject:-18`. To exclude it, we need to ensure that the regexp starts matching a number not from the middle of another (non-matching) number. As you can see, it matches `match:8`, from `subject:-18`. To exclude it, we need to ensure that the regexp starts matching a number not from the middle of another (non-matching) number.
@ -20,9 +20,9 @@ We can do it by specifying another negative lookbehind: `pattern:(?<!-)(?<!\d)\d
We can also join them into a single lookbehind here: We can also join them into a single lookbehind here:
```js run ```js run
let reg = /(?<![-\d])\d+/g; let regexp = /(?<![-\d])\d+/g;
let str = "0 12 -5 123 -18"; let str = "0 12 -5 123 -18";
alert( str.match(reg) ); // 0, 12, 123 alert( str.match(regexp) ); // 0, 12, 123
``` ```

View file

@ -6,9 +6,9 @@ Create a regexp that looks for only non-negative ones (zero is allowed).
An example of use: An example of use:
```js ```js
let reg = /your regexp/g; let regexp = /your regexp/g;
let str = "0 12 -5 123 -18"; let str = "0 12 -5 123 -18";
alert( str.match(reg) ); // 0, 12, 123 alert( str.match(regexp) ); // 0, 12, 123
``` ```

View file

@ -7,7 +7,7 @@
For instance: For instance:
```js ```js
let reg = /your regexp/; let regexp = /your regexp/;
let str = ` let str = `
<html> <html>
@ -17,7 +17,7 @@ let str = `
</html> </html>
`; `;
str = str.replace(reg, `<h1>Hello</h1>`); str = str.replace(regexp, `<h1>Hello</h1>`);
``` ```
After that, the value of `str`: After that, the value of `str`:

View file

@ -96,18 +96,18 @@ In the example below the currency sign `pattern:(€|kr)` is captured, along wit
```js run ```js run
let str = "1 turkey costs 30€"; let str = "1 turkey costs 30€";
let reg = /\d+(?=(€|kr))/; // extra parentheses around €|kr let regexp = /\d+(?=(€|kr))/; // extra parentheses around €|kr
alert( str.match(reg) ); // 30, € alert( str.match(regexp) ); // 30, €
``` ```
And here's the same for lookbehind: And here's the same for lookbehind:
```js run ```js run
let str = "1 turkey costs $30"; let str = "1 turkey costs $30";
let reg = /(?<=(\$|£))\d+/; let regexp = /(?<=(\$|£))\d+/;
alert( str.match(reg) ); // 30, $ alert( str.match(regexp) ); // 30, $
``` ```
## Summary ## Summary

View file

@ -19,10 +19,10 @@ We'll use a regexp `pattern:^(\w+\s?)*$`, it specifies 0 or more such words.
In action: In action:
```js run ```js run
let reg = /^(\w+\s?)*$/; let regexp = /^(\w+\s?)*$/;
alert( reg.test("A good string") ); // true alert( regexp.test("A good string") ); // true
alert( reg.test("Bad characters: $@#") ); // false alert( regexp.test("Bad characters: $@#") ); // false
``` ```
It seems to work. The result is correct. However, on certain strings it takes a very long time, so long that the JavaScript engine "hangs" with 100% CPU consumption. It seems to work. The result is correct. However, on certain strings it takes a very long time, so long that the JavaScript engine "hangs" with 100% CPU consumption.
@ -30,11 +30,11 @@ It seems to work. The result is correct. Although, on certain strings it takes a
If you run the example below, you probably won't see anything, as JavaScript will just "hang". The web browser will stop reacting to events, and the UI will stop working. After some time it will suggest reloading the page. So be careful with this: If you run the example below, you probably won't see anything, as JavaScript will just "hang". The web browser will stop reacting to events, and the UI will stop working. After some time it will suggest reloading the page. So be careful with this:
```js run ```js run
let reg = /^(\w+\s?)*$/; let regexp = /^(\w+\s?)*$/;
let str = "An input string that takes a long time or even makes this regexp to hang!"; let str = "An input string that takes a long time or even makes this regexp to hang!";
// will take a very long time // will take a very long time
alert( reg.test(str) ); alert( regexp.test(str) );
``` ```
Some regular expression engines can handle such a search, but most of them can't. Some regular expression engines can handle such a search, but most of them can't.
@ -50,12 +50,12 @@ And, to make things more obvious, let's replace `pattern:\w` with `pattern:\d`.
<!-- let str = `AnInputStringThatMakesItHang!`; --> <!-- let str = `AnInputStringThatMakesItHang!`; -->
```js run ```js run
let reg = /^(\d+)*$/; let regexp = /^(\d+)*$/;
let str = "012345678901234567890123456789!"; let str = "012345678901234567890123456789!";
// will take a very long time // will take a very long time
alert( reg.test(str) ); alert( regexp.test(str) );
``` ```
So what's wrong with the regexp? So what's wrong with the regexp?
@ -189,10 +189,10 @@ Let's rewrite the regular expression as `pattern:^(\w+\s)*\w*` - we'll look for
This regexp is equivalent to the previous one (matches the same) and works well: This regexp is equivalent to the previous one (matches the same) and works well:
```js run ```js run
let reg = /^(\w+\s)*\w*$/; let regexp = /^(\w+\s)*\w*$/;
let str = "An input string that takes a long time or even makes this regex to hang!"; let str = "An input string that takes a long time or even makes this regex to hang!";
alert( reg.test(str) ); // false alert( regexp.test(str) ); // false
``` ```
Why did the problem disappear? Why did the problem disappear?
@ -272,26 +272,26 @@ There's more about the relation between possessive quantifiers and lookahead in
Let's rewrite the first example using lookahead to prevent backtracking: Let's rewrite the first example using lookahead to prevent backtracking:
```js run ```js run
let reg = /^((?=(\w+))\2\s?)*$/; let regexp = /^((?=(\w+))\2\s?)*$/;
alert( reg.test("A good string") ); // true alert( regexp.test("A good string") ); // true
let str = "An input string that takes a long time or even makes this regex to hang!"; let str = "An input string that takes a long time or even makes this regex to hang!";
alert( reg.test(str) ); // false, works and fast! alert( regexp.test(str) ); // false, works and fast!
``` ```
Here `pattern:\2` is used instead of `pattern:\1`, because there are additional outer parentheses. To avoid mixing up the numbers, we can give the parentheses a name, e.g. `pattern:(?<word>\w+)`. Here `pattern:\2` is used instead of `pattern:\1`, because there are additional outer parentheses. To avoid mixing up the numbers, we can give the parentheses a name, e.g. `pattern:(?<word>\w+)`.
```js run ```js run
// parentheses are named ?<word>, referenced as \k<word> // parentheses are named ?<word>, referenced as \k<word>
let reg = /^((?=(?<word>\w+))\k<word>\s?)*$/; let regexp = /^((?=(?<word>\w+))\k<word>\s?)*$/;
let str = "An input string that takes a long time or even makes this regex to hang!"; let str = "An input string that takes a long time or even makes this regex to hang!";
alert( reg.test(str) ); // false alert( regexp.test(str) ); // false
alert( reg.test("A correct string") ); // true alert( regexp.test("A correct string") ); // true
``` ```
The problem described in this article is called "catastrophic backtracking". The problem described in this article is called "catastrophic backtracking".


@ -71,9 +71,9 @@ Usage example:
```js run ```js run
let str = '<h1>Hello, world!</h1>'; let str = '<h1>Hello, world!</h1>';
let reg = /<(.*?)>/g; let regexp = /<(.*?)>/g;
let matchAll = str.matchAll(reg); let matchAll = str.matchAll(regexp);
alert(matchAll); // [object RegExp String Iterator], not array, but an iterable alert(matchAll); // [object RegExp String Iterator], not array, but an iterable
@ -118,7 +118,7 @@ alert( str.search( /ink/i ) ); // 10 (first match position)
If we need positions of further matches, we should use other means, such as finding them all with `str.matchAll(regexp)`. If we need positions of further matches, we should use other means, such as finding them all with `str.matchAll(regexp)`.
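As a minimal sketch of that idea, the snippet below collects the position of every match with `str.matchAll(regexp)`. The helper name `findAllPositions` and the sample string are assumptions made for illustration; they are not part of the original text.

```js
// Hypothetical helper (illustration only): returns the index of every match.
function findAllPositions(str, regexp) {
  // matchAll requires the "g" flag and yields match objects,
  // each of which carries its position in the .index property
  return Array.from(str.matchAll(regexp), match => match.index);
}

let positions = findAllPositions("A drop of ink may make a million think", /ink/gi);
console.log(positions); // [10, 35]
```

Unlike `str.search`, which reports only the first position, this gives one index per match.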
## str.replace(str|reg, str|func) ## str.replace(str|regexp, str|func)
This is a generic method for searching and replacing, one of the most useful ones. The Swiss Army knife for searching and replacing. This is a generic method for searching and replacing, one of the most useful ones. The Swiss Army knife for searching and replacing.
@ -238,7 +238,7 @@ The method `regexp.exec(str)` method returns a match for `regexp` in the string
It behaves differently depending on whether the regexp has flag `pattern:g`. It behaves differently depending on whether the regexp has flag `pattern:g`.
If there's no `pattern:g`, then `regexp.exec(str)` returns the first match exactly as `str.match(reg)`. This behavior doesn't bring anything new. If there's no `pattern:g`, then `regexp.exec(str)` returns the first match exactly as `str.match(regexp)`. This behavior doesn't bring anything new.
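A short sketch of that equivalence, assuming a sample string and pattern chosen only for illustration: without `pattern:g`, both calls return the full match, the groups and the `index`.

```js
let str = "I love JavaScript";
let regexp = /Java(Script)/; // no "g" flag

let viaExec = regexp.exec(str);
let viaMatch = str.match(regexp);

// Both results hold the full match, the captured group and the position
console.log(viaExec[0], viaExec[1], viaExec.index);    // JavaScript Script 7
console.log(viaMatch[0], viaMatch[1], viaMatch.index); // JavaScript Script 7
```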
But if there's flag `pattern:g`, then: But if there's flag `pattern:g`, then:
- A call to `regexp.exec(str)` returns the first match and saves the position immediately after it in the property `regexp.lastIndex`. - A call to `regexp.exec(str)` returns the first match and saves the position immediately after it in the property `regexp.lastIndex`.
@ -272,7 +272,7 @@ For instance:
```js run ```js run
let str = 'Hello, world!'; let str = 'Hello, world!';
let reg = /\w+/g; // without flag "g", lastIndex property is ignored let regexp = /\w+/g; // without flag "g", lastIndex property is ignored
regexp.lastIndex = 5; // search from 5th position (from the comma) regexp.lastIndex = 5; // search from 5th position (from the comma)
alert( regexp.exec(str) ); // world alert( regexp.exec(str) ); // world
@ -285,7 +285,7 @@ Let's replace flag `pattern:g` with `pattern:y` in the example above. There will
```js run ```js run
let str = 'Hello, world!'; let str = 'Hello, world!';
let reg = /\w+/y; let regexp = /\w+/y;
regexp.lastIndex = 5; // search exactly at position 5 regexp.lastIndex = 5; // search exactly at position 5
alert( regexp.exec(str) ); // null alert( regexp.exec(str) ); // null