components
This commit is contained in:
parent
304d578b54
commit
6fb4aabcba
344 changed files with 669 additions and 406 deletions
|
@ -0,0 +1,29 @@
|
|||
A regexp to search for a 3-digit color `#abc`: `pattern:/#[a-f0-9]{3}/i`.
|
||||
|
||||
We can add exactly 3 more optional hex digits. We don't need more or less. Either we have them or we don't.
|
||||
|
||||
The simplest way to add them is to append them to the regexp: `pattern:/#[a-f0-9]{3}([a-f0-9]{3})?/i`
|
||||
|
||||
We can do it in a smarter way though: `pattern:/#([a-f0-9]{3}){1,2}/i`.
|
||||
|
||||
Here the regexp `pattern:[a-f0-9]{3}` is in parentheses to apply the quantifier `pattern:{1,2}` to it as a whole.
|
||||
|
||||
In action:
|
||||
|
||||
```js run
|
||||
let reg = /#([a-f0-9]{3}){1,2}/gi;
|
||||
|
||||
let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
|
||||
|
||||
alert( str.match(reg) ); // #3f3 #AA00ef #abc
|
||||
```
|
||||
|
||||
There's a minor problem here: the pattern found `match:#abc` in `subject:#abcd`. To prevent that we can add `pattern:\b` to the end:
|
||||
|
||||
```js run
|
||||
let reg = /#([a-f0-9]{3}){1,2}\b/gi;
|
||||
|
||||
let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
|
||||
|
||||
alert( str.match(reg) ); // #3f3 #AA00ef
|
||||
```
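As a quick check, both variants of the pattern (the appended optional group and the `pattern:{1,2}` quantifier applied to the group) should find the same colors once `pattern:\b` is added. A minimal sketch; the names `appended` and `grouped` are only illustrative, and `console.log` is used instead of `alert` so that it also runs outside the browser:

```js
const str = "color: #3f3; background-color: #AA00ef; and: #abcd";

// variant 1: three hex digits plus an optional second triple
const appended = /#[a-f0-9]{3}([a-f0-9]{3})?\b/gi;
// variant 2: one or two triples of hex digits
const grouped = /#([a-f0-9]{3}){1,2}\b/gi;

const foundAppended = str.match(appended);
const foundGrouped = str.match(grouped);

console.log(foundAppended); // [ '#3f3', '#AA00ef' ]
console.log(foundGrouped);  // [ '#3f3', '#AA00ef' ]
```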
|
|
@ -0,0 +1,14 @@
|
|||
# Find color in the format #abc or #abcdef
|
||||
|
||||
Write a RegExp that matches colors in the format `#abc` or `#abcdef`. That is: `#` followed by 3 or 6 hexadecimal digits.
|
||||
|
||||
Usage example:
|
||||
```js
|
||||
let reg = /your regexp/g;
|
||||
|
||||
let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
|
||||
|
||||
alert( str.match(reg) ); // #3f3 #AA00ef
|
||||
```
|
||||
|
||||
P.S. This should be exactly 3 or 6 hex digits: values like `#abcd` should not match.
|
|
@ -0,0 +1,18 @@
|
|||
|
||||
A non-negative integer is `pattern:\d+`. We should exclude `0` as the first digit, because we don't need zero, but we can allow it in the following digits.
|
||||
|
||||
So that gives us `pattern:[1-9]\d*`.
|
||||
|
||||
A decimal part is: `pattern:\.\d+`.
|
||||
|
||||
Because the decimal part is optional, let's put it in parentheses with the quantifier `pattern:?`.
|
||||
|
||||
Finally we have the regexp: `pattern:[1-9]\d*(\.\d+)?`:
|
||||
|
||||
```js run
|
||||
let reg = /[1-9]\d*(\.\d+)?/g;
|
||||
|
||||
let str = "1.5 0 -5 12. 123.4.";
|
||||
|
||||
alert( str.match(reg) ); // 1.5, 5, 12, 123.4 (0 is skipped, but the 5 from -5 is found too)
|
||||
```
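Note that `pattern:[1-9]\d*(\.\d+)?` on its own also finds the `match:5` inside `subject:-5`. One possible way to skip it, sketched here under the assumption that lookbehind is available (it is part of ES2018, and older engines may not support it), is to require that the number does not follow a minus, a digit or a dot. The name `positive` is only illustrative:

```js
// (?<![-\d.]) is a negative lookbehind:
// the match must not come right after "-", a digit or "."
const positive = /(?<![-\d.])[1-9]\d*(\.\d+)?/g;

const numbers = "1.5 0 -5 12. 123.4.".match(positive);

console.log(numbers); // [ '1.5', '12', '123.4' ]
```

The digit and the dot are in the lookbehind so that the search cannot start in the middle of a number such as `subject:-15` or `subject:0.5`.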
|
|
@ -0,0 +1,12 @@
|
|||
# Find positive numbers
|
||||
|
||||
Create a regexp that looks for positive numbers, including those without a decimal point.
|
||||
|
||||
An example of use:
|
||||
```js
|
||||
let reg = /your regexp/g;
|
||||
|
||||
let str = "1.5 0 -5 12. 123.4.";
|
||||
|
||||
alert( str.match(reg) ); // 1.5, 12, 123.4 (ignores 0 and -5)
|
||||
```
|
|
@ -0,0 +1,11 @@
|
|||
A positive number with an optional decimal part is (per previous task): `pattern:\d+(\.\d+)?`.
|
||||
|
||||
Let's add an optional `-` at the beginning:
|
||||
|
||||
```js run
|
||||
let reg = /-?\d+(\.\d+)?/g;
|
||||
|
||||
let str = "-1.5 0 2 -123.4.";
|
||||
|
||||
alert( str.match(reg) ); // -1.5, 0, 2, -123.4
|
||||
```
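The matches are strings. If actual numbers are needed, they can be converted afterwards. A minimal sketch (the variable name `values` is an assumption of this example; `console.log` is used so that it also runs outside the browser):

```js
const reg = /-?\d+(\.\d+)?/g;
const str = "-1.5 0 2 -123.4.";

// match() returns null when nothing is found, so fall back to an empty array;
// Number converts each matched string into a number
const values = (str.match(reg) || []).map(Number);

console.log(values); // [ -1.5, 0, 2, -123.4 ]
```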
|
|
@ -0,0 +1,13 @@
|
|||
# Find all numbers
|
||||
|
||||
Write a regexp that looks for all decimal numbers, including integers, floating-point numbers and negative numbers.
|
||||
|
||||
An example of use:
|
||||
|
||||
```js
|
||||
let reg = /your regexp/g;
|
||||
|
||||
let str = "-1.5 0 2 -123.4.";
|
||||
|
||||
alert( str.match(reg) ); // -1.5, 0, 2, -123.4
|
||||
```
|
|
@ -0,0 +1,51 @@
|
|||
A regexp for a number is: `pattern:-?\d+(\.\d+)?`. We created it in previous tasks.
|
||||
|
||||
An operator is `pattern:[-+*/]`. We put the dash `pattern:-` first, because in the middle it would mean a character range, which we don't need.
|
||||
|
||||
Note that a slash should be escaped inside a JavaScript regexp `pattern:/.../`.
|
||||
|
||||
We need a number, an operator, and then another number. And optional spaces between them.
|
||||
|
||||
The full regular expression: `pattern:-?\d+(\.\d+)?\s*[-+*/]\s*-?\d+(\.\d+)?`.
|
||||
|
||||
To get a result as an array let's put parentheses around the data that we need: numbers and the operator: `pattern:(-?\d+(\.\d+)?)\s*([-+*/])\s*(-?\d+(\.\d+)?)`.
|
||||
|
||||
In action:
|
||||
|
||||
```js run
|
||||
let reg = /(-?\d+(\.\d+)?)\s*([-+*\/])\s*(-?\d+(\.\d+)?)/;
|
||||
|
||||
alert( "1.2 + 12".match(reg) );
|
||||
```
|
||||
|
||||
The result includes:
|
||||
|
||||
- `result[0] == "1.2 + 12"` (full match)
|
||||
- `result[1] == "1.2"` (first group `(-?\d+(\.\d+)?)` -- the first number, including the decimal part)
|
||||
- `result[2] == ".2"` (second group `(\.\d+)?` -- the first decimal part)
|
||||
- `result[3] == "+"` (third group `([-+*\/])` -- the operator)
|
||||
- `result[4] == "12"` (fourth group `(-?\d+(\.\d+)?)` -- the second number)
|
||||
- `result[5] == undefined` (fifth group `(\.\d+)?` -- the last decimal part is absent, so it's undefined)
|
||||
|
||||
We only want the numbers and the operator, without the full match or the decimal parts.
|
||||
|
||||
The full match (the array's first item) can be removed by shifting the array: `pattern:result.shift()`.
|
||||
|
||||
The decimal groups can be removed by making them into non-capturing groups, by adding `pattern:?:` to the beginning: `pattern:(?:\.\d+)?`.
|
||||
|
||||
The final solution:
|
||||
|
||||
```js run
|
||||
function parse(expr) {
|
||||
let reg = /(-?\d+(?:\.\d+)?)\s*([-+*\/])\s*(-?\d+(?:\.\d+)?)/;
|
||||
|
||||
let result = expr.match(reg);
|
||||
|
||||
if (!result) return [];
|
||||
result.shift();
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
alert( parse("-1.23 * 3.45") ); // -1.23, *, 3.45
|
||||
```
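To see how the parsed parts could be used, here is a minimal sketch of a calculator on top of `parse`. The helper `calculate` is hypothetical and not part of the task; `parse` is repeated so that the example is self-contained, and `console.log` replaces `alert` so that it also runs outside the browser:

```js
// parse() as in the solution
function parse(expr) {
  const reg = /(-?\d+(?:\.\d+)?)\s*([-+*\/])\s*(-?\d+(?:\.\d+)?)/;

  const result = expr.match(reg);

  if (!result) return [];
  result.shift();

  return result;
}

// hypothetical helper: evaluates the parsed expression
function calculate(expr) {
  const [a, op, b] = parse(expr);
  if (op === undefined) return NaN; // nothing was parsed

  // the groups are strings, so convert them first
  const x = Number(a);
  const y = Number(b);

  switch (op) {
    case "+": return x + y;
    case "-": return x - y;
    case "*": return x * y;
    case "/": return x / y;
  }
}

console.log( calculate("1 + 2") );     // 3
console.log( calculate(" -3 / -6 ") ); // 0.5
console.log( calculate("-2 - 2") );    // -4
console.log( calculate("hello") );     // NaN
```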
|
|
@ -0,0 +1,28 @@
|
|||
# Parse an expression
|
||||
|
||||
An arithmetical expression consists of 2 numbers and an operator between them, for instance:
|
||||
|
||||
- `1 + 2`
|
||||
- `1.2 * 3.4`
|
||||
- `-3 / -6`
|
||||
- `-2 - 2`
|
||||
|
||||
The operator is one of: `"+"`, `"-"`, `"*"` or `"/"`.
|
||||
|
||||
There may be extra spaces at the beginning, at the end or between the parts.
|
||||
|
||||
Create a function `parse(expr)` that takes an expression and returns an array of 3 items:
|
||||
|
||||
1. The first number.
|
||||
2. The operator.
|
||||
3. The second number.
|
||||
|
||||
For example:
|
||||
|
||||
```js
|
||||
let [a, op, b] = parse("1.2 * 3.4");
|
||||
|
||||
alert(a); // 1.2
|
||||
alert(op); // *
|
||||
alert(b); // 3.4
|
||||
```
|
237
9-regular-expressions/09-regexp-groups/article.md
Normal file
|
@ -0,0 +1,237 @@
|
|||
# Capturing groups
|
||||
|
||||
A part of a pattern can be enclosed in parentheses `pattern:(...)`. This is called a "capturing group".
|
||||
|
||||
That has two effects:
|
||||
|
||||
1. It allows us to get a part of the match as a separate item in the result array.
|
||||
2. If we put a quantifier after the parentheses, it applies to the parentheses as a whole, not the last character.
|
||||
|
||||
## Example
|
||||
|
||||
In the example below the pattern `pattern:(go)+` finds one or more `match:'go'`:
|
||||
|
||||
```js run
|
||||
alert( 'Gogogo now!'.match(/(go)+/i) ); // Gogogo,go (the full match, then the group)
|
||||
```
|
||||
|
||||
Without parentheses, the pattern `pattern:/go+/` means `subject:g`, followed by `subject:o` repeated one or more times. For instance, `match:goooo` or `match:gooooooooo`.
|
||||
|
||||
Parentheses group the word `pattern:(go)` together.
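The difference between the two patterns can be seen side by side. A minimal sketch; the names `withGroup` and `withoutGroup` are only illustrative, and `console.log` is used instead of `alert` so that it also runs outside the browser:

```js
const withGroup = /(go)+/i;   // "go" repeated one or more times
const withoutGroup = /go+/i;  // "g", then "o" repeated one or more times

const samples = ['Gogogo now!', 'Gooooo now!'];

for (const sample of samples) {
  // [0] is the full match
  console.log(sample.match(withGroup)[0], '|', sample.match(withoutGroup)[0]);
}
// Gogogo | Go
// Go | Gooooo
```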
|
||||
|
||||
Let's make something more complex -- a regexp to match an email.
|
||||
|
||||
Examples of emails:
|
||||
|
||||
```
|
||||
my@mail.com
|
||||
john.smith@site.com.uk
|
||||
```
|
||||
|
||||
The pattern: `pattern:[-.\w]+@([\w-]+\.)+[\w-]{2,20}`.
|
||||
|
||||
1. The first part `pattern:[-.\w]+` (before `@`) may include any alphanumeric word characters, a dot and a dash, to match `match:john.smith`.
|
||||
2. Then `pattern:@`, and the domain. It may be a subdomain like `host.site.com.uk`, so we match it as "a word followed by a dot" `pattern:([\w-]+\.)` (repeated), and then the last part must be a word: `match:com` or `match:uk` (but not very long: 2-20 characters).
|
||||
|
||||
That regexp is not perfect, but it is usually good enough to catch errors or accidental mistypes.
|
||||
|
||||
For instance, we can find all emails in the string:
|
||||
|
||||
```js run
|
||||
let reg = /[-.\w]+@([\w-]+\.)+[\w-]{2,20}/g;
|
||||
|
||||
alert("my@mail.com @ his@site.com.uk".match(reg)); // my@mail.com, his@site.com.uk
|
||||
```
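The same pattern could also be used to check a single value rather than to search in a text. A minimal sketch, assuming we anchor the pattern with `pattern:^` and `pattern:$` so that the whole string must match; the helper `looksLikeEmail` is hypothetical:

```js
// the email pattern, anchored to the start and the end of the string
const emailCheck = /^[-.\w]+@([\w-]+\.)+[\w-]{2,20}$/;

// hypothetical helper: true if the whole string looks like an email
function looksLikeEmail(str) {
  return emailCheck.test(str);
}

console.log( looksLikeEmail("my@mail.com") );            // true
console.log( looksLikeEmail("john.smith@site.com.uk") ); // true
console.log( looksLikeEmail("my@mail") );                // false, no dot in the domain
console.log( looksLikeEmail("@site.com") );              // false, nothing before @
```

Just like the original pattern, this is a rough check rather than a complete validation.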
|
||||
|
||||
In this example parentheses were used to make a group for repeating `pattern:(...)+`. But there are other uses too, let's see them.
|
||||
|
||||
## Contents of parentheses
|
||||
|
||||
Parentheses are numbered from left to right. The regexp engine remembers the content matched by each of them and allows us to reference it in the pattern or in the replacement string.
|
||||
|
||||
For instance, we'd like to find HTML tags `pattern:<.*?>`, and process them.
|
||||
|
||||
Let's wrap the inner content into parentheses, like this: `pattern:<(.*?)>`.
|
||||
|
||||
We'll get them into an array:
|
||||
|
||||
```js run
|
||||
let str = '<h1>Hello, world!</h1>';
|
||||
let reg = /<(.*?)>/;
|
||||
|
||||
alert( str.match(reg) ); // Array: ["<h1>", "h1"]
|
||||
```
|
||||
|
||||
The call to [String#match](mdn:js/String/match) returns groups only if the regexp has no `pattern:/.../g` flag.
|
||||
|
||||
If we need all matches with their groups then we can use `.matchAll` or `regexp.exec` as described in <info:regexp-methods>:
|
||||
|
||||
```js run
|
||||
let str = '<h1>Hello, world!</h1>';
|
||||
|
||||
// two matches: opening <h1> and closing </h1> tags
|
||||
let reg = /<(.*?)>/g;
|
||||
|
||||
let matches = Array.from( str.matchAll(reg) );
|
||||
|
||||
alert(matches[0]); // Array: ["<h1>", "h1"]
|
||||
alert(matches[1]); // Array: ["</h1>", "/h1"]
|
||||
```
|
||||
|
||||
Here we have two matches for `pattern:<(.*?)>`, each of them is an array with the full match and groups.
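The same result can be obtained with `regexp.exec` called in a loop. A minimal sketch; the `tags` array and its property names are assumptions of this example, and `console.log` is used so that it also runs outside the browser:

```js
const str = '<h1>Hello, world!</h1>';
const reg = /<(.*?)>/g; // the g flag is required, otherwise the loop never advances

const tags = [];
let match;

// each exec call continues from reg.lastIndex and returns null when there are no more matches
while ((match = reg.exec(str)) !== null) {
  tags.push({ full: match[0], inner: match[1], index: match.index });
}

console.log(tags[0]); // { full: '<h1>', inner: 'h1', index: 0 }
console.log(tags[1]); // { full: '</h1>', inner: '/h1', index: 17 }
```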
|
||||
|
||||
## Nested groups
|
||||
|
||||
Parentheses can be nested. In this case the numbering also goes from left to right.
|
||||
|
||||
For instance, when searching a tag in `subject:<span class="my">` we may be interested in:
|
||||
|
||||
1. The tag content as a whole: `match:span class="my"`.
|
||||
2. The tag name: `match:span`.
|
||||
3. The tag attributes: `match:class="my"`.
|
||||
|
||||
Let's add parentheses for them:
|
||||
|
||||
```js run
|
||||
let str = '<span class="my">';
|
||||
|
||||
let reg = /<(([a-z]+)\s*([^>]*))>/;
|
||||
|
||||
let result = str.match(reg);
|
||||
alert(result); // <span class="my">, span class="my", span, class="my"
|
||||
```
|
||||
|
||||
Here's how groups look:
|
||||
|
||||

|
||||
|
||||
The zero index of `result` always holds the full match.
|
||||
|
||||
Then groups, numbered from left to right. Whichever opens first gives the first group `result[1]`. Here it encloses the whole tag content.
|
||||
|
||||
Then `result[2]` holds the group from the second opening `pattern:(` to the corresponding `pattern:)` -- the tag name. We don't group the spaces, but we do group the attributes, which go into `result[3]`.
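Because the numbering is fixed, the groups can be picked out by position with array destructuring. A minimal sketch; the variable names are only illustrative, and `console.log` is used so that it also runs outside the browser:

```js
const reg = /<(([a-z]+)\s*([^>]*))>/;

// skip the full match (index 0), then take the groups in order
const [, content, name, attributes] = '<span class="my">'.match(reg);

console.log(content);    // span class="my"
console.log(name);       // span
console.log(attributes); // class="my"

// a tag without attributes: the third group still takes part in the match,
// it just matches an empty string
const [, , boldName, boldAttributes] = '<b>'.match(reg);

console.log(boldName);              // b
console.log(boldAttributes === ""); // true
```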
|
||||
|
||||
**If a group is optional and doesn't exist in the match, the corresponding `result` index is present (and equals `undefined`).**
|
||||
|
||||
For instance, let's consider the regexp `pattern:a(z)?(c)?`. It looks for `"a"` optionally followed by `"z"` optionally followed by `"c"`.
|
||||
|
||||
If we run it on the string with a single letter `subject:a`, then the result is:
|
||||
|
||||
```js run
|
||||
let match = 'a'.match(/a(z)?(c)?/);
|
||||
|
||||
alert( match.length ); // 3
|
||||
alert( match[0] ); // a (whole match)
|
||||
alert( match[1] ); // undefined
|
||||
alert( match[2] ); // undefined
|
||||
```
|
||||
|
||||
The array has the length of `3`, but all groups are empty.
|
||||
|
||||
And here's a more complex match for the string `subject:ack`:
|
||||
|
||||
```js run
|
||||
let match = 'ack'.match(/a(z)?(c)?/);
|
||||
|
||||
alert( match.length ); // 3
|
||||
alert( match[0] ); // ac (whole match)
|
||||
alert( match[1] ); // undefined, because there's nothing for (z)?
|
||||
alert( match[2] ); // c
|
||||
```
|
||||
|
||||
The array length is the same: `3`. But there's nothing for the group `pattern:(z)?`, so the result is `["ac", undefined, "c"]`.
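Since a missing group is `undefined`, destructuring defaults can supply a fallback value. A minimal sketch; the fallback strings are arbitrary, and `console.log` is used so that it also runs outside the browser:

```js
// defaults are applied only when the item is undefined
const [whole, z = "(no z)", c = "(no c)"] = 'ack'.match(/a(z)?(c)?/);

console.log(whole); // ac
console.log(z);     // (no z)
console.log(c);     // c
```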
|
||||
|
||||
## Named groups
|
||||
|
||||
Remembering groups by their numbers is hard. For simple patterns it's doable, but for more complex ones we can give names to parentheses.
|
||||
|
||||
That's done by putting `pattern:?<name>` immediately after the opening paren, like this:
|
||||
|
||||
```js run
|
||||
*!*
|
||||
let dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;
|
||||
*/!*
|
||||
let str = "2019-04-30";
|
||||
|
||||
let groups = str.match(dateRegexp).groups;
|
||||
|
||||
alert(groups.year); // 2019
|
||||
alert(groups.month); // 04
|
||||
alert(groups.day); // 30
|
||||
```
|
||||
|
||||
As you can see, the groups reside in the `.groups` property of the match.
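Named groups are also available on every match returned by `matchAll`. A minimal sketch, assuming a string with two dates; the `dates` array is only illustrative, and `console.log` is used so that it also runs outside the browser:

```js
// the g flag is required for matchAll
const dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/g;

const str = "2019-10-30 2020-01-01";

const dates = [];

for (const match of str.matchAll(dateRegexp)) {
  // each match has its own .groups object
  const { year, month, day } = match.groups;
  dates.push({ year, month, day });
}

console.log(dates[0]); // { year: '2019', month: '10', day: '30' }
console.log(dates[1]); // { year: '2020', month: '01', day: '01' }
```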
|
||||
|
||||
We can also use them in replacements, as `pattern:$<name>` (like `$1..9`, but with a name instead of a digit).
|
||||
|
||||
For instance, let's rearrange the date into `day.month.year`:
|
||||
|
||||
```js run
|
||||
let dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;
|
||||
|
||||
let str = "2019-04-30";
|
||||
|
||||
let rearranged = str.replace(dateRegexp, '$<day>.$<month>.$<year>');
|
||||
|
||||
alert(rearranged); // 30.04.2019
|
||||
```
|
||||
|
||||
If we use a function, then the named `groups` object is always the last argument:
|
||||
|
||||
```js run
|
||||
let dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;
|
||||
|
||||
let str = "2019-04-30";
|
||||
|
||||
let rearranged = str.replace(dateRegexp,
|
||||
(str, year, month, day, offset, input, groups) =>
|
||||
`${groups.day}.${groups.month}.${groups.year}`
|
||||
);
|
||||
|
||||
alert(rearranged); // 30.04.2019
|
||||
```
|
||||
|
||||
Usually, when we intend to use named groups, we don't need positional arguments of the function. For the majority of real-life cases we only need `str` and `groups`.
|
||||
|
||||
So we can write it a little bit shorter:
|
||||
|
||||
```js
|
||||
let rearranged = str.replace(dateRegexp, (str, ...args) => {
|
||||
let {year, month, day} = args.pop();
|
||||
alert(str); // 2019-04-30
|
||||
alert(year); // 2019
|
||||
alert(month); // 04
|
||||
alert(day); // 30
|
||||
});
|
||||
```
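To actually perform the replacement, the function has to return the new string. A minimal sketch of the shorter form that rearranges the date (`console.log` is used instead of `alert` so that it also runs outside the browser):

```js
const dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;

const str = "2019-04-30";

const rearranged = str.replace(dateRegexp, (...args) => {
  // the groups object is the last argument
  const { year, month, day } = args.pop();
  // the returned string replaces the match
  return `${day}.${month}.${year}`;
});

console.log(rearranged); // 30.04.2019
```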
|
||||
|
||||
|
||||
## Non-capturing groups with ?:
|
||||
|
||||
Sometimes we need parentheses to correctly apply a quantifier, but we don't want the contents in results.
|
||||
|
||||
A group may be excluded by adding `pattern:?:` in the beginning.
|
||||
|
||||
For instance, if we want to find `pattern:(go)+`, but don't want to remember the contents (`go`) in a separate array item, we can write: `pattern:(?:go)+`.
|
||||
|
||||
In the example below we only get the name "John" as a separate member of the `result` array:
|
||||
|
||||
```js run
|
||||
let str = "Gogo John!";
|
||||
*!*
|
||||
// exclude Gogo from capturing
|
||||
let reg = /(?:go)+ (\w+)/i;
|
||||
*/!*
|
||||
|
||||
let result = str.match(reg);
|
||||
|
||||
alert( result.length ); // 2
|
||||
alert( result[1] ); // John
|
||||
```
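For comparison, here is the same search with and without `pattern:?:`. A minimal sketch; the names `capturing` and `nonCapturing` are only illustrative, and `console.log` is used so that it also runs outside the browser:

```js
const str = "Gogo John!";

const capturing = str.match(/(go)+ (\w+)/i);
const nonCapturing = str.match(/(?:go)+ (\w+)/i);

console.log( capturing.length ); // 3
console.log( capturing[1] );     // go (the last repetition of the group)
console.log( capturing[2] );     // John

console.log( nonCapturing.length ); // 2
console.log( nonCapturing[1] );     // John
```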
|
||||
|
||||
## Summary
|
||||
|
||||
- Parentheses can be:
|
||||
- capturing `(...)`, ordered left-to-right, accessible by number.
|
||||
- named capturing `(?<name>...)`, accessible by name.
|
||||
- non-capturing `(?:...)`, used only to apply a quantifier to the whole group.
|
BIN
9-regular-expressions/09-regexp-groups/regexp-nested-groups.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 12 KiB |
Binary file not shown.
After Width: | Height: | Size: 25 KiB |