up
This commit is contained in:
parent
91e4c89773
commit
1e2b09b6fb
12 changed files with 156 additions and 100 deletions
|
@ -0,0 +1,23 @@
|
|||
|
||||
The opening tag is `pattern:\[(b|url|quote)\]`.
|
||||
|
||||
Then, to find everything up to the closing tag, let's use the pattern `pattern:[\s\S]*?` to match any character including the newline, followed by a backreference to the tag name in the closing tag.
|
||||
|
||||
The full pattern: `pattern:\[(b|url|quote)\][\s\S]*?\[/\1\]`.
|
||||
|
||||
In action:
|
||||
|
||||
```js run
let reg = /\[(b|url|quote)\][\s\S]*?\[\/\1\]/g;

let str = `
[b]hello![/b]
[quote]
[url]http://google.com[/url]
[/quote]
`;

alert( str.match(reg) ); // [b]hello![/b],[quote][url]http://google.com[/url][/quote]
```
|
||||
|
||||
Please note that we had to escape the slash in the closing tag `pattern:\[\/\1\]`, because inside a `/.../` regexp literal an unescaped slash ends the pattern.
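The escaping is only needed in the literal form. As a minimal sketch, assuming we build the same pattern from a string with `new RegExp` (the names `literal` and `built` are only for illustration), the slash needs no escape there, but every backslash has to be doubled inside the string:

```js
// Literal form: the slash must be escaped as \/
let literal = /\[(b|url|quote)\][\s\S]*?\[\/\1\]/g;

// Constructor form: the slash is an ordinary character, backslashes are doubled
let built = new RegExp("\\[(b|url|quote)\\][\\s\\S]*?\\[/\\1\\]", "g");

let str = "..[url][b]http://google.com[/b][/url]..";

console.log( str.match(literal) ); // [url][b]http://google.com[/b][/url]
console.log( str.match(built) );   // [url][b]http://google.com[/b][/url]
```

Both forms describe the same regexp, so which one to use is mostly a matter of whether the pattern is known in advance or assembled at runtime.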
|
|
@ -0,0 +1,48 @@
|
|||
# Find bbtag pairs
|
||||
|
||||
A "bb-tag" looks like `[tag]...[/tag]`, where `tag` is one of: `b`, `url` or `quote`.
|
||||
|
||||
For instance:
|
||||
```
[b]text[/b]
[url]http://google.com[/url]
```
|
||||
|
||||
BB-tags can be nested, but a tag can't be nested inside itself. For instance:
|
||||
|
||||
```
Normal:
[url] [b]http://google.com[/b] [/url]
[quote] [b]text[/b] [/quote]

Impossible:
[b][b]text[/b][/b]
```
|
||||
|
||||
Tags can contain line breaks; that's normal:
|
||||
|
||||
```
[quote]
[b]text[/b]
[/quote]
```
|
||||
|
||||
Create a regexp to find all BB-tags with their contents.
|
||||
|
||||
For instance:
|
||||
|
||||
```js
let reg = /your regexp/g;

let str = "..[url]http://google.com[/url]..";
alert( str.match(reg) ); // [url]http://google.com[/url]
```
|
||||
|
||||
If tags are nested, we need the outer tag (if we want, we can then continue the search in its content):
|
||||
|
||||
```js
let reg = /your regexp/g;

let str = "..[url][b]http://google.com[/b][/url]..";
alert( str.match(reg) ); // [url][b]http://google.com[/b][/url]
```
|
|
@ -1,25 +1,72 @@
|
|||
# Alternation (OR)
|
||||
# Alternation (OR)
|
||||
|
||||
Alternation is a regular expression term that corresponds to the word "OR". It is denoted by the vertical bar character `pattern:|` and allows choosing between alternatives.
|
||||
Alternation is the regular expression term for a simple "OR".
|
||||
|
||||
In a regular expression it is denoted with a vertical line character `pattern:|`.
|
||||
|
||||
[cut]
|
||||
|
||||
For example, we need to find programming languages: HTML, PHP, Java and JavaScript.
|
||||
For instance, we need to find programming languages: HTML, PHP, Java or JavaScript.
|
||||
|
||||
The corresponding regular expression: `pattern:html|php|java(script)?`.
|
||||
The corresponding regexp: `pattern:html|php|java(script)?`.
|
||||
|
||||
A usage example:
|
||||
A usage example:
|
||||
|
||||
```js run
|
||||
var reg = /html|php|css|java(script)?/gi
|
||||
let reg = /html|php|css|java(script)?/gi;
|
||||
|
||||
var str = "First came HTML, then CSS, then JavaScript"
|
||||
let str = "First HTML appeared, then CSS, then JavaScript";
|
||||
|
||||
alert( str.match(reg) ) // 'HTML', 'CSS', 'JavaScript'
|
||||
alert( str.match(reg) ); // 'HTML', 'CSS', 'JavaScript'
|
||||
```
|
||||
|
||||
We already know a similar thing -- square brackets. They allow choosing between characters, for example `pattern:gr[ae]y` finds `match:gray` or `match:grey`.
|
||||
We already know a similar thing -- square brackets. They allow choosing between multiple characters, for instance `pattern:gr[ae]y` matches `match:gray` or `match:grey`.
|
||||
|
||||
Alternation works not character by character, but at the level of phrases and subexpressions. The regexp `pattern:A|B|C` means a search for one of the expressions `A`, `B` or `C`, and these expressions can themselves be arbitrarily complex regexps.
|
||||
Alternation works not at the character level, but at the expression level. A regexp `pattern:A|B|C` means one of the expressions `A`, `B` or `C`.
|
||||
|
||||
Parentheses `(...)` are used to mark the boundaries of an alternation, for example: `pattern:before(XXX|YYY)after` searches for `match:beforeXXXafter` or `match:beforeYYYafter`.
|
||||
For instance:
|
||||
|
||||
- `pattern:gr(a|e)y` matches exactly the same strings as `pattern:gr[ae]y`.
|
||||
- `pattern:gra|ey` means "gra" or "ey".
|
||||
|
||||
To separate a part of the pattern for alternation we usually enclose it in parentheses, like this: `pattern:before(XXX|YYY)after`.
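The difference between these patterns is easy to check. A small sketch (the sample string is made up for illustration):

```js
let str = "gray grey";

// Square brackets and a parenthesized alternation find the same words
console.log( str.match(/gr[ae]y/g) );  // gray, grey
console.log( str.match(/gr(a|e)y/g) ); // gray, grey

// Without parentheses the alternation splits the whole pattern in two
console.log( str.match(/gra|ey/g) );   // gra, ey
```

In the last call `match:gra` comes from "gray" and `match:ey` from "grey", because the regexp looks for either `pattern:gra` or `pattern:ey`.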
|
||||
|
||||
## Regexp for time
|
||||
|
||||
In previous chapters there was a task to build a regexp for finding a time in the form `hh:mm`, for instance `12:00`. But a simple `pattern:\d\d:\d\d` is too vague. It accepts `25:99` as a time.
|
||||
|
||||
How can we make a better one?
|
||||
|
||||
We can apply more careful matching:
|
||||
|
||||
- The first digit must be `0` or `1` followed by any digit.
|
||||
- Or `2` followed by `pattern:[0-3]`.
|
||||
|
||||
As a regexp: `pattern:[01]\d|2[0-3]`.
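One way to sanity-check the hours pattern is to run it over every two-digit string from `00` to `99`. A minimal sketch (the variable names are illustrative); exactly 24 of them should pass:

```js
let hours = /[01]\d|2[0-3]/;

let valid = [];
for (let i = 0; i < 100; i++) {
  let s = String(i).padStart(2, "0"); // "00", "01", ..., "99"
  if (hours.test(s)) valid.push(s);
}

console.log( valid.length );                      // 24
console.log( valid[0], valid[valid.length - 1] ); // 00 23
```

The pattern is not anchored here. That works only because every tested string is exactly two characters long, so any match has to cover the whole string.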
|
||||
|
||||
Then we can add a colon and the minutes part.
|
||||
|
||||
The minutes must be from `00` to `59`. In regexp terms, that means the first digit is `pattern:[0-5]`, followed by any digit `pattern:\d`.
|
||||
|
||||
Let's glue them together into the pattern: `pattern:[01]\d|2[0-3]:[0-5]\d`.
|
||||
|
||||
We're almost done, but there's a problem. The alternation `pattern:|` now splits the whole pattern into `pattern:[01]\d` and `pattern:2[0-3]:[0-5]\d`. That's wrong, because the regexp will match either the left part or the right part:
|
||||
|
||||
|
||||
```js run
let reg = /[01]\d|2[0-3]:[0-5]\d/g;

alert("12".match(reg)); // 12 (matched [01]\d)
```
|
||||
|
||||
That's rather obvious, but still a frequent mistake when starting to work with regular expressions.
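The same effect shows up in a longer string. A short sketch with a made-up input: for `12:30` only the hours are returned, because the left alternative `pattern:[01]\d` matches on its own, while `23:45` is found as a whole by the right alternative:

```js
let reg = /[01]\d|2[0-3]:[0-5]\d/g;

console.log( "12:30 23:45".match(reg) ); // 12, 23:45
```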
|
||||
|
||||
We need to add parentheses to apply alternation exactly to hours: `[01]\d` OR `2[0-3]`.
|
||||
|
||||
The correct variant:
|
||||
|
||||
```js run
let reg = /([01]\d|2[0-3]):[0-5]\d/g;

alert("00:00 10:10 23:59 25:99 1:2".match(reg)); // 00:00,10:10,23:59
```
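The parentheses that group the alternation also capture it. As a sketch of that side effect, assuming we search without the `g` flag so that `match` returns the groups, the hours end up in the result at index 1. If that isn't wanted, a non-capturing group `pattern:(?:...)` groups the alternation without remembering it (the sample string and variable names are illustrative):

```js
let capturing = /([01]\d|2[0-3]):[0-5]\d/;
let nonCapturing = /(?:[01]\d|2[0-3]):[0-5]\d/;

let m1 = "Lunch at 12:30".match(capturing);
console.log( m1[0] ); // 12:30
console.log( m1[1] ); // 12 (the hours group)

let m2 = "Lunch at 12:30".match(nonCapturing);
console.log( m2[0] );     // 12:30
console.log( m2.length ); // 1 (only the full match, no groups)
```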
|
||||
|
|