Replace unicode with Unicode all over the book

Vse Mozhe Buty 2020-12-07 18:05:47 +02:00
parent e87f130fc1
commit 7c73f64a13
6 changed files with 23 additions and 23 deletions

@ -57,16 +57,16 @@ For instance:
- **\d** -- is the same as `pattern:[0-9]`,
- **\w** -- is the same as `pattern:[a-zA-Z0-9_]`,
- **\s** -- is the same as `pattern:[\t\n\v\f\r ]`, plus a few other rare unicode space characters.
- **\s** -- is the same as `pattern:[\t\n\v\f\r ]`, plus a few other rare Unicode space characters.
```
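As a quick check of the equivalences above, here is a minimal sketch that compares each shorthand with its bracket form. The sample string and the helper name `sameMatches` are made up for illustration, and the non-breaking space `\u00A0` stands in for one of the rare Unicode space characters.

```js
// Hypothetical helper: do two global regexps find the same matches in a string?
function sameMatches(str, a, b) {
  return JSON.stringify(str.match(a)) === JSON.stringify(str.match(b));
}

// Made-up sample containing letters, digits, an underscore, a tab and spaces
const sample = "Tab\there, id_42 and 7 more";

const digitsSame = sameMatches(sample, /\d/g, /[0-9]/g);
const wordSame = sameMatches(sample, /\w/g, /[a-zA-Z0-9_]/g);

// \s also matches a non-breaking space; the explicit set does not
const nbsp = "\u00A0";
const shorthandFindsNbsp = /\s/.test(nbsp);
const setFindsNbsp = /[\t\n\v\f\r ]/.test(nbsp);

console.log(digitsSame, wordSame);             // true true
console.log(shorthandFindsNbsp, setFindsNbsp); // true false
```

The last line shows why the explicit set is only an approximation of `pattern:\s`.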
### Example: multi-language \w
As the character class `pattern:\w` is a shorthand for `pattern:[a-zA-Z0-9_]`, it can't find Chinese hieroglyphs, Cyrillic letters, etc.
We can write a more universal pattern that looks for word characters in any language. That's easy with unicode properties: `pattern:[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_C}]`.
We can write a more universal pattern that looks for word characters in any language. That's easy with Unicode properties: `pattern:[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_C}]`.
Let's decipher it. Similar to `pattern:\w`, we're making a set of our own that includes characters with the following unicode properties:
Let's decipher it. Similar to `pattern:\w`, we're making a set of our own that includes characters with the following Unicode properties:
- `Alphabetic` (`Alpha`) - for letters,
- `Mark` (`M`) - for accents,
@ -85,7 +85,7 @@ let str = `Hi 你好 12`;
alert( str.match(regexp) ); // H,i,你,好,1,2
```
Of course, we can edit this pattern: add unicode properties or remove them. Unicode properties are covered in more detail in the article <info:regexp-unicode>.
Of course, we can edit this pattern: add Unicode properties or remove them. Unicode properties are covered in more detail in the article <info:regexp-unicode>.
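To see the difference from plain `pattern:\w`, here is a minimal sketch that runs both patterns on the sample string from the example. The function name `findWordChars` is hypothetical. Note the `u` flag: without it `pattern:\p{…}` is not treated as a Unicode property.

```js
// The multi-language set from the text; the `u` flag enables \p{…}
const multiLangWord = /[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_C}]/gu;

// Hypothetical helper: return all matches as an array (empty if none)
function findWordChars(str, regexp) {
  return str.match(regexp) || [];
}

const text = `Hi 你好 12`;

const universal = findWordChars(text, multiLangWord);
const asciiOnly = findWordChars(text, /\w/g);

console.log(universal.join(",")); // H,i,你,好,1,2
console.log(asciiOnly.join(",")); // H,i,1,2
```

The plain `pattern:\w` skips the Chinese characters, while the property-based set finds them.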
```warn header="Unicode properties aren't supported in Edge and Firefox"
Unicode properties `pattern:\p{…}` are not yet implemented in Edge and Firefox. If we really need them, we can use the library [XRegExp](http://xregexp.com/).