init
This commit is contained in:
parent
06f61d8ce8
commit
f301cb744d
2271 changed files with 103162 additions and 0 deletions
|
@ -0,0 +1,69 @@
|
|||
# Word boundary
|
||||
|
||||
Another position check is a *word boundary* <code class="pattern">\b</code>. It doesn't match a character, but matches in situations when a wordly character follows a non-wordly or vice versa. A "non-wordly" may also be text start or end.
|
||||
[cut]
|
||||
For example, <code class=pattern">\bdog\b</code> matches a standalone <code class="subject">dog</code>, not <code class="subject">doggy</code> or <code class="subject">catdog</code>:
|
||||
|
||||
```js
|
||||
//+ run
|
||||
showMatch( "doggy catdog dog", /\bdog\b/ ) // "dog"
|
||||
```
|
||||
|
||||
Here, <code class="match">dog</code> matches, because the previous char is a space (non-wordly), and the next position is text end.
|
||||
|
||||
Normally, <code class="pattern">\w{4}</code> matches 4 consequent word characters.
|
||||
If the word is long enough, it may match multiple times:
|
||||
|
||||
```js
|
||||
//+ run
|
||||
showMatch( "Boombaroom", /\w{4}/g) // 'Boom', 'baro'
|
||||
```
|
||||
|
||||
Appending <code class="pattern">\b</code> causes <code class="pattern">\w{4}\b</code> to match only at word end:
|
||||
|
||||
```js
|
||||
//+ run
|
||||
showMatch( "Because life is awesome", /\w{4}\b/g) // 'ause', 'life', 'some'
|
||||
```
|
||||
|
||||
**The word boundary <code class="pattern">\b</code> like <code class="pattern">^</code> and <code class="pattern">$</code> doesn't match a char. It only performs the check.**
|
||||
|
||||
Let's add the check from another side, <code class="pattern">\b\w{4}\b</code>:
|
||||
|
||||
```js
|
||||
//+ run
|
||||
showMatch( "Because life is awesome", /\b\w{4}\b/g) // 'life'
|
||||
```
|
||||
|
||||
Now there is only one result <code class="match">life</code>.
|
||||
|
||||
<ol>
|
||||
<li>The regexp engine matches first word boundary <code class="pattern">\b</code> at zero position:
|
||||
<img src="boundary1.png">
|
||||
</li>
|
||||
<li>Then it successfully matches <code class="pattern">\w{4}</code>, but fails to match finishing <code class="pattern">\b</code>.
|
||||
|
||||
<img src="boundary2.png">
|
||||
|
||||
So, the match at position zero fails.
|
||||
</li>
|
||||
<li>The search continues from position 1, and the closest <code class="pattern">\b</code> is right after <code class="subject">Because</code> (position 9):
|
||||
|
||||
<img src="boundary3.png">
|
||||
|
||||
Now <code class="pattern">\w{4}</code> doesn't match, because the next character is a space.
|
||||
</li>
|
||||
<li>The search continues, and the closest <code class="pattern">\b</code> is right before <code class="subject">life</code> at position 11.
|
||||
|
||||
<img src="boundary4.png">
|
||||
|
||||
Finally, <code class="pattern">\w{4}</code> matches and the position check <code class="pattern">\b</code> after it is positive. We've got the result.
|
||||
</li>
|
||||
<li>The search continues after the match, but doesn't yield new results.</li>
|
||||
</ol>
|
||||
|
||||
**The word boundary check <code class="pattern">/\b/</code> works only for words in latin alphabet,** because it is based on <code class="pattern">\w</code> as "wordly" chars. Sometimes that's acceptable, but limits the application range of the feature.
|
||||
|
||||
And, for completeness..
|
||||
**There is also an inverse check <code class="pattern">\B</code>, meaning a position other than <code class="pattern">\b</code>.** It is extremely rarely used.
|
||||
|
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary1.png
Executable file
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary1.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 4.6 KiB |
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary2.png
Executable file
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary2.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 6.3 KiB |
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary3.png
Executable file
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary3.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 6.2 KiB |
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary4.png
Executable file
BIN
03-more/08-regular-expressions-javascript/15-regexp-word-boundary/boundary4.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 6.3 KiB |
Loading…
Add table
Add a link
Reference in a new issue