From c1f5d2699e3319514228bfeb4fdc25b38ff5b341 Mon Sep 17 00:00:00 2001
From: Daniel
Date: Fri, 28 Jun 2019 14:41:55 -0400
Subject: [PATCH 1/3] Correct spelling

---
 .../14-regexp-lookahead-lookbehind/article.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/9-regular-expressions/14-regexp-lookahead-lookbehind/article.md b/9-regular-expressions/14-regexp-lookahead-lookbehind/article.md
index ec13cedc..e877cae4 100644
--- a/9-regular-expressions/14-regexp-lookahead-lookbehind/article.md
+++ b/9-regular-expressions/14-regexp-lookahead-lookbehind/article.md
@@ -40,7 +40,7 @@ The syntax is:

 - Positive lookbehind: `pattern:(?<=y)x`, matches `pattern:x`, but only if it follows after `pattern:y`.
 - Negative lookbehind: `pattern:(?
Date: Fri, 28 Jun 2019 15:19:36 -0400
Subject: [PATCH 2/3] Fix spelling and grammar

---
 .../15-regexp-infinite-backtracking-problem/article.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/9-regular-expressions/15-regexp-infinite-backtracking-problem/article.md b/9-regular-expressions/15-regexp-infinite-backtracking-problem/article.md
index 9eb1824d..9da262b9 100644
--- a/9-regular-expressions/15-regexp-infinite-backtracking-problem/article.md
+++ b/9-regular-expressions/15-regexp-infinite-backtracking-problem/article.md
@@ -112,7 +112,7 @@ First, one may notice that the regexp is a little bit strange. The quantifier `p

 Indeed, the regexp is artificial. But the reason why it is slow is the same as those we saw above. So let's understand it, and then the previous example will become obvious.

-What happen during the search of `pattern:(\d+)*$` in the line `subject:123456789z`?
+What happened during the search of `pattern:(\d+)*$` in the line `subject:123456789z`?

 1. First, the regexp engine tries to find a number `pattern:\d+`. The plus `pattern:+` is greedy by default, so it consumes all digits:

@@ -264,7 +264,9 @@ In other words:
 - The lookahead `pattern:?=` looks for the maximal count `pattern:a+` from the current position.
 - And then they are "consumed into the result" by the backreference `pattern:\1` (`pattern:\1` corresponds to the content of the second parentheses, that is `pattern:a+`).

-There will be no backtracking, because lookahead does not backtrack. If it found like 5 times of `pattern:a+` and the further match failed, then it doesn't go back to 4.
+There will be no backtracking, because lookahead does not backtrack. If, for
+example, it found 5 instances of `pattern:a+` and the further match failed,
+it won't go back to the 4th instance.

 ```smart
 There's more about the relation between possessive quantifiers and lookahead in articles [Regex: Emulate Atomic Grouping (and Possessive Quantifiers) with LookAhead](http://instanceof.me/post/52245507631/regex-emulate-atomic-grouping-with-lookahead) and [Mimicking Atomic Groups](http://blog.stevenlevithan.com/archives/mimic-atomic-groups).

From 759ffd7e4e3c7e3d6eac5c583de2b13f9144a045 Mon Sep 17 00:00:00 2001
From: Daniel
Date: Fri, 28 Jun 2019 15:37:47 -0400
Subject: [PATCH 3/3] Fix grammar

---
 9-regular-expressions/20-regexp-unicode/article.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/9-regular-expressions/20-regexp-unicode/article.md b/9-regular-expressions/20-regexp-unicode/article.md
index 95fac0fb..7eb3a1e0 100644
--- a/9-regular-expressions/20-regexp-unicode/article.md
+++ b/9-regular-expressions/20-regexp-unicode/article.md
@@ -5,7 +5,7 @@ The unicode flag `/.../u` enables the correct support of surrogate pairs.

 Surrogate pairs are explained in the chapter .

-Let's briefly remind them here. In short, normally characters are encoded with 2 bytes. That gives us 65536 characters maximum. But there are more characters in the world.
+Let's briefly review them here. In short, normally characters are encoded with 2 bytes. That gives us 65536 characters maximum. But there are more characters in the world.

 So certain rare characters are encoded with 4 bytes, like `𝒳` (mathematical X) or `😄` (a smile).
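
For anyone checking the wording of the last hunk, the surrogate-pair behaviour it describes can be seen in a minimal JavaScript sketch. This is an illustration only and not part of the patch; the sample string and the variable names are assumptions chosen for the example.

```js
// Illustrative sketch: the sample character and variable names are assumptions,
// not taken from the patched article.
const smile = '😄';

// A character outside the 2-byte range is stored as two 16-bit code units
// (a surrogate pair), so its length is 2 rather than 1.
const codeUnits = smile.length;

// Without the u flag, "." matches a single code unit, so ^.$ cannot cover
// the whole character.
const matchesWithoutU = /^.$/.test(smile);

// With the u flag, "." matches a whole code point, surrogate pair included.
const matchesWithU = /^.$/u.test(smile);

console.log(codeUnits, matchesWithoutU, matchesWithU); // 2 false true
```

The only difference between the two tests is the `u` flag, which is what the article text in this hunk is about.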