diff --git a/1-js/05-data-types/03-string/article.md b/1-js/05-data-types/03-string/article.md index 2b45f0b7..c331ccb6 100644 --- a/1-js/05-data-types/03-string/article.md +++ b/1-js/05-data-types/03-string/article.md @@ -558,7 +558,7 @@ You can skip the section if you don't plan to support them. ### Surrogate pairs -Most symbols have a 2-byte code. Letters in most european languages, numbers, and even most hieroglyphs, have a 2-byte representation. +All frequently used characters have 2-byte codes. Letters in most european languages, numbers, and even most hieroglyphs, have a 2-byte representation. But 2 bytes only allow 65536 combinations and that's not enough for every possible symbol. So rare symbols are encoded with a pair of 2-byte characters called "a surrogate pair". @@ -628,7 +628,7 @@ For instance: ```js run alert( 'S\u0307\u0323' ); // Ṩ, S + dot above + dot below -alert( 'S\u0323\u0307' ); // Ṩ, S + dot below + dot above +alert( 'S\u0323\u0307' ); // Ṩ, S + dot below + dot above alert( 'S\u0307\u0323' == 'S\u0323\u0307' ); // false ``` @@ -649,7 +649,7 @@ alert( "S\u0307\u0323".normalize().length ); // 1 alert( "S\u0307\u0323".normalize() == "\u1e68" ); // true ``` -In reality, this is not always the case. The reason being that the symbol `Ṩ` is "common enough", so UTF-16 creators included it in the main table and gave it the code. +In reality, this is not always the case. The reason being that the symbol `Ṩ` is "common enough", so UTF-16 creators included it in the main table and gave it the code. If you want to learn more about normalization rules and variants -- they are described in the appendix of the Unicode standard: [Unicode Normalization Forms](http://www.unicode.org/reports/tr15/), but for most practical purposes the information from this section is enough.