From 0a7ded59b870fe5ebdcefc93e177547476f22294 Mon Sep 17 00:00:00 2001
From: aruseni
Date: Tue, 16 Jul 2019 11:35:34 +0300
Subject: [PATCH] [strings] Surrogate pairs S example

---
 1-js/05-data-types/03-string/article.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/1-js/05-data-types/03-string/article.md b/1-js/05-data-types/03-string/article.md
index b1f7d5b5..35f44a41 100644
--- a/1-js/05-data-types/03-string/article.md
+++ b/1-js/05-data-types/03-string/article.md
@@ -631,10 +631,12 @@ This provides great flexibility, but also an interesting problem: two characters
 For instance:
 
 ```js run
-alert( 'S\u0307\u0323' ); // Ṩ, S + dot above + dot below
-alert( 'S\u0323\u0307' ); // Ṩ, S + dot below + dot above
+let s1 = 'S\u0307\u0323'; // Ṩ, S + dot above + dot below
+let s2 = 'S\u0323\u0307'; // Ṩ, S + dot below + dot above
 
-alert( 'S\u0307\u0323' == 'S\u0323\u0307' ); // false, different characters (?!)
+alert( `s1: ${s1}, s2: ${s2}` );
+
+alert( s1 == s2 ); // false though the characters look identical (?!)
 ```
 
 To solve this, there exists a "unicode normalization" algorithm that brings each string to the single "normal" form.
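The last context line of the hunk mentions Unicode normalization without showing it. As a rough sketch of how the two strings from the patch compare once normalized, assuming the standard `String.prototype.normalize()` method with its default NFC form (the variable names `n1` and `n2` are illustrative and not part of the patch, and `console.log` replaces the article's `alert` so the snippet also runs outside a browser):

```js
// Same two strings as in the patch: identical glyph, different code unit order
const s1 = 'S\u0307\u0323'; // S + dot above + dot below
const s2 = 'S\u0323\u0307'; // S + dot below + dot above

// Raw comparison looks at code units, so the strings differ
const rawEqual = s1 === s2;

// normalize() with no argument applies NFC: combining marks are put into
// canonical order and then composed where a precomposed character exists
const n1 = s1.normalize();
const n2 = s2.normalize();
const normalizedEqual = n1 === n2;

console.log(rawEqual);        // false
console.log(normalizedEqual); // true
console.log(n1.length);       // 1, both sequences compose into U+1E68
```

Comparing normalized copies rather than the raw strings is one way to make visually identical input compare equal; whether NFC is the right form depends on the application.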