diff --git a/2-ui/5-data-storage/01-cookie/article.md b/2-ui/5-data-storage/01-cookie/article.md
new file mode 100644
index 00000000..56d80eae
--- /dev/null
+++ b/2-ui/5-data-storage/01-cookie/article.md
@@ -0,0 +1,399 @@
+# Cookies, document.cookie
+
+Cookies let us store small pieces of data directly in the browser. They are not part of JavaScript, but rather part of HTTP, defined by the [RFC 6265](https://tools.ietf.org/html/rfc6265) specification.
+
+Most of the time, cookies are set by a web server, but JavaScript can access them too.
+
+One of the most widespread uses of cookies is authentication:
+
+1. Upon sign-in, the server uses the `Set-Cookie` HTTP header to set a cookie with a "session id".
+2. The browser stores it.
+3. The browser sends it over the net in the `Cookie` HTTP header with every request to the domain that set it. So the server knows who made the request.
+
+The browser provides a special accessor `document.cookie` for cookies.
+
+There are many tricky things about cookies and their options, and about how to set them correctly. In this chapter we'll cover them in detail.
+
+## Reading from document.cookie
+
+```online
+Do you have any cookies on this site? Let's see:
+```
+
+```offline
+Assuming you're on a website, it's possible to see the cookies, like this:
+```
+
+```js run
+// At javascript.info, we use Google Analytics for statistics,
+// so there should be some cookies from there.
+alert( document.cookie ); // cookie1=value1; cookie2=value2;...
+```
+
+
+The string consists of `name=value` pairs, delimited by `; `. So, to find a particular cookie, we can split `document.cookie` by `; `, and then find the right name. We can use either a regular expression or array functions to do that. At the end of the chapter you'll find a few functions to manipulate cookies.
+
+## Writing to document.cookie
+
+The `document.cookie` property is writable. However, it's not a data property, but rather an accessor.
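To make the accessor idea concrete, here is a minimal sketch of a getter/setter pair that behaves roughly like `document.cookie`: assigning a string updates only the cookie named in it, and reading returns all pairs joined by `; `. The names `fakeDocument` and `jar` are hypothetical and exist only for this illustration. Real browsers implement the accessor natively and also process options such as `path` and `expires`, which this sketch simply ignores.

```js
// Hypothetical stand-in for `document`, so the sketch runs outside a browser.
const fakeDocument = {};
const jar = new Map(); // cookie name -> value (assumed storage for the sketch)

Object.defineProperty(fakeDocument, "cookie", {
  // Reading joins all stored pairs with "; "
  get() {
    return [...jar].map(([name, value]) => `${name}=${value}`).join("; ");
  },
  // Writing updates only the cookie named in the assigned string
  set(str) {
    const pair = str.split(";")[0]; // drop options such as path or expires
    const eq = pair.indexOf("=");
    if (eq === -1) return; // this sketch skips strings without "="
    jar.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
  }
});

fakeDocument.cookie = "user=John";
fakeDocument.cookie = "theme=dark";
fakeDocument.cookie = "user=Pete"; // replaces only `user`, keeps `theme`

console.log(fakeDocument.cookie); // user=Pete; theme=dark
```

An assignment to a plain data property would have replaced the whole string. With an accessor, the setter decides what an assignment means, which is why a write can add or update a single cookie.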
+
+**A write operation to `document.cookie` passes through the browser, which updates only the cookies mentioned in it and doesn't touch other cookies.**
+
+For instance, this call sets a cookie with the name `user` and value `John`:
+
+```js run
+document.cookie = "user=John";
+alert(document.cookie);
+```
+
+If you run it, you'll probably see multiple cookies. Only the cookie named `user` was altered.
+
+Technically, the name and value can contain any characters, but then they should be escaped using the built-in `encodeURIComponent` function:
+
+```js run
+let name = "<>";
+let value = "=";
+// encodes the cookie as %3C%3E=%3D
+document.cookie = encodeURIComponent(name) + '=' + encodeURIComponent(value);
+alert(document.cookie); // ...; %3C%3E=%3D
+```
+
+
+```warn header="Limitations"
+There are a few limitations:
+- The `name=value` pair together (after `encodeURIComponent`) should not exceed 4KB. So we can't store anything huge in a cookie.
+- The total number of cookies per domain is limited to 30-50, depending on the browser.
+```
+
+Cookies have several options. Many of them are important and should be set.
+
+The options are listed after `name=value`, delimited by `;`, for instance:
+
+```js run
+document.cookie = "user=John; path=/; expires=Tue, 19 Jan 2038 03:14:07 GMT";
+```
+
+## path
+
+- **`path=/mypath`**
+
+The URL path prefix where the cookie is accessible. By default, it's the current path.
+
+If a cookie is set with `path=/mypath`, it's visible at `/mypath` and `/mypath/page`, but not at `/page` or `/mypathpage`.
+
+Usually, we set `path=/` to make the cookie accessible from all website pages.
+
+Please note: the path must be absolute (start with `/`).
+
+## domain
+
+- **`domain=site.com`**
+
+The domain where the cookie is accessible.
+
+By default, a cookie is accessible only at the domain that set it. So, if we set a cookie at `site.com`, we won't get it at `other.com`. That's natural, as `other.com` is another site.
+
+What's trickier, we won't get it at a subdomain such as `forum.site.com` either:
+
+```js
+// at site.com
+document.cookie = "user=John";
+
+// at forum.site.com
+alert(document.cookie); // no user
+```
+
+**There's no way to let a cookie be accessible from another 2nd-level domain, so `other.com` will never receive a cookie set at `site.com`.**
+
+It's a safety restriction that allows us to store sensitive data in cookies.
+
+For subdomains like `forum.site.com`, that's possible. If we'd like any subdomain `*.site.com` to access a cookie set at `site.com`, which is the common case, we should explicitly set `domain=site.com`:
+
+```js
+// at site.com, make the cookie accessible on any subdomain:
+document.cookie = "user=John; domain=site.com";
+
+// at forum.site.com
+alert(document.cookie); // with user
+```
+
+For historical reasons, `domain=.site.com` (a dot at the start) also works this way.
+
+## expires, max-age
+
+By default, if a cookie doesn't have one of these options, it disappears when the browser is closed. Such cookies are called "session cookies".
+
+To let cookies survive a browser close, we can set either the `expires` or the `max-age` option.
+
+- **`expires=Tue, 19 Jan 2038 03:14:07 GMT`**
+
+The cookie expiration date, when the browser will delete it automatically.
+
+The date must be in exactly this format, in the GMT timezone. We can use `date.toUTCString` to get it. For instance, we can set the cookie to expire in 1 day:
+
+```js
+// +1 day from now
+let date = new Date(Date.now() + 86400e3);
+date = date.toUTCString();
+document.cookie = "user=John; expires=" + date;
+```
+
+If the date is in the past, the cookie will be deleted from the browser.
+
+- **`max-age=3600`**
+
+An alternative to `expires` that specifies the cookie expiration in seconds.
+
+It can be either a number of seconds from the current moment, or zero/negative for immediate expiration (to remove the cookie):
+
+```js
+// cookie will die +1 hour from now
+document.cookie = "user=John; max-age=3600";
+
+// delete cookie (let it expire right now)
+document.cookie = "user=John; max-age=0";
+```
+
+## secure
+
+- **`secure`**
+
+The cookie should be transferred only over HTTPS.
+
+**By default, if we set a cookie at `http://site.com`, then it also appears at `https://site.com` and vice versa.**
+
+That is, cookies are domain-based; they do not distinguish between the protocols.
+
+With this option, if a cookie is set by `https://site.com`, then it doesn't appear when the same site is accessed over HTTP, as `http://site.com`. So if a cookie has sensitive content that should never be sent over unencrypted HTTP, the `secure` flag prevents that.
+
+```js
+// set the cookie to be secure (only accessible over HTTPS)
+document.cookie = "user=John; secure";
+```
+
+## samesite
+
+That's another security option, designed to protect from so-called XSRF (cross-site request forgery) attacks.
+
+To understand when it's useful, let's look at the following attack scenario.
+
+### XSRF attack
+
+Imagine you are logged into the site `bank.com`. That is, you have an authentication cookie from that site. Your browser sends it to `bank.com` with every request, so that the site recognizes you and performs all sensitive financial operations.
+
+Now, while browsing the web in another window, you accidentally come to another site `evil.com` that has a `<form>` with the hacker's account and JavaScript code that submits it automatically.
+
+The form is submitted to the bank site, and your cookie is also sent, just because it's sent every time you visit `bank.com`. So the bank recognizes you and actually performs the payment.
+
+![](cookie-xsrf.png)
+
+That's called a cross-site request forgery (or XSRF) attack.
+
+Real banks are protected from it, of course. All forms generated by `bank.com` have a special field, a so-called "XSRF protection token", that the evil page can't generate.
+
+### Enter cookie samesite option
+
+Now, the cookie `samesite` option provides another way to protect from such attacks, one that (in theory) should not require "XSRF protection tokens".
+
+It has two possible values:
+
+- **`samesite=strict`, same as `samesite` without value**
+
+A cookie with `samesite=strict` is never sent if the user comes from outside the site.
+
+In other words, whether a user follows a link from an email or submits a form from `evil.com`, for any operation that comes from another domain, the cookie is not sent. Then the XSRF attack will fail, as `bank.com` will not recognize the user without the cookie and will not proceed with the payment.
+
+The protection is quite reliable. Only operations originating from `bank.com` will send cookies.
+
+There's a small inconvenience, though.
+
+When a user follows a legitimate link to `bank.com`, like from their own notes, they'll be surprised that `bank.com` does not recognize them. Indeed, `samesite=strict` cookies are not sent in that case.
+
+We could work around that by using two cookies: one for "general recognition", only for the purpose of saying "Hello, John", and another one for data-changing operations with `samesite=strict`.
+
+Then a person coming from outside of the site will see a welcome, but payments must be initiated from the bank website.
+
+- **`samesite=lax`**
+
+Another approach that preserves the user experience is to use `samesite=lax`, a more relaxed value.
+
+Lax mode, just like `strict`, forbids the browser from sending cookies when coming from outside the site, but adds an exception.
+
+A `samesite=lax` cookie is sent if both of these conditions are true:
+1. The HTTP method is "safe" (e.g. GET, but not POST).
+
+    The full list of safe HTTP methods is in the [RFC7231 specification](https://tools.ietf.org/html/rfc7231). Basically, these are the methods that should be used for reading, but not writing the data. They must not perform any data-changing operations. Following a link is always GET, the safe method.
+
+2. The operation performs top-level navigation (changes the URL in the browser address bar).
+
+    That's usually true, but if the navigation is performed in an `