# TextDecoder and TextEncoder

What if the binary data is actually a string? For instance, we received a file with textual data.

The built-in [TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder) object allows us to read the value into an actual JavaScript string, given the buffer and the encoding.

We first need to create it:
```js
let decoder = new TextDecoder([label], [options]);
```

- **`label`** -- the encoding, `utf-8` by default, but `big5`, `windows-1251` and many others are also supported.
- **`options`** -- optional object:
  - **`fatal`** -- boolean, if `true` then throw an exception for invalid (non-decodable) characters, otherwise (default) replace them with the character `\uFFFD`.
  - **`ignoreBOM`** -- boolean, if `true` then the BOM (an optional byte-order Unicode mark) is not stripped and stays in the decoded result, rarely needed.

...And then decode:

```js
let str = decoder.decode([input], [options]);
```

- **`input`** -- `BufferSource` to decode.
- **`options`** -- optional object:
  - **`stream`** -- `true` for decoding streams, when `decoder.decode` is called repeatedly with incoming chunks of data. In that case a multi-byte character may occasionally be split between chunks. This option tells `TextDecoder` to memorize "unfinished" characters and decode them when the next chunk comes.

For instance:

```js run
let uint8Array = new Uint8Array([72, 101, 108, 108, 111]);

alert( new TextDecoder().decode(uint8Array) ); // Hello
```

```js run
let uint8Array = new Uint8Array([228, 189, 160, 229, 165, 189]);

alert( new TextDecoder().decode(uint8Array) ); // 你好
```
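
The `fatal` and `ignoreBOM` constructor options described above are easier to judge on concrete bytes. The following is a minimal sketch rather than part of the original examples: the byte values and variable names are chosen for illustration, and it uses `console.log` instead of `alert` so that it can also run outside the browser (for example in Node.js, assuming a version that provides `TextDecoder` globally).

```js
// Bytes 0xEF 0xBB 0xBF are the UTF-8 BOM, followed by "Hi".
let withBom = new Uint8Array([0xEF, 0xBB, 0xBF, 72, 105]);

// Default: the BOM is stripped from the result.
let stripped = new TextDecoder().decode(withBom);
console.log(stripped.length); // 2

// ignoreBOM: true keeps the BOM in the string as the character \uFEFF.
let kept = new TextDecoder("utf-8", { ignoreBOM: true }).decode(withBom);
console.log(kept.length); // 3

// The byte 0xFF can never appear in valid UTF-8.
let invalid = new Uint8Array([72, 0xFF, 105]);

// Default: the bad byte is replaced with \uFFFD.
let replaced = new TextDecoder().decode(invalid);
console.log(replaced === "H\uFFFDi"); // true

// fatal: true throws a TypeError instead of replacing.
let threw = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalid);
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true
```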

We can decode a part of the buffer by creating a subarray view for it:

```js run
let uint8Array = new Uint8Array([0, 72, 101, 108, 108, 111, 0]);

// the string is in the middle
// create a new view over it, without copying anything
let binaryString = uint8Array.subarray(1, -1);

alert( new TextDecoder().decode(binaryString) ); // Hello
```
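
The `stream` option of `decode` mentioned earlier matters when a multi-byte character is cut between two chunks. Here is a small sketch of that situation, reusing the bytes of `你好` from the example above; the split point and the variable names are assumptions made for illustration, and `console.log` is used so the snippet also runs outside the browser.

```js
// "你好" in UTF-8 is 6 bytes: 228,189,160 (你) and 229,165,189 (好).
// Split the first character between two chunks, as a network stream might.
let chunk1 = new Uint8Array([228, 189]);
let chunk2 = new Uint8Array([160, 229, 165, 189]);

// Without stream: each chunk is decoded on its own, so the split character is lost.
let broken = new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2);
console.log(broken.includes("\uFFFD")); // true

// With stream: true, the same decoder remembers the unfinished bytes.
let streamDecoder = new TextDecoder();
let first = streamDecoder.decode(chunk1, { stream: true });  // "" so far
let second = streamDecoder.decode(chunk2, { stream: true }); // "你好"
let rest = streamDecoder.decode(); // final call without stream flushes the decoder

console.log(first + second + rest); // 你好
```

The last `decode()` call without arguments signals the end of the data. If incomplete bytes were still pending at that point, they would be reported as invalid (replaced with `\uFFFD`, or an exception with `fatal: true`).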

## TextEncoder

[TextEncoder](https://encoding.spec.whatwg.org/#interface-textencoder) does the reverse: it converts a string into bytes.

The syntax is:

```js run
let encoder = new TextEncoder();
```

The only encoding it supports is "utf-8".

It has two methods:
- **`encode(str)`** -- returns a `Uint8Array` with the bytes of the string.
- **`encodeInto(str, destination)`** -- encodes `str` into `destination`, which must be a `Uint8Array`.

```js run
let encoder = new TextEncoder();

let uint8Array = encoder.encode("Hello");
alert(uint8Array); // 72,101,108,108,111
```
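
The example above covers `encode` only. As a sketch of `encodeInto`, the snippet below writes into a preallocated array. It relies on the object that `encodeInto` returns, with `read` (UTF-16 code units consumed from the string) and `written` (bytes stored in the destination), which the text above does not describe; the sample string and buffer sizes are arbitrary, and `console.log` is used so the snippet also runs outside the browser.

```js
let encoder = new TextEncoder();

// "Hi 你" is 4 UTF-16 code units, but 6 bytes in UTF-8 (你 takes 3 bytes).
let destination = new Uint8Array(8);
let result = encoder.encodeInto("Hi 你", destination);

console.log(result.read);    // 4
console.log(result.written); // 6
console.log(destination.join(",")); // 72,105,32,228,189,160,0,0

// Decode back only the bytes that were actually written.
let roundTrip = new TextDecoder().decode(destination.subarray(0, result.written));
console.log(roundTrip); // Hi 你

// If the destination is too small, encoding stops before the character that does not fit.
let small = new Uint8Array(4);
let partial = encoder.encodeInto("Hi 你", small);
console.log(partial.read);    // 3
console.log(partial.written); // 3
```

Unlike `encode`, this does not allocate a new array on every call, which can help when many strings are written into one reusable buffer.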