493 lines
18 KiB
Markdown
493 lines
18 KiB
Markdown
# Node properties: type, tag and contents
|
|
|
|
Let's get a more in-depth look at DOM nodes.
|
|
|
|
In this chapter we'll see more into what they are and their most used properties.
|
|
|
|
[cut]
|
|
|
|
## DOM node classes
|
|
|
|
DOM nodes have different properties depending on their class. For instance, element nodes corresponding to tag `<a>` have link-related properties, and those corresponding to `<input>` have input-related and so on.
|
|
|
|
Text nodes are not the same as element nodes: they have properties of their own.
|
|
|
|
But there are also common properties and methods between all of them, because of the inheritance. Each DOM node belongs to the corresponding built-in class. These classes form a hierarchy.
|
|
|
|
The root is [EventTarget](https://dom.spec.whatwg.org/#eventtarget), that is inherited by [Node](http://dom.spec.whatwg.org/#interface-node), and other DOM nodes inherit from it.
|
|
|
|
Here's the picture, explanations to follow:
|
|
|
|

|
|
|
|
The classes are:
|
|
|
|
- [EventTarget](https://dom.spec.whatwg.org/#eventtarget) is the root "abstract" class. Objects of that class are never created. It serves as a base, so that all DOM nodes support so-called "events", we'll study them later.
|
|
- [Node](http://dom.spec.whatwg.org/#interface-node) is also an "abstract" class, serving as a base for DOM nodes. It provides the core tree functionality: `parentNode`, `nextSibling`, `childNodes` and so on (they are getters). Objects of `Node` class are never created. But there are concrete node classes that inherit from it, namely: `Text` for text nodes, `Element` for element nodes and more exotic ones like `Comment` for comment nodes.
|
|
- [Element](http://dom.spec.whatwg.org/#interface-element) is a base class for DOM elements. It provides element-level navigation like `nextElementSibling`, `children` and searching methods like `getElementsByTagName`, `querySelector`. In the browser there may be not only HTML, but also XML and SVG documents. The `Element` class serves as a base for more specific classes: `SVGElement`, `XMLElement` and `HTMLElement`.
|
|
- [HTMLElement](https://html.spec.whatwg.org/multipage/dom.html#htmlelement) is finally the basic class for all HTML elements. It is inherited by various HTML elements:
|
|
- [HTMLInputElement](https://html.spec.whatwg.org/multipage/forms.html#htmlinputelement) -- the class for `<input>` elements,
|
|
- [HTMLBodyElement](https://html.spec.whatwg.org/multipage/semantics.html#htmlbodyelement) -- the class for `<body>` elements,
|
|
- [HTMLAnchorElement](https://html.spec.whatwg.org/multipage/semantics.html#htmlanchorelement) -- the class for `<a>` elements
|
|
- ...and so on, each tag has its own class that may provide specific properties and methods.
|
|
|
|
So, the full set of properties and methods of a given node comes as the result of the inheritance.
|
|
|
|
For instance, `<input>` element is of the [HTMLInputElement](https://html.spec.whatwg.org/multipage/forms.html#htmlinputelement) class:
|
|
- `HTMLInputElement` itself provides input-specific properties.
|
|
- It inherits the common HTML element methods (and getters/setters) from `HTMLElement` class.
|
|
- Then it supports common element methods, because `HTMLElement` inherits from `Element`.
|
|
- Then it supports common DOM node properties, because it inherits from `Node`.
|
|
- Finally, it supports events (to be covered), because it inherits from `EventTarget`.
|
|
- (and for completeness: `EventTarget` inherits from `Object`)
|
|
|
|
To see the DOM node class name, we can remember that an object usually has the `constructor` property. It references to the class constructor, so the `constructor.name` is what we need:
|
|
|
|
```js run
|
|
alert( document.body.constructor.name ); // HTMLBodyElement
|
|
```
|
|
|
|
...Or we can just `toString` it:
|
|
|
|
```js run
|
|
alert( document.body ); // [object HTMLBodyElement]
|
|
```
|
|
|
|
We also can use `instanceof` to check the inheritance chain:
|
|
|
|
```js run
|
|
alert( document.body instanceof HTMLBodyElement ); // true
|
|
alert( document.body instanceof HTMLElement ); // true
|
|
alert( document.body instanceof Element ); // true
|
|
alert( document.body instanceof Node ); // true
|
|
alert( document.body instanceof EventTarget ); // true
|
|
```
|
|
|
|
As we can see, DOM nodes are regular JavaScript objects. They use prototype-based classes for inheritance. That's easy to see by outputting an element with `console.dir(elem)`. There you can see `HTMLElement.prototype`, `Element.prototype` and so on.
|
|
|
|
```smart header="`console.dir(elem)` versus `console.log(elem)`"
|
|
Most browsers support two commands in their developer tools: `console.log` and `console.dir`. They output their arguments to the console. For JavaScript objects these commands usually do the same.
|
|
|
|
But for DOM elements they are different:
|
|
|
|
- `console.log(elem)` shows the element DOM tree.
|
|
- `console.dir(elem)` shows the element as a DOM object, good to explore its properties.
|
|
|
|
Try it on `document.body`.
|
|
```
|
|
|
|
````smart header="IDL in the spec"
|
|
In the specification classes are described using not JavaScript, but a special [Interface description language](https://en.wikipedia.org/wiki/Interface_description_language) (IDL), that is usually easy to understand.
|
|
|
|
The most important difference is that all properties are given with their types. For instance, `DOMString`, `boolean` and so on.
|
|
|
|
Here's an excerpt from it, with comments:
|
|
|
|
```js
|
|
// Define HTMLInputElement
|
|
*!*
|
|
// The colon means that it inherits from HTMLElement
|
|
*/!*
|
|
interface HTMLInputElement: HTMLElement {
|
|
|
|
*!*
|
|
// "DOMString" means that the property is a string
|
|
*/!*
|
|
attribute DOMString accept;
|
|
attribute DOMString alt;
|
|
attribute DOMString autocomplete;
|
|
attribute DOMString value;
|
|
|
|
*!*
|
|
// boolean property (true/false)
|
|
attribute boolean autofocus;
|
|
*/!*
|
|
...
|
|
*!*
|
|
// now the method: "void" means that that returns no value
|
|
*/!*
|
|
void select();
|
|
...
|
|
}
|
|
```
|
|
|
|
Other classes are somewhat similar.
|
|
````
|
|
|
|
## The "nodeType" property
|
|
|
|
The `nodeType` property provides an old-fashioned way to get the "type" of a DOM node.
|
|
|
|
It has a numeric value:
|
|
- `elem.nodeType == 1` for element nodes,
|
|
- `elem.nodeType == 3` for text nodes,
|
|
- `elem.nodeType == 9` for the document object,
|
|
- there are few other values in [the specification](https://dom.spec.whatwg.org/#node).
|
|
|
|
For instance:
|
|
|
|
```html run
|
|
<body>
|
|
<script>
|
|
let elem = document.body;
|
|
|
|
// let's examine what it is?
|
|
alert(elem.nodeType); // 1 => element
|
|
|
|
// and the first child is...
|
|
alert(elem.firstChild.nodeType); // 3 => text
|
|
|
|
// for the document object, the type is 9
|
|
alert( document.nodeType ); // 9
|
|
</script>
|
|
</body>
|
|
```
|
|
|
|
In modern scripts, we can use `instanceof` and other class-based tests to see the node type, but sometimes `nodeType` may be simpler. We can only read `nodeType`, not change it.
|
|
|
|
## Tag: nodeName and tagName
|
|
|
|
Given a DOM node, we can read its tag name from `nodeName` or `tagName` properties:
|
|
|
|
For instance:
|
|
|
|
```js run
|
|
alert( document.body.nodeName ); // BODY
|
|
alert( document.body.tagName ); // BODY
|
|
```
|
|
|
|
Is there any difference between tagName and nodeName?
|
|
|
|
Actually, yes, the difference is reflected in their names, but is indeed a bit subtle.
|
|
|
|
- The `tagName` property exists only for `Element` nodes.
|
|
- The `nodeName` is defined for any `Node`:
|
|
- for elements it means the same as `tagName`.
|
|
- for other node types (text, comment etc) it has a string with the node type.
|
|
|
|
So `tagName` can only be used for elements, while `nodeName` can say something about other node types.
|
|
|
|
For instance let's compare `tagName` and `nodeName` for the `document` and a comment node:
|
|
|
|
|
|
```html run
|
|
<body><!-- comment -->
|
|
|
|
<script>
|
|
// for comment
|
|
alert( document.body.firstChild.tagName ); // undefined (not element)
|
|
alert( document.body.firstChild.nodeName ); // #comment
|
|
|
|
// for document
|
|
alert( document.tagName ); // undefined (not element)
|
|
alert( document.nodeName ); // #document
|
|
</script>
|
|
</body>
|
|
```
|
|
|
|
If we only deal with elements, then `tagName` is the only thing we should use.
|
|
|
|
|
|
```smart header="The tag name is always uppercase except XHTML"
|
|
The browser has two modes of processing documents: HTML and XML. Usually the HTML-mode is used for webpages. XML-mode is enabled when the browser receives an XML-document with the header: `Content-Type: application/xml+xhtml`.
|
|
|
|
In HTML mode `tagName/nodeName` is always uppercased: it's `BODY` either for `<body>` or `<BoDy>`.
|
|
|
|
In XML mode the case is kept "as is", but it's rarely used.
|
|
```
|
|
|
|
|
|
## innerHTML: the contents
|
|
|
|
The [innerHTML](https://w3c.github.io/DOM-Parsing/#widl-Element-innerHTML) property allows to get the HTML inside the element as a string.
|
|
|
|
We can also modify it. So it's one of most powerful ways to change the page.
|
|
|
|
The example shows the contents of `document.body` and then replaces it completely:
|
|
|
|
```html run
|
|
<body>
|
|
<p>A paragraph</p>
|
|
<div>A div</div>
|
|
|
|
<script>
|
|
alert( document.body.innerHTML ); // read the current contents
|
|
document.body.innerHTML = 'The new BODY!'; // replace it
|
|
</script>
|
|
|
|
</body>
|
|
```
|
|
|
|
We can try to insert an invalid HTML, the browser will fix our errors:
|
|
|
|
```html run
|
|
<body>
|
|
|
|
<script>
|
|
document.body.innerHTML = '<b>test'; // forgot to close the tag
|
|
alert( document.body.innerHTML ); // <b>test</b> (fixed)
|
|
</script>
|
|
|
|
</body>
|
|
```
|
|
|
|
```smart header="Scripts don't execute"
|
|
If `innerHTML` inserts a `<script>` tag into the document -- it doesn't execute.
|
|
|
|
It becomes a part of HTML, just as a script that has already run.
|
|
```
|
|
|
|
### Beware: "innerHTML+=" does a full overwrite
|
|
|
|
We can append "more HTML" by using `elem.innerHTML+="something"`.
|
|
|
|
Like this:
|
|
|
|
```js
|
|
chatDiv.innerHTML += "<div>Hello<img src='smile.gif'/> !</div>";
|
|
chatDiv.innerHTML += "How goes?";
|
|
```
|
|
|
|
But we should be very careful about doing it, because what's going on is *not* an addition, but a full overwrite.
|
|
|
|
Technically, these two lines do the same:
|
|
|
|
```js
|
|
elem.innerHTML += "...";
|
|
// is a shorter way to write:
|
|
*!*
|
|
elem.innerHTML = elem.HTML + "..."
|
|
*/!*
|
|
```
|
|
|
|
In other words, `innerHTML+=` does this:
|
|
|
|
1. The old contents is removed.
|
|
2. The new `innerHTML` is written instead (a concatenation of the old and the new one).
|
|
|
|
**As the content is "zeroed-out" and rewritten from the scratch, all images and other resources will be reloaded**.
|
|
|
|
In the `chatDiv` example above the line `chatDiv.innerHTML+="How goes?"` re-creates the HTML content and reloads `smile.gif` (hope it's cached). If `chatDiv` has a lot of other text and images, then the reload becomes clearly visible.
|
|
|
|
There are other side-effects as well. For instance, if the existing text was selected with the mouse, then most browsers will remove the selection upon rewriting `innerHTML`. And if there was an `<input>` with a text entered by the visitor, then the text will be removed. And so on.
|
|
|
|
Luckily, there are other ways to add HTML besides `innerHTML`, and we'll study them soon.
|
|
|
|
## outerHTML: full HTML of the element
|
|
|
|
The `outerHTML` property contains the full HTML of the element. That's like `innerHTML` plus the element itself.
|
|
|
|
Here's an example:
|
|
|
|
```html run
|
|
<div id="elem">Hello <b>World</b></div>
|
|
|
|
<script>
|
|
alert(elem.outerHTML); // <div id="elem">Hello <b>World</b></div>
|
|
</script>
|
|
```
|
|
|
|
Unlike `innerHTML`, writing to `outerHTML` does not change the element. Instead, it replaces it as a whole in the outer context. Yeah, sounds strange, and strange it is. Take a look.
|
|
|
|
Consider the example:
|
|
|
|
```html run
|
|
<div>Hello, world!</div>
|
|
|
|
<script>
|
|
let div = document.querySelector('div');
|
|
|
|
*!*
|
|
// replace div.outerHTML with <p>...</p>
|
|
*/!*
|
|
div.outerHTML = '<p>A new element!</p>'; // (*)
|
|
|
|
*!*
|
|
// Wow! The div is still the same!
|
|
*/!*
|
|
alert(div.outerHTML); // <div>Hello, world!</div>
|
|
</script>
|
|
```
|
|
|
|
In the line `(*)` we take the full HTML of `<div>...</div>` and replace it by `<p>...</p>`. In the outer document we can see the new content instead of the `<div>`. But the old `div` variable is still the same.
|
|
|
|
The `outerHTML` assignment does not modify the DOM element, but extracts it from the outer context and inserts a new piece of HTML instead of it.
|
|
|
|
Novice developers sometimes make an error here: they modify `div.outerHTML` and then continue to work with `div` as if it had the new content in it.
|
|
|
|
That's possible with `innerHTML`, but not with `outerHTML`.
|
|
|
|
We can write to `outerHTML`, but should keep in mind that it doesn't change the element we're writing to. It creates the new content on its place instead. We can get a reference to new elements by querying DOM.
|
|
|
|
## nodeValue/data: text node content
|
|
|
|
The `innerHTML` property is only valid for element nodes.
|
|
|
|
Other node types have their counterpart: `nodeValue` and `data` properties. These two are almost the same for practical use, there are only minor specification differences. So we'll use `data`, because it's shorter.
|
|
|
|
We can read it, like this:
|
|
|
|
```html run height="50"
|
|
<body>
|
|
Hello
|
|
<!-- Comment -->
|
|
<script>
|
|
let text = document.body.firstChild;
|
|
*!*
|
|
alert(text.data); // Hello
|
|
*/!*
|
|
|
|
let comment = text.nextSibling;
|
|
*!*
|
|
alert(comment.data); // Comment
|
|
*/!*
|
|
</script>
|
|
</body>
|
|
```
|
|
|
|
For text nodes we can imagine a reason to read or modify them, but why comments? Usually, they are not interesting at all, but sometimes developers embed information into HTML in them, like this:
|
|
|
|
```html
|
|
<!-- if isAdmin -->
|
|
<div>Welcome, Admin!</div>
|
|
<!-- /if -->
|
|
```
|
|
|
|
...Then JavaScript can read it and process embedded instructions.
|
|
|
|
## textContent: pure text
|
|
|
|
The `textContent` provides access to the *text* inside the element: only text, minus all `<tags>`.
|
|
|
|
For instance:
|
|
|
|
```html run
|
|
<div id="news">
|
|
<h1>Headline!</h1>
|
|
<p>Martians attack people!</p>
|
|
</div>
|
|
|
|
<script>
|
|
// Headline! Martians attack people!
|
|
alert(news.textContent);
|
|
</script>
|
|
```
|
|
|
|
As we can see, only text is returned, as if all `<tags>` were cut out, but the text in them remained.
|
|
|
|
In practice, reading such text is rarely needed.
|
|
|
|
**Writing to `textContent` is much more useful, because it allows to write text the "safe way".**
|
|
|
|
Let's say we have an arbitrary string, for instance entered by a user, and want to show it.
|
|
|
|
- With `innerHTML` we'll have it inserted "as HTML", with all HTML tags.
|
|
- With `textContent` we'll have it inserted "as text", all symbols are treated literally.
|
|
|
|
Compare the two:
|
|
|
|
```html run
|
|
<div id="elem1"></div>
|
|
<div id="elem2"></div>
|
|
|
|
<script>
|
|
let name = prompt("What's your name?", "<b>Winnie-the-pooh!</b>");
|
|
|
|
elem1.innerHTML = name;
|
|
elem2.textContent = name;
|
|
</script>
|
|
```
|
|
|
|
1. The first `<div>` gets the name "as HTML": all tags become tags, so we see the bold name.
|
|
2. The second `<div>` gets the name "as text", so we literally see `<b>Winnie-the-pooh!</b>`.
|
|
|
|
In most cases, we expect the text from a user, and want to treat it as text. We don't want unexpected HTML in our site. An assignment to `textContent` does exactly that.
|
|
|
|
## The "hidden" property
|
|
|
|
The "hidden" attribute and the DOM property specifies whether the element is visible or not.
|
|
|
|
We can use it in HTML or assign using JavaScript, like this:
|
|
|
|
```html run height="80"
|
|
<div>Both divs below are hidden</div>
|
|
|
|
<div hidden>With the attribute "hidden"</div>
|
|
|
|
<div id="elem">JavaScript assigned the property "hidden"</div>
|
|
|
|
<script>
|
|
elem.hidden = true;
|
|
</script>
|
|
```
|
|
|
|
Technically, `hidden` works the same as `style="display:none"`. But it's shorter to write.
|
|
|
|
Here's a blinking element:
|
|
|
|
|
|
```html run height=50
|
|
<div id="elem">A blinking element</div>
|
|
|
|
<script>
|
|
setInterval(() => elem.hidden = !elem.hidden, 1000);
|
|
</script>
|
|
```
|
|
|
|
## More properties
|
|
|
|
DOM elements also have additional properties, many of them provided by the class:
|
|
|
|
- `value` -- the value for `<input>`, `<select>` and `<textarea>` (`HTMLInputElement`, `HTMLSelectElement`...).
|
|
- `href` -- the "href" for `<a href="...">` (`HTMLAnchorElement`).
|
|
- `id` -- the value of "id" attribute, for all elements (`HTMLElement`).
|
|
- ...and much more...
|
|
|
|
For instance:
|
|
|
|
```html run height="80"
|
|
<input type="text" id="elem" value="value">
|
|
|
|
<script>
|
|
alert(elem.type); // "text"
|
|
alert(elem.id); // "elem"
|
|
alert(elem.value); // value
|
|
</script>
|
|
```
|
|
|
|
Most standard HTML attributes have the corresponding DOM property, and we can access it like that.
|
|
|
|
If we want to know the full list of supported properties for a given class, we can find them in the specification. For instance, HTMLInputElement is documented at <https://html.spec.whatwg.org/#htmlinputelement>.
|
|
|
|
Or if we'd like to get them fast or interested in the concrete browser -- we can always output the element using `console.dir(elem)` and read the properties. Or explore "DOM properties" in Elements tab of the browser developer tools.
|
|
|
|
## Summary
|
|
|
|
Each DOM node belongs to a certain class. The classes form a hierarchy. The full set of properties and methods comes as the result of inheritance.
|
|
|
|
Main DOM node properties are:
|
|
|
|
`nodeType`
|
|
: Node type. We can get it from the DOM object class, but often we need just to see is it a text or element node. The `nodeType` property is good for that. It has numeric values, most important are: `1` -- for elements,`3` -- for text nodes. Read-only.
|
|
|
|
`nodeName/tagName`
|
|
: For elements, tag name (uppercased unless XML-mode). For non-element nodes `nodeName` describes what is it. Read-only.
|
|
|
|
`innerHTML`
|
|
: The HTML content of the element. Can modify.
|
|
|
|
`outerHTML`
|
|
: The full HTML of the element. A write operation into `elem.outerHTML` does not touch `elem` itself. Instead it gets replaced with the new HTML in the outer context.
|
|
|
|
`nodeValue/data`
|
|
: The content of a non-element node (text, comment). These two are almost the same, usually we use `data`. Can modify.
|
|
|
|
`textContent`
|
|
: The text inside the element, basically HTML minus all `<tags>`. Writing into it puts the text inside the element, with all special characters and tags treated exactly as text. Can safely insert user-generated text and protect from unwanted HTML insertions.
|
|
|
|
`hidden`
|
|
: When set to `true`, does the same as CSS `display:none`.
|
|
|
|
DOM nodes also have other properties depending on their class. For instance, `<input>` elements (`HTMLInputElement`) support `value`, `type`, while `<a>` elements (`HTMLAnchorElement`) support `href` etc. Most standard HTML attributes have the corresponding DOM property.
|
|
|
|
But HTML attributes and DOM properties are not always the same, as we'll see in the next chapter.
|