© Copyright 2004, 2005 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA.
You are granted a license to use, reproduce and create derivative works of this document.
This specification introduces features to HTML and the DOM that ease the authoring of Web-based applications. Additions include context menus, a direct-mode graphics canvas, inline popup windows, server-sent events, and more.
This is an archive copy of a working draft of Web Apps 1.0. It will be used as a milestone against which diffs can be generated, so that it is easier to track progress. Comments on this draft are very welcome, but it is suggested that you first check to see if the latest version has changed. If you do have comments, please send them to whatwg@whatwg.org. Thank you.
To find the latest version of this working draft, please follow the "Latest version" link above.
This draft may contain namespaces that use the uuid: URI
scheme. These are temporary and will be changed before those parts of the
specification are ready to be implemented in shipping products.
body element
section element
nav element
article element
blockquote element
aside element
h1, h2, h3, h4, h5, and h6 elements
header element
footer element
address element
a element
em element
strong element
small element
m element
abbr element
dfn element
i element
code element
var element
samp element
kbd element
sup and sub elements
q element
cite element
span element
bdo element
br element
datagrid element
DocumentWindow interface
Window interface
The World Wide Web's markup language has always been HTML. HTML was primarily designed as a language for semantically describing scientific documents, although its general design and adaptations over the years have enabled it to be used to describe a number of other types of documents.
The main area that has not been adequately addressed by HTML is a vague subject referred to as Web Applications. This specification attempts to rectify this, while at the same time updating the HTML specifications to address issues raised in the past few years.
This specification is limited to providing a semantic-level markup language and associated semantic-level scripting APIs for authoring accessible pages on the Web ranging from static documents to dynamic applications.
The scope of this specification does not include addressing presentation concerns.
The scope of this specification does not include documenting every HTML
or DOM feature supported by Web browsers. Browsers support many features
that are considered to be very bad for accessibility or that are otherwise
inappropriate. For example, the blink element is clearly
presentational and authors wishing to cause text to blink should instead
use CSS.
The scope of this specification is not to describe an entire operating system. In particular, office productivity applications, image manipulation, and other applications that users would be expected to use with high-end workstations on a daily basis are out of scope. In terms of applications, this specification is targeted specifically at applications that would be expected to be used by users on an occasional basis, or regularly but from disparate locations. Examples include online purchasing systems, searching systems, games (especially multiplayer online games), public telephone books or address books, and communications software (e-mail clients, instant messaging clients, discussion software).
For sophisticated cross-platform applications, there already exist several proprietary solutions (such as Mozilla's XUL and Macromedia's Flash). These solutions are evolving faster than any standards process could follow, and the requirements are evolving even faster. These systems are also significantly more complicated to specify, and are orders of magnitude more difficult to achieve interoperability with, than the solutions described in this document. Platform-specific solutions for such sophisticated applications (for example the MacOS X Core APIs) are even further ahead.
This spec is probably big enough to need a guide as to where to look for various things. Hence once the structure is stable we should probably fill out this section.
This section will probably be dropped in due course.
HTML, CSS, DOM, and JavaScript provide enough power that Web developers have managed to base entire businesses on them. What is required are extensions to these technologies to provide much-needed features such as:
DOMActivate is a start, but it lacks equivalent HTML attributes, and additional events may be needed.
Some less important features would be good to have as well:
Several of the features in these two lists have been supported in non-standard ways by some user agents for some time.
This specification represents a new version of HTML4 and XHTML1, along with a new version of the associated DOM2 HTML API. Migration from HTML4 or XHTML1 to the format and APIs described in this specification should in most cases be straightforward, as care has been taken to ensure that backwards-compatibility is retained.
XHTML2 [XHTML2] defines a new HTML vocabulary with better features for hyperlinks, multimedia content, annotating document edits, rich metadata, declarative interactive forms, and describing the semantics of human literary works such as poems and scientific papers.
However, it lacks elements to express the semantics of many of the non-document types of content often seen on the Web. For instance, forum sites, auction sites, search engines, online shops, and the like, do not fit the document metaphor well, and are not covered by XHTML2.
This specification aims to extend HTML so that it is also suitable in these contexts.
XHTML2 and this specification use different namespaces and therefore can both be implemented in the same XML processor.
This specification is designed to complement Web Forms 2.0. [WF2] Where Web Forms concentrates on input controls, data validation, and form submission, this specification concentrates on client-side user interface features needed to create modern applications.
Eventually WF2 will simply be folded into this spec.
This specification is independent of the various proprietary UI languages that various vendors provide.
As well as sections marked as non-normative, all diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in [RFC2119]. For readability, these words do not appear in all uppercase letters in this specification.
This specification describes the conformance criteria for user agents (implementations and their implementors) and documents (and their authors).
Conformance requirements phrased as requirements on elements, attributes, methods or objects are conformance requirements on user agents.
User agents fall into several (overlapping) categories with different conformance requirements.
Web browsers that support XHTML must process elements and attributes from the XHTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.
A conforming XHTML processor would, upon finding an
XHTML script element in an XML
document, execute the script contained in that element. However, if the
element is found within an XSLT transformation sheet (assuming the UA
also supports XSLT), then the processor would instead treat the script element as an opaque element that
forms part of the transform.
Web browsers that support HTML must
process documents labelled as text/html as described in
this specification, so that users can interact with them.
User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply with the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.
Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.
A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.
Implementations that do not support scripting (or which have their scripting features disabled) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.
Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.
Conformance checkers must verify that a document conforms to the
applicable conformance criteria described in this specification.
Conformance checkers are exempt from detecting errors that require
interpretation of the author's intent (for example, while a document is
non-conforming if the content of a blockquote element is not a quote,
conformance checkers do not have to check that blockquote elements only contain quoted
material).
The term "validation" specifically refers to a subset of conformance checking that only verifies that a document complies with the requirements given by an SGML or XML DTD. Conformance checkers that only perform validation are non-conforming, as there are many conformance requirements described in this specification that cannot be checked by SGML or XML DTDs.
To put it another way, there are three types of conformance criteria:
A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.
Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance with the semantics of the documents that they process.
A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.
Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.
For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as XHTML), and one using a custom format inspired by SGML (referred to as HTML). Implementations may support only one of these two formats, although supporting both is encouraged.
XML documents using elements from the XHTML namespace that use the new
features described in this specification and that are served over the wire
(e.g. by HTTP) must be sent using an XML MIME type such as
application/xml or application/xhtml+xml and
must not be served as text/html. [RFC3023]
These XML documents may contain a DOCTYPE if desired, but
this is not required to conform to this specification.
HTML documents that use the new features described in this specification
and that are served over the wire (e.g. by HTTP) must be sent as
text/html and must start with the following DOCTYPE:
<!DOCTYPE html>.
This specification refers to both HTML and XML attributes and DOM attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and DOM attributes for those from the DOM. Similarly, the term "properties" is used for both ECMAScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.
To ease migration from HTML to XHTML, UAs conforming to this
specification must place elements in HTML in the
http://www.w3.org/1999/xhtml namespace, at least for the
purposes of the DOM and CSS. The term "elements in the
HTML namespace", when used in this specification, thus refers to
both HTML and XHTML elements.
Unless otherwise stated, all elements defined or mentioned in this
specification are in the http://www.w3.org/1999/xhtml
namespace, and all attributes defined or mentioned in this specification
have no namespace (they are in the per-element partition).
Generally, when the specification states that a feature applies to HTML or XHTML, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".
For readability, the term URI is used to refer to both ASCII URIs and Unicode IRIs, as those terms are defined by [RFC3986] and [RFC3987] respectively. On the rare occasions where IRIs are not allowed but ASCII URIs are, this is called out explicitly.
The term root element, when not qualified to explicitly refer to the document's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself if there is none. When the node is a part of the document, then that is indeed the document's root element. However, if the node is not currently part of the document tree, the root element will be an orphaned node.
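As a non-normative illustration, the definition of root element can be sketched as a walk up the ancestor chain. The node shape used here (`name`, `parent`) is a hypothetical stand-in for DOM nodes, and the sketch assumes that `parent` only ever refers to an element node or is null.

```javascript
// Hypothetical node shape: { name, parent }, where parent is another
// element node or null. This is not a real DOM API.
function rootElementOf(node) {
  let current = node;
  // Walk up until there is no further ancestor element.
  while (current.parent !== null) {
    current = current.parent;
  }
  return current;
}

const html = { name: "html", parent: null };
const body = { name: "body", parent: html };
const paragraph = { name: "p", parent: body };

// A node that is not part of the document tree is its own root element.
const orphan = { name: "div", parent: null };

console.log(rootElementOf(paragraph).name); // "html"
console.log(rootElementOf(orphan).name);    // "div"
```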
When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.
When an XML name, such as an attribute or element name, is referred to
in the form prefix:localName, as in xml:id or
svg:rect, it refers to a name with the local name localName and the namespace given by the prefix, as defined
by the following table:
xml: http://www.w3.org/XML/1998/namespace
html: http://www.w3.org/1999/xhtml
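The prefix convention can be illustrated with a small non-normative sketch. The function name `expandName` is hypothetical, and the lookup table contains only the prefixes listed in the table as it appears here; any other prefix is treated as unknown.

```javascript
// Prefix table, limited to the entries given in this section.
const NAMESPACES = new Map([
  ["xml", "http://www.w3.org/XML/1998/namespace"],
  ["html", "http://www.w3.org/1999/xhtml"],
]);

// Split "prefix:localName" and look the prefix up in the table.
function expandName(qualifiedName) {
  const colon = qualifiedName.indexOf(":");
  if (colon === -1) {
    throw new Error("expected a name of the form prefix:localName");
  }
  const prefix = qualifiedName.slice(0, colon);
  const localName = qualifiedName.slice(colon + 1);
  if (!NAMESPACES.has(prefix)) {
    throw new Error("unknown prefix: " + prefix);
  }
  return { namespace: NAMESPACES.get(prefix), localName: localName };
}

console.log(expandName("xml:id"));
```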
For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.
This specification uses the term HTML documents to generally refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications.
Various DOM interfaces are defined in this specification using pseudo-IDL. This looks like OMG IDL but isn't. For instance, method overloading is used, and types from the W3C DOM specifications are used without qualification. Language-specific bindings for these abstract interface definitions must be derived in the way consistent with W3C DOM specifications. Some interface-specific binding information for ECMAScript is included in this specification.
The construction "a Foo object", where Foo is
actually an interface, is sometimes used instead of the more accurate "an
object implementing the interface Foo".
As the specification evolves, these conformance requirements will most likely be moved to more appropriate places.
When a UA needs to convert a string to a number, algorithms equivalent to those specified in ECMA262 sections 9.3.1 ("ToNumber Applied to the String Type") and 8.5 ("The Number type") should be used (possibly after suitably altering the algorithms to handle numbers of the range that the UA can support). [ECMA262]
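For illustration only: in ECMAScript itself, calling `Number()` on a string applies the ToNumber conversion referred to above, so the expected behaviour can be explored directly. The wrapper name `toNumber` is hypothetical.

```javascript
// Number(string) applies the ECMAScript ToNumber conversion for strings.
function toNumber(text) {
  return Number(text);
}

console.log(toNumber(" 12.5 ")); // 12.5 (surrounding whitespace is ignored)
console.log(toNumber("1e3"));    // 1000
console.log(toNumber("12px"));   // NaN (trailing garbage is not accepted)
```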
The alt attribute on images must not be
shown in a tooltip in visual browsers.
DOM mutation events must not fire for
changes caused by the UA parsing the document. (Conceptually, the parser
is not mutating the DOM, it is constructing it.) This includes the parsing
of any content inserted using document.write() and
document.writeln() calls. Other changes, including fragment
insertions involving innerHTML and similar attributes, must
fire mutation events. [DOM3EVENTS]
The default value of
Content-Style-Type and the default value of the type attribute of the
style element is
text/css.
The default value of
Content-Script-Type and the default value of the type attribute of the
script element is the ECMAScript MIME
type.
User agents must follow the rules given by XML Base to resolve relative URIs in HTML and XHTML fragments. [XMLBASE]
It is possible for xml:base attributes to be
present even in HTML fragments, as such attributes can be added
dynamically using script.
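A minimal, non-normative sketch of XML Base style resolution follows. It assumes a hypothetical node shape (`xmlBase`, `parent`) and uses the `URL` class available in browsers and Node.js as a stand-in for the resolution algorithm; the two are expected to agree for simple cases such as this one, but that equivalence is an assumption of the sketch.

```javascript
// Hypothetical node shape: { xmlBase, parent }. xmlBase is a string or
// null; parent is another node or null.
function baseURIOf(node, documentURI) {
  if (node === null) {
    return documentURI;
  }
  // An xml:base value may itself be relative to the parent's base.
  const parentBase = baseURIOf(node.parent, documentURI);
  return node.xmlBase === null
    ? parentBase
    : new URL(node.xmlBase, parentBase).href;
}

function resolveURI(relative, node, documentURI) {
  return new URL(relative, baseURIOf(node, documentURI)).href;
}

const outer = { xmlBase: "b/", parent: null };
const inner = { xmlBase: "c/", parent: outer };
console.log(resolveURI("x.png", inner, "http://example.org/a/doc.html"));
// "http://example.org/a/b/c/x.png"
```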
Elements, attributes, and attribute values in HTML are defined (by this
specification) to have certain meanings (semantics). For example, the
ol element represents an ordered list, and
the lang attribute
represents the language of the content.
Authors must only use elements, attributes, and attribute values for their appropriate semantic purposes.
For example, the following document is non-conforming, despite being syntactically correct:
<!DOCTYPE html>
<html lang="en-GB">
<head> <title> Demonstration </title> </head>
<body>
<table>
<tr> <td> My favourite animal is the cat. </td> </tr>
<tr>
<td>
—<a href="http://example.org/~ernest/"><cite>Ernest</cite></a>,
in an essay from 1992
</td>
</tr>
</table>
</body>
</html>
...because the data placed in the cells is clearly not tabular data. A corrected version of this document might be:
<!DOCTYPE html>
<html lang="en-GB">
<head> <title> Demonstration </title> </head>
<body>
<blockquote>
<p> My favourite animal is the cat. </p>
</blockquote>
<p>
—<a href="http://example.org/~ernest/"><cite>Ernest</cite></a>,
in an essay from 1992
</p>
</body>
</html>
This next document fragment, intended to represent the heading of a corporate site, is similarly non-conforming because the second line is not intended to be a heading of a subsection, but merely a subheading or subtitle (a subordinate heading for the same section).
<body>
<h1>ABC Company</h1>
<h2>Leading the way in widget design since 1432</h2>
...
The header element should be used
in these kinds of situations:
<body>
<header>
<h1>ABC Company</h1>
<h2>Leading the way in widget design since 1432</h2>
</header>
...
All the elements in this specification have a defined content model, which describes what nodes are allowed inside the elements, and thus what the structure of an HTML document or fragment must look like. Authors must only put elements inside an element if that element allows them to be there according to its content model.
For the purposes of determining if an element matches its content model or not, CDATA nodes in the DOM must be treated as text nodes, and character entity reference nodes must be treated as if they were expanded in place.
The whitespace characters U+0020 SPACE, U+000A LINE FEED, and U+000D CARRIAGE RETURN are always allowed between elements. User agents must always represent these characters between elements in the source markup as text nodes in the DOM. Empty text nodes and text nodes consisting of just sequences of those characters are considered inter-element whitespace and must be ignored when establishing whether an element matches its content model or not.
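The inter-element whitespace test can be sketched as follows (non-normative; the function name is hypothetical). Only the three characters named above are accepted, so a text node containing, for example, U+0009 CHARACTER TABULATION does not count.

```javascript
// True for empty text nodes and for text nodes made up solely of
// U+0020 SPACE, U+000A LINE FEED, and U+000D CARRIAGE RETURN.
function isInterElementWhitespace(data) {
  return /^[\u0020\u000A\u000D]*$/.test(data);
}

console.log(isInterElementWhitespace(" \n\r ")); // true
console.log(isInterElementWhitespace("\t"));     // false
```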
Authors must only use elements from the HTML namespace in the contexts where they are allowed, as defined for each element. For XML compound documents, these contexts could be inside elements from other namespaces, if those elements are defined as providing the relevant contexts.
The SVG specification defines the SVG foreignObject
element as allowing foreign namespaces to be included, thus allowing
compound documents to be created by inserting subdocument content under
that element. This specification defines the XHTML html element as being allowed where subdocument
fragments are allowed in a compound document. Together, these two
definitions mean that placing an XHTML html element as a child of an SVG
foreignObject element is conforming.
The Document Object Model (DOM) is a representation — a model — of the document and its content. [DOM3CORE] The DOM is not just an API; operations on the in-memory document are defined, in this specification, in terms of the DOM.
HTML elements in the DOM, including XHTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform), must implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification.
The basic interface, from which all the HTML elements' interfaces
inherit, and which is used by elements that have no additional
requirements, is the HTMLElement
interface (defined below).
To ease migration from HTML to XHTML, UAs must
assign the http://www.w3.org/1999/xhtml namespace to elements
that are parsed in documents labelled as text/html, at
least for the purposes of the DOM and CSS.
In HTML documents, for HTML elements, the DOM APIs must return tag names and attribute names in uppercase, regardless of the case with which they were created. This does not apply to XML documents; in XML documents, the DOM APIs must always return tag names and attribute names in the original case used to create those nodes.
DOM3 Core defines mechanisms for checking for interface support, and for obtaining implementations of interfaces, using feature strings. [DOM3CORE]
A DOM application can use the hasFeature(feature, version) method of the
DOMImplementation interface with parameter values "HTML" and "5.0" (respectively) to determine
whether or not this module is supported by the implementation. In addition
to the feature string "HTML", the feature string
"XHTML" (with version string "5.0") can
be used to check if the implementation supports XHTML. User agents should
respond with a true value when the hasFeature method is queried with these
values. Authors are cautioned, however, that UAs returning true might not
be perfectly compliant, and that UAs returning false might well have
support for features in this specification; in general, therefore, use of
this method is discouraged.
The values "HTML" and "XHTML" (both with version "5.0") should also
be supported in the context of the getFeature() and
isSupported() methods, as defined by DOM3 Core.
The interfaces defined in this specification are not always
supersets of the interfaces defined in DOM2 HTML; some features that were
formerly deprecated, poorly supported, rarely used or considered
unnecessary have been removed. Therefore it is not guaranteed that an
implementation that supports "HTML"
"5.0" also supports "HTML"
"2.0".
Still need to define HTMLCollection.
interface DOMTokenString {
bool has(in DOMString token);
void add(in DOMString token);
void remove(in DOMString token);
}
Need to define those members.
Every XML and HTML document in an HTML UA must be represented by a
Document object. [DOM3CORE]
This object must also implement the document-level interface of any
other namespaces found in the document that the UA supports. For example,
if the implementation supports both HTML and SVG, then the
Document object must also implement HTMLDocument and SVGDocument.
The Document object of documents that are being rendered in
a browsing context must also
implement the DocumentWindow
interface.
interface HTMLDocument : Document {
attribute DOMString title;
readonly attribute DOMString referrer;
readonly attribute DOMString domain;
readonly attribute DOMString URL;
attribute HTMLElement body;
readonly attribute HTMLCollection images;
readonly attribute HTMLCollection applets;
readonly attribute HTMLCollection links;
readonly attribute HTMLCollection forms;
readonly attribute HTMLCollection anchors;
attribute DOMString cookie;
void open();
void close();
void write(in DOMString text);
void writeln(in DOMString text);
NodeList getElementsByName(in DOMString elementName);
};
Need to define those members.
Some DOM attributes are defined to reflect a particular content attribute. This means that on getting, the DOM attribute returns the current value of the content attribute, and on setting, the DOM attribute changes the value of the content attribute to the given value.
If a reflecting DOM attribute is a DOMString attribute
defined to contain a URI, then on getting, the DOM attribute returns the
value of the content attribute, resolved to an absolute URI, and on
setting, sets the content attribute to the specified literal value. If the
content attribute is absent, the DOM attribute must return the default
value, if the content attribute has one, or else the empty string.
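The asymmetry between getting and setting a URI-valued reflecting attribute can be sketched as below. This is a non-normative illustration: the function names and the use of a `Map` for content attributes are hypothetical, the `URL` class stands in for URI resolution, and the default value is returned unresolved because this section does not say otherwise.

```javascript
// Getting: resolve the content attribute to an absolute URI. A null
// contentValue models an absent content attribute.
function getReflectedURI(contentValue, baseURI, defaultValue) {
  if (contentValue === null) {
    return defaultValue !== undefined ? defaultValue : "";
  }
  return new URL(contentValue, baseURI).href;
}

// Setting: store the literal value, without resolving it.
function setReflectedURI(attributes, name, value) {
  attributes.set(name, value);
}

const attributes = new Map();
setReflectedURI(attributes, "href", "../images/cat.png");
console.log(attributes.get("href")); // "../images/cat.png"
console.log(
  getReflectedURI(attributes.get("href"), "http://example.org/docs/page.html")
); // "http://example.org/images/cat.png"
```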
If a reflecting DOM attribute is a DOMString attribute that
is not defined to contain a URI, then the getting and setting is done in a
transparent, case-sensitive manner, except if the content attribute is
defined to only allow a specific set of values. In this latter case, the
attribute's value is first converted to lowercase before being returned.
If the content attribute is absent, the DOM attribute must return the
default value, if the content attribute has one, or else the empty string.
If a reflecting DOM attribute is a boolean attribute, then the DOM attribute returns true if the attribute is set, and false if it is absent. On setting, the content attribute is removed if the DOM attribute is set to false, and is set to have the same value as its name if the DOM attribute is set to true.
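A non-normative sketch of boolean reflection, modelling an element's content attributes as a `Map` (a hypothetical stand-in for the real attribute list):

```javascript
// Getting: true if the content attribute is present, false if absent.
function getReflectedBoolean(attributes, name) {
  return attributes.has(name);
}

// Setting: true sets the content attribute to its own name; false
// removes the content attribute.
function setReflectedBoolean(attributes, name, value) {
  if (value) {
    attributes.set(name, name);
  } else {
    attributes.delete(name);
  }
}

const attributes = new Map();
setReflectedBoolean(attributes, "disabled", true);
console.log(attributes.get("disabled"));                   // "disabled"
console.log(getReflectedBoolean(attributes, "disabled"));  // true
setReflectedBoolean(attributes, "disabled", false);
console.log(getReflectedBoolean(attributes, "disabled"));  // false
```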
If a reflecting DOM attribute is a numeric type (long) then
the content attribute must be converted to a numeric
type first (truncating any fractional part). If that fails, or if the
attribute is absent, the default value should be returned instead, or 0 if
there is no default value. On setting, the given value is converted to a
string representing the number in base ten and then that string should be
used as the new content attribute value.
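The numeric case might be sketched as follows. This is a non-normative illustration under stated assumptions: the function names are hypothetical, `Number()` is used as the string-to-number conversion, and an empty or whitespace-only attribute value is treated as a failed conversion, which this section does not itself specify.

```javascript
// Getting: convert to a number, truncating any fractional part. A null
// contentValue models an absent content attribute.
function getReflectedLong(contentValue, defaultValue = 0) {
  if (contentValue === null || contentValue.trim() === "") {
    return defaultValue; // absent, or (by assumption) not convertible
  }
  const parsed = Number(contentValue);
  if (!Number.isFinite(parsed)) {
    return defaultValue; // conversion failed
  }
  return Math.trunc(parsed);
}

// Setting: the new content attribute value is the number in base ten.
function setReflectedLong(value) {
  return value.toString(10);
}

console.log(getReflectedLong("42.9"));  // 42
console.log(getReflectedLong("abc"));   // 0
console.log(getReflectedLong(null, 7)); // 7
console.log(setReflectedLong(15));      // "15"
```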
Some elements are defined in terms of their DOM textContent attribute. This is an
attribute defined on the Node interface in DOM3 Core. [DOM3CORE]
Should textContent be defined differently for dir="" and <bdo>? Should we come up with an alternative to textContent that handles those and other things, like alt=""?
Each element in HTML falls into zero or more categories that group elements with similar characteristics together. This specification uses the following categories:
Some elements have unique requirements and do not fit into any particular category.
Block-level elements are used for structural grouping of page content.
There are several kinds of block-level elements:
blockquote, section, article, header.
p, h1-h6, address.
nav, aside, footer, div.
ul, ol, dl, table, script.
There are also elements that seem to be block-level but aren't, such as
body, li, dt, dd, and td. These elements are allowed
only in specific places, not simply anywhere that block-level elements are
allowed.
Some block-level elements play multiple roles. For instance, the
script element is allowed inside
head elements and can also be used as
inline-level content. Similarly,
the ul, ol, dl,
table, and blockquote
elements play dual roles as both block-level and inline-level elements.
Inline-level content consists of text and various elements to annotate the text, as well as some embedded content (such as images or sound clips).
Inline-level content comes in various types:
a, i, noscript. Elements used in contexts allowing
only strictly inline-level content must not contain anything other than
strictly inline-level content.
ol, blockquote, table.
Unless an element's content model explicitly states that it must contain significant inline content, simply having no text nodes and no elements satisfies an element whose content model is some kind of inline content.
Some elements are defined to have as a content model significant inline content. This means that at least one descendant of the element must be significant text or embedded content.
Significant text, for the purposes of determining the presence of significant inline content, consists of any character other than those falling in the Unicode categories Zs, Zl, Zp, Cc, and Cf. [UNICODE]
The following three paragraphs are non-conforming because their content model is not satisfied (they all count as empty).
<p></p>
<p><em> </em></p>
<p>
<ol>
<li></li>
</ol>
</p>
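The significant-text test can be sketched with a Unicode property escape, as a non-normative illustration. The function name is hypothetical, and the sketch covers text only; embedded content, which also counts as significant inline content, is not modelled.

```javascript
// Matches any character outside the Unicode general categories
// Zs, Zl, Zp, Cc, and Cf.
const SIGNIFICANT_CHARACTER = /[^\p{Zs}\p{Zl}\p{Zp}\p{Cc}\p{Cf}]/u;

function hasSignificantText(text) {
  return SIGNIFICANT_CHARACTER.test(text);
}

console.log(hasSignificantText(""));       // false
console.log(hasSignificantText(" \n\t"));  // false (Zs and Cc only)
console.log(hasSignificantText("\u00AD")); // false (SOFT HYPHEN is Cf)
console.log(hasSignificantText(" cat "));  // true
```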
Some elements are defined to have content models that allow either
block-level elements or inline-level content, but not both. For
example, the aside and li elements.
To establish whether such an element is being used as a block-level container or as an inline-level container, for example in order to determine if a document conforms to these requirements, user agents must look at the element's child nodes. If any of the child nodes are not allowed in block-level contexts, then the element is being used for inline-level content. If all the child nodes are allowed in a block-level context, then the element is being used for block-level elements.
For instance, in the following (non-conforming) fragment, the li element is being used as an inline-level
element container, because the style
element is not allowed in a block-level context. (It doesn't matter, for
the purposes of determining whether it is an inline-level or block-level
context, that the style element is not
allowed in inline-level contexts either.)
<ol>
<li>
<p> Hello World </p>
<style>
/* This example is illegal. */
</style>
</li>
</ol>
In the following fragment, the aside
element is being used as a block-level container, because even though all
the elements it contains could be considered inline-level elements, there
are no nodes that can only be considered inline-level.
<aside>
<ol> <li> ... </li> </ol>
<ul> <li> ... </li> </ul>
</aside>
On the other hand, in the following similar fragment, the aside element is an inline-level container,
because the text ("Foo") can only be considered inline-level.
<aside>
<ol> <li> ... </li> </ol>
Foo
</aside>
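The rule illustrated by these fragments can be sketched as a check over an element's child nodes. This is a non-normative sketch: the child-node shape and the function name are hypothetical, the set of names allowed in block-level contexts is taken from the block-level elements listed earlier in this section, and inter-element whitespace is assumed to be skipped.

```javascript
// Element names allowed in block-level contexts, per the lists of
// block-level elements given earlier in this section.
const BLOCK_ALLOWED = new Set([
  "blockquote", "section", "article", "header",
  "p", "h1", "h2", "h3", "h4", "h5", "h6", "address",
  "nav", "aside", "footer", "div",
  "ul", "ol", "dl", "table", "script",
]);

// Hypothetical child shape: { type: "element", name } or
// { type: "text", data }.
function classifyContainer(children) {
  for (const child of children) {
    if (child.type === "text") {
      // Assumption: inter-element whitespace is ignored.
      if (/^[\u0020\u000A\u000D]*$/.test(child.data)) continue;
      return "inline"; // other text can only be inline-level
    }
    if (!BLOCK_ALLOWED.has(child.name)) {
      return "inline"; // a child not allowed in block-level contexts
    }
  }
  return "block"; // every child is allowed in a block-level context
}

console.log(classifyContainer([
  { type: "element", name: "ol" },
  { type: "text", data: " Foo " },
])); // "inline"
```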
Certain elements in HTML can be activated, for instance a elements, button elements, or
input elements when their type attribute is set
to radio. Activation of those elements can happen in various
(UA-defined) ways, for instance via the mouse or keyboard.
When activation is performed via some method other than clicking the
pointing device, the default action of the event that triggers the
activation must, instead of activating the element directly, be the
dispatching of a new event, click,
on the same element, with the mouse-specific fields (button,
screenX, etc) set to zero, and the key fields set according
to the current state of the key input device, if any (false for any keys
that are not available). [DOM3EVENTS]
The default action of this click event, or of the real
click event if the element was activated by clicking a
pointing device, shall be to dispatch yet another event, namely DOMActivate.
It is the default action of that event that then performs the
actual action.
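The chain of default actions described above can be modelled with a small non-normative sketch. It is a simulation only: the `dispatch` and `activate` helpers are hypothetical, no real event listeners are involved, and the key fields of the synthesized click event are omitted for brevity.

```javascript
// Dispatch model: record the event, then run its default action.
function dispatch(event, log, defaultAction) {
  log.push(event);
  defaultAction();
}

function activate(viaPointingDevice) {
  const log = [];
  // A click synthesized for non-pointer activation has its
  // mouse-specific fields set to zero.
  const click = viaPointingDevice
    ? { type: "click" }
    : { type: "click", button: 0, screenX: 0, screenY: 0 };
  dispatch(click, log, () => {
    // Default action of click: dispatch DOMActivate.
    dispatch({ type: "DOMActivate" }, log, () => {
      // Default action of DOMActivate: perform the actual action.
      log.push({ type: "(actual action)" });
    });
  });
  return log.map((entry) => entry.type);
}

console.log(activate(false)); // [ 'click', 'DOMActivate', '(actual action)' ]
```

Either way the element is activated, the same sequence results; only the origin and field values of the click event differ.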
For certain form controls, this process is complicated further by changes that must happen around the click event. [WF2]
Most interactive elements have content models that disallowed nesting interactive elements.
Need to define how default actions actually work. For instance, if you click an element inside a link, the event is triggered on that element, but then we'd like a click to be sent on the link itself. So how does that happen? Does the link have a bubbling listener that triggers that second click event? What if there are multiple nested links: which one should we send that event to?
User agents must support the following common attributes on all elements in the HTML namespace (including elements that are not defined to exist by this specification).
id
The element's unique identifier. The value must be unique in the document and must contain at least one character.
If the value is not the empty string, user agents must associate the
element with the given value (exactly) for the purposes of ID matching
(e.g. for selectors in CSS or for the getElementById()
method in the DOM).
Identifiers are opaque strings. Particular meanings should not be
derived from the value of the id
attribute.
When an element has an ID set through multiple methods (for example,
if it has both id and xml:id
attributes simultaneously [XMLID]), then the
element has multiple identifiers. User agents must use all of an HTML
element's identifiers (including those that are in error according to
their relevant specification) for the purposes of ID matching.
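The following non-normative sketch, in Python, illustrates ID matching for an element that carries more than one identifier. It assumes an XML serialisation parsed with the standard library's ElementTree; the helper names identifiers and get_element_by_id are hypothetical and are not the DOM method itself.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def identifiers(element):
    """Collect every identifier set on the element (id and xml:id)."""
    found = set()
    for attribute in ("id", XML_ID):
        value = element.get(attribute)
        if value:  # an empty value is never associated with the element
            found.add(value)
    return found


def get_element_by_id(root, wanted):
    """Return the first element, in tree order, with an exactly matching identifier."""
    for element in root.iter():
        if wanted in identifiers(element):
            return element
    return None


doc = ET.fromstring('<div><p id="a" xml:id="b">x</p><p id="">y</p></div>')
```

Here both "a" and "b" find the first paragraph, and the empty string matches nothing.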
title
Advisory information for the element, such as would be appropriate for a tooltip. On a link, this could be the title or a description of the target resource; on an image, it could be the caption or a description of the image; on a paragraph, it could be a footnote or commentary on the text; on a citation, it could be further information about the source; and so forth. The value is text.
If this attribute is omitted from an element, then it implies that the
title attribute of
the nearest ancestor with a title attribute set is also relevant to this
element. Setting the attribute overrides this, explicitly stating that
the advisory information of any ancestors is not relevant to this
element. Setting the attribute to the empty string indicates that the
element has no advisory information.
The link, style, abbr,
and dfn elements define their own title attributes instead of using the global title attribute.
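As a non-normative illustration of how the advisory information is found, the sketch below walks from an element up through its ancestors and stops at the first title attribute it sees, including an empty one. It uses Python's xml.dom.minidom; the function name advisory_title is hypothetical, and the elements that define their own title attribute are not treated specially.

```python
from xml.dom import minidom


def advisory_title(element):
    """Return the advisory text relevant to element, or '' if there is none."""
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute("title"):
            # An explicit value, even the empty string, overrides ancestors.
            return node.getAttribute("title")
        node = node.parentNode
    return ""


doc = minidom.parseString(
    '<div title="Outer"><p><em>a</em></p><p title="">b</p></div>'
)
em = doc.getElementsByTagName("em")[0]
second_p = doc.getElementsByTagName("p")[1]
```

The em element picks up "Outer" from its ancestor, while the second paragraph explicitly has no advisory information.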
lang (HTML only) and xml:lang (XML only)
The primary language for the element's contents and for any of the element's attributes that contain text. The value must be a valid RFC 3066 language code, or the empty string. [RFC3066]
If this attribute is omitted from an element, then it implies that the language of this element is the same as the language of the parent element. Setting the attribute to the empty string indicates that the primary language is unknown.
The lang attribute only applies to
HTML documents. Authors must not use the lang attribute in XML documents. Authors must
instead use the xml:lang attribute,
defined in XML. [XML]
To determine the language of a node, user agents must look at the
nearest ancestor element (including the element itself if the node is an
element) that has a lang or xml:lang attribute set. That specifies the
language of the node.
If both the xml:lang attribute and
the lang attribute are set, user agents
must use the xml:lang attribute, and
the lang attribute must be ignored for
the purposes of determining the element's language.
If no explicit language is given for the root element, then language information from a higher-level protocol (such as HTTP), if any, must be used as the final fallback language. In the absence of any language information, the default value is unknown (the empty string).
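The language rules above can be sketched as follows. This is a non-normative Python illustration that assumes an XML serialisation parsed with ElementTree; language_of, parents and protocol_language are hypothetical names, and protocol_language stands in for whatever a higher-level protocol such as HTTP supplied.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def language_of(element, parents, protocol_language=""):
    """Return the language of element; '' means the language is unknown."""
    node = element
    while node is not None:
        if XML_LANG in node.attrib:  # xml:lang wins when both are set
            return node.attrib[XML_LANG]
        if "lang" in node.attrib:
            return node.attrib["lang"]
        node = parents.get(node)
    # No explicit language anywhere up to the root: use the fallback.
    return protocol_language


root = ET.fromstring(
    '<html lang="en"><body>'
    '<p xml:lang="fr" lang="de">x</p><p lang="">y</p><p>z</p>'
    '</body></html>'
)
# ElementTree has no parent pointers, so build a child-to-parent map.
parents = {child: parent for parent in root.iter() for child in parent}
ps = root.findall("body/p")
```

The three paragraphs resolve to "fr", unknown (the empty string) and "en" respectively.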
User agents may use the element's language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, or for dictionary selection).
dir
The element's text directionality. The attribute, if specified, must
have either the literal value ltr or the literal value
rtl.
If the attribute has the literal value ltr, the element's
directionality is left-to-right. If the attribute has the literal value
rtl, the element's directionality is right-to-left. If the
attribute is omitted or has another value, then the directionality is
unchanged.
The processing of this attribute depends on the presentation layer. For example, CSS 2.1 defines a mapping from this attribute to the CSS 'direction' and 'unicode-bidi' properties, and defines rendering in terms of those properties.
class
The element's classes. The value must be a list of zero or more words (consisting of one or more non-space characters) separated by one or more spaces.
User agents must assign all the given classes to the element, for the
purposes of class matching (e.g. for selectors in CSS or for the
getElementsByClassName()
method in the DOM).
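A minimal, non-normative sketch of class matching in Python follows. The function names are hypothetical, and str.split() is assumed to be an acceptable stand-in for splitting on spaces (it also splits on other whitespace).

```python
def classes_of(class_attribute):
    """Split a class attribute value into its set of class names."""
    return set(class_attribute.split())


def matches_class(class_attribute, wanted):
    """True if wanted is one of the classes assigned by the attribute value."""
    return wanted in classes_of(class_attribute)
```

Matching is on whole words, so a value of "notes" does not match the class "note".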
Unless defined by one of the URIs given in the profile attribute, classes are opaque
strings. Particular meanings must not be derived from undefined values
in the class attribute.
Authors should bear in mind that using the class attribute does not convey any additional
meaning to the element (unless using classes defined by a profile). There is no semantic difference
between an element with a class attribute and one
without. Authors that use classes that are not defined in a
profile should make sure, therefore,
that their documents make as much sense once all class attributes have been removed as they do
with the attributes present.
Event handler attributes aren't handled yet.
The following DOM interface, common to elements in the HTML namespace, provides scripts with convenient access to the content attributes listed above:
interface HTMLElement : Element {
  attribute DOMString id;
  attribute DOMString title;
  attribute DOMString lang;
  attribute DOMString dir;
  attribute DOMString className;
};
The id attribute must
reflect the content id attribute.
The title
attribute must reflect the content
title attribute.
The lang attribute
must reflect the content lang attribute.
The dir attribute must
reflect the content dir attribute.
The className attribute must reflect the content class attribute.
should also introduce a DOMTokenString accessor for the class attribute
html element
Content model: head element followed by a body element.
DOM interface: HTMLElement.
The html element represents the root
of an HTML document.
Document metadata is represented by metadata
elements in the document's head
element.
head element
Context: html element.
Content model: title element, optionally one base element (HTML only), and zero or more other metadata elements (in particular, link, meta, style, and script).
Attributes: profile (optional)
interface HTMLHeadElement : HTMLElement { attribute DOMString profile; };
The head element collects the
document's metadata.
The profile attribute must, if
specified, contain a list of zero or more URIs (or IRIs) representing
definitions of classes, metadata names, and link relations. These URIs are
opaque strings, like namespaces; user agents are not expected to determine
any useful information from the resources that they reference.
Each time a class, metadata, or link relationship name that is not
defined by this specification is found in a document, the UA must check
whether any of the URIs in the profile
attribute are known (to the UA) to define that name. The class, metadata,
or link relationship shall then be interpreted using the semantics given
by the first URI that is known to define the name. If the name is not
defined by this specification and none of the specified URIs defines the
name either, then the class, metadata, or link relationship is meaningless
and the UA must not assign special meaning to that name.
If two profiles define the same name, then the semantic is given by the
first URI specified in the profile
attribute. There is no way to use the names from both profiles in one
document.
User agents must ignore all the URIs given in the profile attribute that follow a URI that the UA
does not recognise. (Otherwise, if a name is defined in two profiles, UAs
would assign meanings to the document differently based on which profiles
they supported.)
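The profile rules above can be sketched as follows. This is a non-normative Python illustration for names that this specification does not itself define; usable_profiles, resolve_name, the KNOWN table and the example.org URIs are all hypothetical.

```python
def usable_profiles(profile_uris, known_profiles):
    """Keep the listed URIs up to, but excluding, the first unrecognised one."""
    usable = []
    for uri in profile_uris:
        if uri not in known_profiles:
            break  # every URI after an unrecognised one is ignored
        usable.append(uri)
    return usable


def resolve_name(name, profile_uris, known_profiles):
    """Return the URI of the profile whose semantics apply to name, or None."""
    for uri in usable_profiles(profile_uris, known_profiles):
        if name in known_profiles[uri]:
            return uri  # the first profile that defines the name wins
    return None  # the name is meaningless


# Profiles this imaginary UA recognises, with the names each one defines.
KNOWN = {
    "http://example.org/a": {"vcard", "note"},
    "http://example.org/b": {"note", "event"},
}
A, B = "http://example.org/a", "http://example.org/b"
```

With both profiles listed, "note" takes its meaning from the first one. If an unrecognised URI appears between them, "event" is no longer defined at all.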
If a profile's definition introduces new definitions over time, documents that use multiple profiles can change defined meaning over time. So as to avoid this problem, authors are encouraged to avoid using multiple profiles.
The profile
DOM attribute must reflect the
profile content attribute on getting
and setting.
title element
Context: head element containing no other title elements.
DOM interface: HTMLElement.
The title element represents the
document's title or name. Authors should use titles that identify their
documents even when they are used out of context, for example in a user's
history or bookmarks, or in search results. The document's title is often
different from its first header, since the first header does not have to
stand alone when taken out of context.
Here are some examples of appropriate titles, contrasted with the top-level headers that might be used on those same pages.
<title>Introduction to The Mating Rituals of Bees</title>
...
<h1>Introduction</h1>
<p>This companion guide to the highly successful
<cite>Introduction to Medieval Bee-Keeping</cite> book is...
The next page might be a part of the same site. Note how the title describes the subject matter unambiguously, while the first header assumes the reader knows what the context is and therefore won't wonder if the dances are Salsa or Waltz.
<title>Dances used during bee mating rituals</title>
...
<h1>The Dances</h1>
In HTML (as opposed to XHTML), the title element must not contain content other
than text and entities; user agents must parse the element so that
entities are recognised and processed, but all other markup is interpreted
as literal text.
In XHTML, the title element must not
contain any elements.
User agents must concatenate the contents of all the text nodes and
CDATA nodes that are direct children of the title element (ignoring any other nodes such as
comments or elements), in tree order, to get the string to use as the
document's title. User agents should use the document's title when
referring to the document in their user interface.
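A non-normative sketch of the concatenation step, using Python's xml.dom.minidom, is given below. The function name document_title is hypothetical, and the input deliberately contains a comment and a child element to show that they are skipped.

```python
from xml.dom import minidom


def document_title(title_element):
    """Concatenate the text and CDATA nodes that are direct children."""
    parts = []
    for child in title_element.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
        # Comments, elements and any other node types are ignored.
    return "".join(parts)


doc = minidom.parseString(
    "<title>Dances <!-- c -->used <![CDATA[during]]> bee "
    "<b>x</b>mating rituals</title>"
)
title = document_title(doc.documentElement)
```

The text inside the b element is not a direct child of the title element, so it does not contribute to the result.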
base element
Context: head element, before any elements that use relative URIs, and only if there are no other base elements anywhere in the document. Only in HTML documents (never in XML documents).
Attributes: href (optional)
interface HTMLBaseElement : HTMLElement { attribute DOMString href; };
The base element allows authors to
specify the document's base URI for the purposes of resolving relative
URIs.
The href
content attribute, if specified, must contain a URI (or IRI).
User agents must use the value of the href attribute on the first base element in the document as the document
entity's base URI for the purposes of section 5.1.1 of RFC 2396
("Establishing a Base URI": "Base URI within Document Content"). [RFC2396] Note that this base URI from RFC 2396 is
referred to by the algorithm given in XML Base, which is a normative part of this specification.
If the base URI given by this attribute is a relative URI, it must be resolved relative to the higher-level base URIs (i.e. the base URI from the encapsulating entity or the URI used to retrieve the entity) to obtain an absolute base URI.
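As a non-normative illustration, the sketch below derives the document's base URI from the first base element's href value, resolving a relative value against the URI used to retrieve the document. The function name document_base_uri is hypothetical, and Python's urllib.parse.urljoin is assumed to be a close enough stand-in for the resolution algorithm (it follows the later RFC 3986 rather than RFC 2396).

```python
from urllib.parse import urljoin


def document_base_uri(retrieval_uri, base_hrefs):
    """Return the absolute base URI for a document.

    base_hrefs holds the href values of the document's base elements in
    tree order; only the first one is used.
    """
    if not base_hrefs:
        return retrieval_uri
    # A relative href is resolved against the higher-level base URI.
    return urljoin(retrieval_uri, base_hrefs[0])


base = document_base_uri("http://example.org/a/b.html", ["/docs/"])
image = urljoin(base, "img.png")
```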
The href content
attribute must be reflected by the DOM href attribute.
Authors must not use the base element
in XML documents. Authors should instead use the xml:base
attribute. [XMLBASE]
link element
Context: head element.
Attributes: href (optional), rel (optional), media (optional), hreflang (optional), type (optional), title (optional)
interface HTMLLinkElement : HTMLElement {
  attribute boolean disabled;
  attribute DOMString href;
  attribute DOMString rel;
  attribute DOMString media;
  attribute DOMString hreflang;
  attribute DOMString type;
};
The LinkStyle
interface defined in DOM2 Style must also be implemented by this
element. [DOM2STYLE]
The link element allows authors to
indicate explicit relationships between their document and other
resources.
The destination of the link is given by the href attribute, which must be a
URI (or IRI). If the href attribute is absent, then the element does
not define a link.
The type of link indicated (the relationship) is given by the value of
the rel attribute.
The allowed values and their meanings are defined
in a later section. If the rel attribute is absent, or if the value used is
not allowed according to the definitions in this specification, then the
element does not define a link.
Two categories of links can be created using the link element. Links to external resources are links to resources
that are to be used to augment the current document, and hyperlinks are links to other
documents. The link types section defines whether
a particular link type is an external resource or a hyperlink. One element
can create multiple links (of which some might be external resource links
and some might be hyperlinks). User agents should process the links on a
per-link basis, not a per-element basis.
The exact behaviour for links to external resources depends on the exact relationship, as defined for the relevant link type. Some of the attributes control whether or not the external resource is to be applied (as defined below). For external resources that are represented in the DOM (for example, style sheets), the DOM representation must be made available even if the resource is not applied. (However, user agents may opt to only fetch such resources when they are needed, instead of pro-actively downloading all the external resources that are not applied.)
Interactive user agents should provide users with a means to follow the hyperlinks
created using the link element, somewhere within their user
interface. The exact interface is not defined by this specification, but
it should include the following information (obtained from the element's
attributes, again as defined below), in some form or another (possibly
simplified), for each hyperlink created with each link element in the document:
The relationship (the rel attribute).
The title (the title attribute).
The destination (the href attribute).
The language (the hreflang attribute).
The media (the media attribute).
User agents may also include other information, such as the type of the
resource (as given by the type attribute).
The media
attribute says which media the resource applies to. The value must be a
valid media query. [MQ]
If the link is a hyperlink then the media attribute is
purely advisory, and describes for which media the document in question
was designed.
However, if the link is an external resource
link, then the media attribute is prescriptive. The user agent
must only apply the external resource to views while
their state matches the listed media.
The default, if the media attribute is omitted, is all,
meaning that by default links apply to all media.
The hreflang attribute gives the
language of the linked resource. It is purely advisory. The value must be
a valid RFC 3066 language code. [RFC3066] User
agents must not consider this attribute authoritative — upon
fetching the resource, user agents must only use language information
associated with the resource to determine its language, not metadata
included in the link to the resource.
The type
attribute gives the MIME type of the linked resource. It is purely
advisory. The value must be a valid MIME type, optionally with parameters.
[RFC2046]
For external resource links, user agents may use the type given in this attribute to decide whether or not to consider using the resource at all. If the UA does not support the given MIME type for the given link relationship, then the UA may opt not to download and apply the resource.
User agents must not consider the type attribute authoritative — upon fetching
the resource, user agents must only use the Content-Type information
associated with the resource to determine its type, not metadata included
in the link to the resource.
If the attribute is omitted, then the UA must fetch the resource to determine its type and thus determine if it supports (and can apply) that external resource.
If a document contains three style sheet links labelled as follows:
<link rel="stylesheet" href="A" type="text/css">
<link rel="stylesheet" href="B" type="text/plain">
<link rel="stylesheet" href="C">
...then a compliant UA that supported only CSS style sheets would fetch
the A and C files, and skip the B file (since text/plain is
not the MIME type for CSS style sheets). For these two files, it would
then check the actual types returned by the server. For those that are sent
as text/css, it would apply the styles, but for those
labelled as text/plain, or any other type, it would not.
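The two decisions in this example, whether to fetch and whether to apply, can be sketched as follows. This is a non-normative Python illustration; should_fetch, should_apply and SUPPORTED are hypothetical names, and types are compared as plain strings without parameters.

```python
def should_fetch(type_attribute, supported_types):
    """Decide whether to fetch a resource, given the link's type attribute."""
    if type_attribute is None:
        return True  # no hint: the UA must fetch to find out the type
    # The hint is advisory, but the UA may skip types it does not support.
    return type_attribute in supported_types


def should_apply(content_type, supported_types):
    """Only the Content-Type sent with the resource is authoritative."""
    return content_type in supported_types


SUPPORTED = {"text/css"}
links = [("A", "text/css"), ("B", "text/plain"), ("C", None)]
fetched = [href for href, hint in links if should_fetch(hint, SUPPORTED)]
```

The type attribute only ever filters what is fetched. A resource labelled text/css in the markup but served as text/plain is fetched and then not applied.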
The title
attribute gives the title of the link. With one exception, it is purely
advisory. The value is text. The exception is for style sheet links, where
the title
attribute defines alternate style sheet
sets.
The title attribute on link elements differs from the global title attribute of all the
other elements in that a link without a title does not inherit the title
of the parent element: it merely has no title.
Some versions of HTTP defined a Link: header, to
be processed like a series of link
elements. When processing links, those must be taken into consideration as
well. For the purposes of ordering, links defined by HTTP headers must be
assumed to come before any links in the document, in the order that they
were given in the HTTP entity header. Relative URIs in these headers must
be resolved according to the rules given in HTTP, not relative to base
URIs set by the document (e.g. using a base element or xml:base
attributes). [RFC2616] [RFC2068]
The DOM attributes href, rel, media, hreflang, and type each reflect the respective content attributes of
the same name.
The DOM attribute disabled only applies to
style sheet links. When the link element
defines a style sheet link, then the disabled attribute behaves as defined for the alternate stylesheets DOM. For all
other link elements it must always
return false and must do nothing on setting.
meta element
Context: head element.
Attributes: name (optional), http-equiv (HTML only, optional), content (optional)
interface HTMLMetaElement : HTMLElement {
  attribute DOMString content;
  attribute DOMString name;
};
The meta element allows authors to
specify document metadata that cannot be expressed using the title, base,
link, style, and script elements. The metadata is expressed in
terms of name/value pairs: the name attribute on the meta element gives the name, and the content
attribute on the same element gives the value.
To set metadata with meta elements,
authors must first specify a profile that defines metadata names, using
the profile attribute. The value of
the name attribute
must be defined by one of the profiles, and the value of the content attribute
must conform to the syntax given by the profile.
How user agents handle metadata set in this way depends on the definitions of the profiles involved.
If a meta element has no name attribute, it does
not set document metadata. If a meta
element has no content attribute, then the value part of the
metadata name/value pair is the empty string.
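A non-normative sketch of how name/value pairs could be collected from meta elements is given below, using Python's ElementTree on an XML serialisation. The function name document_metadata is hypothetical, and checking names and values against the profiles is left out.

```python
import xml.etree.ElementTree as ET


def document_metadata(head):
    """Return the (name, value) pairs set by the meta elements in head."""
    pairs = []
    for meta in head.iter("meta"):
        name = meta.get("name")
        if name is None:
            continue  # no name attribute: no document metadata is set
        # A missing content attribute gives the empty string as the value.
        pairs.append((name, meta.get("content", "")))
    return pairs


head = ET.fromstring(
    '<head><meta name="author" content="Bee Keeper"/>'
    '<meta content="orphan"/><meta name="draft"/></head>'
)
metadata = document_metadata(head)
```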
The DOM attributes name and content reflect the respective content attributes of
the same name.
The meta element may also be used, in
HTML only (not in XHTML) to provide UAs with character encoding
information for the file. To do this, the meta element must be the first element in the
head element, it must have the http-equiv
attribute set to the literal value Content-Type, and must
have the content attribute set to the literal value
text/html; charset= immediately followed by the character
encoding, which must be a valid character encoding name. [IANACHARSET]
When the meta element is used in this
way, there must be no other attributes set on the element. Other than for
giving the document's character encoding in this way, the http-equiv
attribute must not be used.
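The attribute requirements for this use of the meta element can be checked roughly as follows. This non-normative Python sketch takes the element's attributes as a dictionary; declared_encoding is a hypothetical name, and the sketch does not check that the element is first in the head or that the encoding name is registered.

```python
PREFIX = "text/html; charset="


def declared_encoding(attributes):
    """Return the encoding declared by a meta element's attributes, or None."""
    # No attributes other than http-equiv and content may be present.
    if set(attributes) != {"http-equiv", "content"}:
        return None
    if attributes["http-equiv"] != "Content-Type":
        return None
    content = attributes["content"]
    if not content.startswith(PREFIX) or len(content) == len(PREFIX):
        return None
    return content[len(PREFIX):]
```

For instance, a content value of text/html; charset=UTF-8 yields UTF-8, while the same value on an element that also has a name attribute yields nothing.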
In XHTML, the XML declaration should be used for inline character encoding information.
Authors should avoid including inline character encoding information.
Character encoding information should instead be included at the transport
level (e.g. using the HTTP Content-Type header).
For HTML, user agents must use the following algorithm in determining the character encoding of a document:
If there is a meta element that specifies character encoding
information (as described above), then use that.
Otherwise, use a default (ISO-8859-1, windows-1252,
and UTF-8 are recommended as defaults, and can in many cases
be identified by inspection as they have different ranges of valid
bytes).
For XML documents, the algorithm user agents must use to determine the character encoding is given by the XML specification. [XML]
style element
Context: head element.
Content model: determined by the type attribute.
Attributes: type (optional), media (optional), title (optional)
interface HTMLStyleElement : HTMLElement {
  attribute boolean disabled;
  attribute DOMString media;
  attribute DOMString type;
};
The LinkStyle
interface defined in DOM2 Style must also be implemented by this
element. [DOM2STYLE]
The style element allows authors to
embed style information in their documents.
If the type
attribute is given, it must contain a MIME type, optionally with
parameters, that designates a styling language. [RFC2046] If the attribute is absent, the type
defaults to text/css. [RFC2318]
If the UA supports the given styling language, then the UA must use the given styles as appropriate for that language.
When examining types to determine if they support the language, user agents must not ignore unknown MIME parameters — types with unknown parameters must be assumed to be unsupported.
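The rule on unknown parameters can be sketched as follows. This is a non-normative Python illustration with a deliberately naive parser (it does not handle quoted semicolons); parse_mime_type, supports_style_type and the SUPPORTED table are hypothetical.

```python
def parse_mime_type(value):
    """Split 'type/subtype; a=b' into ('type/subtype', {'a': 'b'})."""
    parts = [part.strip() for part in value.split(";")]
    essence = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if not part:
            continue
        name, _, param_value = part.partition("=")
        params[name.strip().lower()] = param_value.strip().strip('"')
    return essence, params


def supports_style_type(value, supported):
    """supported maps each MIME type to the parameter names the UA understands."""
    essence, params = parse_mime_type(value)
    if essence not in supported:
        return False
    # Any parameter the UA does not know makes the whole type unsupported.
    return set(params) <= supported[essence]


SUPPORTED = {"text/css": {"charset"}}
```

Under these assumptions, text/css; charset=utf-8 is supported, while text/css; level=3 is not, because the UA cannot know what the unknown parameter changes.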
The media
attribute says which media the styles apply to. The value must be a valid
media query. [MQ] User agents must only apply the
styles to views while their state matches the listed
media.
The default, if the media attribute is
omitted, is all, meaning that by default styles apply to all
media.
The title attribute on style elements defines alternate style sheet sets. If the
style element has no title attribute,
then it has no title; the title attribute of ancestors does not apply to
the style element.
For styling languages that consist of pure text, user agents must use a
concatenation of the contents of all the text nodes and CDATA nodes that
are direct children of the style
element (ignoring any other nodes such as comments or elements), in tree
order. For XML-based styling languages, user agents must use all the
children nodes of the style element as
the style.
The DOM attributes media and type each reflect the respective content attributes of
the same name.
The DOM disabled attribute behaves
as defined for the alternate stylesheets
DOM.
Sectioning elements are elements that divide the page into, for lack of a better word, sections. This section describes HTML's sectioning elements and elements that support them.
Some elements are scoped to their nearest ancestor
sectioning element. For example, address elements apply just to their section.
For such elements x, the elements that apply to a
sectioning element e are all the x
elements whose nearest sectioning element is e.
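The scoping rule can be sketched as follows. This non-normative Python illustration uses ElementTree; applying_elements is a hypothetical name, and the SECTIONING set is an assumption made for the example rather than a definitive list.

```python
import xml.etree.ElementTree as ET

# Assumed set of sectioning elements, for illustration only.
SECTIONING = {"body", "section", "nav", "article", "blockquote", "aside"}


def applying_elements(root, section, tag):
    """Return the tag elements whose nearest ancestor sectioning element is section."""
    parents = {child: parent for parent in root.iter() for child in parent}

    def nearest_section(node):
        node = parents.get(node)
        while node is not None and node.tag not in SECTIONING:
            node = parents.get(node)
        return node

    return [el for el in root.iter(tag) if nearest_section(el) is section]


root = ET.fromstring(
    "<body><address>site</address>"
    "<article><div><address>author</address></div></article></body>"
)
article = root.find("article")
```

The first address applies to the body, and the nested one applies only to the article, even though the body is also one of its ancestors.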
body element
Context: html element.
DOM interface: HTMLElement.
The body element represents the main
content of the document.
The body element potentially has a
heading. See the section on headings and
sections for further details.
Some DOM operati