This is a snapshot of an early working draft and has therefore been superseded by the HTML standard.
This document will not be further updated.
Generally, when the specification states that a feature applies to the HTML syntax or the XHTML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".
This specification uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications.
For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.
When an algorithm B says to return to another algorithm A, it implies that A called B. Upon returning to A, the implementation must continue from where it left off in calling B.
The specification uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.
For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image also contained animation data.
A MPEG4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.
The term MIME type is used to refer to what is sometimes called an Internet media type in protocol literature. The term media type in this specification is used to refer to the type of media intended for presentation, as used by the CSS specifications. [RFC2046] [MQ]
A string is a valid MIME type if it matches the
media-type rule defined in section 3.7 "Media
Types" of RFC 2616. [HTTP]
To ease migration from HTML to XHTML, UAs
conforming to this specification will place elements in HTML in the
http://www.w3.org/1999/xhtml namespace, at least for
the purposes of the DOM and CSS. The term "HTML
elements", when used in this specification, refers to any
element in that namespace, and thus refers to both HTML and XHTML
Except where otherwise stated, all elements defined or mentioned
in this specification are in the
http://www.w3.org/1999/xhtml namespace, and all
attributes defined or mentioned in this specification have no
Attribute names are said to be XML-compatible if they
Name production defined in XML, they contain no
U+003A COLON characters (:), and their first three characters are
not an ASCII case-insensitive match for the string
The term XML MIME type is used to refer to the MIME types
application/xml, and any MIME
type whose subtype ends with the four characters "
The term root element, when not explicitly qualified as referring to the document's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself if it has no ancestors. When the node is a part of the document, then that is indeed the document's root element; however, if the node is not currently part of the document tree, the root element will be an orphaned node.
A node's home subtree is the subtree rooted at that node's root element.
Document of a
Node (such as an
element) is the
Document that the
ownerDocument IDL attribute returns.
When an element's root element is the root
element of a
Document, it is said to be in
Document. An element is said to have been inserted into a
document when its root element changes and is now
the document's root element. Analogously, an element is
said to have been removed from a document when its root
element changes from being the document's root
element to being another element.
Node is in a
Document is always the
Document, and the
ownerDocument IDL attribute thus always returns that
The term tree order means a pre-order, depth-first
traversal of DOM nodes involved (through the
When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.
The term text node refers to any
CDATASection nodes; specifically, any
Node with node type
CDATA_SECTION_NODE (4). [DOMCORE]
A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.
The construction "a
Foo object", where
Foo is actually an interface, is sometimes used instead
of the more accurate "an object implementing the interface
An IDL attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.
If a DOM object is said to be live, then that means that any attributes returning that object must always return the same object (not a new object each time), and the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data.
The terms fire and dispatch are used interchangeably in the context of events, as in the DOM Events specifications. [DOMEVENTS]
The term plugin is used to mean any content handler for Web content types that are either not supported by the user agent natively or that do not expose a DOM, which supports rendering the content as part of the user agent's interface.
Typically such content handlers are provided by third parties.
One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separate from the user agent (as opposed to using the same interface) is not a plugin by this definition.
This specification does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. [NPAPI]
Browsers should take extreme care when interacting with external content intended for plugins. When third-party software is run with the same privileges as the user agent itself, vulnerabilities in the third-party software become as dangerous as those in the user agent.
The preferred MIME name of a character encoding is the name or alias labeled as "preferred MIME name" in the IANA Character Sets registry, if there is one, or the encoding's name, if none of the aliases are so labeled. [IANACHARSET]
An ASCII-compatible character encoding is a single-byte or variable-length encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second and later bytes of multibyte sequences, all correspond to single-byte sequences that map to the same Unicode characters as those bytes in ANSI_X3.4-1968 (US-ASCII). [RFC1345]
This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022, even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. It excludes such encodings as UTF-7, UTF-16, GSM03.38, and EBCDIC variants.
The term Unicode character is used to mean a Unicode scalar value (i.e. any Unicode code point that is not a surrogate code point). [UNICODE]
All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in RFC2119. For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.
This specification describes the conformance criteria for user agents (relevant to implementors) and documents (relevant to authors and authoring tool implementors).
There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.
User agents fall into several (overlapping) categories with different conformance requirements.
Web browsers that support the XHTML syntax must process elements and attributes from the HTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.
A conforming XHTML processor would, upon
finding an XHTML
script element in an XML document,
execute the script contained in that element. However, if the
element is found within a transformation expressed in XSLT
(assuming the user agent also supports XSLT), then the processor
would instead treat the
script element as an opaque
element that forms part of the transform.
User agents that support scripting must also be conforming implementations of the IDL fragments in this specification, as described in the Web IDL specification. [WEBIDL]
Unless explicitly stated, specifications that
override the semantics of HTML elements do not override the
requirements on DOM objects representing those elements. For
script element in the example above
would still implement the
User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply to the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.
Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.
A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.
Implementations that do not support scripting (or which have their scripting features disabled entirely) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.
Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.
Conformance checkers must verify that a document conforms to
the applicable conformance criteria described in this
specification. Automated conformance checkers are exempt from
detecting errors that require interpretation of the author's
intent (for example, while a document is non-conforming if the
content of a
blockquote element is not a quote,
conformance checkers running without the input of human judgement
do not have to check that
blockquote elements only
contain quoted material).
Conformance checkers must check that the input document conforms when parsed without a browsing context (meaning that no scripts are run, and that the parser's scripting flag is disabled), and should also check that the input document conforms when parsed with a browsing context in which scripts execute, and that the scripts never cause non-conforming states to occur other than transiently during script execution itself. (This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [COMPUTABLE])
The term "HTML5 validator" can be used to refer to a conformance checker that itself conforms to the applicable requirements of this specification.
XML DTDs cannot express all the conformance requirements of this specification. Therefore, a validating XML processor and a DTD cannot constitute a conformance checker. Also, since neither of the two authoring formats defined in this specification are applications of SGML, a validating SGML system cannot constitute a conformance checker either.
To put it another way, there are three types of conformance criteria:
A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.
Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance with the semantics of the documents that they process.
A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.
Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.
Authoring tools are exempt from the strict requirements of using elements only for their specified purpose, but only to the extent that authoring tools are not yet able to determine author intent.
For example, it is not conforming to use an
address element for arbitrary contact information;
that element can only be used for marking up contact information
for the author of the document or section. However, since an
authoring tool is likely unable to determine the difference, an
authoring tool is exempt from that requirement.
In terms of conformance checking, an editor has to output documents that conform to the same extent that a conformance checker will verify.
When an authoring tool is used to edit a non-conforming document, it may preserve the conformance errors in sections of the document that were not edited during the editing session (i.e. an editing tool is allowed to round-trip erroneous content). However, an authoring tool must not claim that the output is conformant if errors have been so preserved.
Authoring tools are expected to come in two broad varieties: tools that work from structure or semantic data, and tools that work on a What-You-See-Is-What-You-Get media-specific editing basis (WYSIWYG).
The former is the preferred mechanism for tools that author HTML, since the structure in the source information can be used to make informed choices regarding which HTML elements and attributes are most appropriate.
However, WYSIWYG tools are legitimate. WYSIWYG tools should use
elements they know are appropriate, and should not use elements
that they do not know to be appropriate. This might in certain
extreme cases mean limiting the use of flow elements to just a few
span and making liberal use of the
All authoring tools, whether WYSIWYG or not, should make a best effort attempt at enabling users to create well-structured, semantically rich, media-independent content.
Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents. Similarly, some conformance requirements are phrased as requirements on authors; such requirements are to be interpreted as conformance requirements on the documents that authors produce. (In other words, this specification does not distinguish between conformance criteria on authors and conformance criteria on documents.)
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as the XHTML syntax), and one using a custom format inspired by SGML (referred to as the HTML syntax). Implementations may support only one of these two formats, although supporting both is encouraged.
The language in this specification assumes that the user agent expands all entity references, and therefore does not include entity reference nodes in the DOM. If user agents do include entity reference nodes in the DOM, then user agents must handle them as if they were fully expanded when implementing this specification. For example, if a requirement talks about an element's child text nodes, then any text nodes that are children of an entity reference that is a child of that element would be used as well. Entity references to unknown entities must be treated as if they contained just an empty text node for the purposes of the algorithms defined in this specification.
This specification relies on several other underlying specifications.
Implementations that support the XHTML syntax must support some version of XML, as well as its corresponding namespaces specification, because that syntax uses an XML serialization with namespaces. [XML] [XMLNS]
The Document Object Model (DOM) is a representation — a model — of a document and its content. The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM. [DOMCORE]
Implementations must support some version of DOM Core and DOM Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM Core interfaces. [DOMCORE] [DOMEVENTS]
The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]
Except where otherwise specified, if an IDL
attribute that is a floating point number type (
float) is assigned an Infinity or Not-a-Number
(NaN) value, a
NOT_SUPPORTED_ERR exception must be
Except where otherwise specified, if a method with an argument
that is a floating point number type (
is passed an Infinity or Not-a-Number (NaN) value, a
NOT_SUPPORTED_ERR exception must be raised.
rather than the official term ECMAScript, since the term
commonly used type, despite it
being an officially obsoleted type according to RFC
Implementations must support some version of the Media Queries language. [MQ]
Implementations must support the the semantics of URLs defined in the URI and IRI specifications, as well as the semantics of IDNA domain names defined in the Internationalizing Domain Names in Applications (IDNA) specification. [RFC3986] [RFC3987] [RFC3490]
This specification might have certain additional requirements on character encodings, image formats, audio formats, and video formats in the respective sections.
Vendor-specific proprietary extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.
If vendor-specific markup extensions are needed, they should be done using XML, with elements or attributes from custom namespaces. If such DOM extensions are needed, the members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
For example, while strongly discouraged from doing so, an
implementation "Foo Browser" could add a new IDL attribute "
fooTypeTime" to a control's DOM interface that
returned the time it took the user to select the current value of a
control (say). On the other hand, defining a new control that
appears in a form's
array would be in violation of the above requirement, as it would
violate the definition of
elements given in this
When vendor-neutral extensions to this specification are needed, either this specification can be updated accordingly, or an extension specification can be written that overrides the requirements in this specification. When someone applying this specification to their activities decides that they will recognise the requirements of such an extension specification, it becomes an applicable specification for the purposes of conformance requirements in this specification.
User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them.
Comparing two strings in a case-sensitive manner means comparing them exactly, code point for code point.
Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.
Comparing two strings in a compatibility caseless manner means using the Unicode compatibility caseless match operation to compare the two strings. [UNICODE]
Converting a string to ASCII uppercase means replacing all characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) with the corresponding characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).
Converting a string to ASCII lowercase means replacing all characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) with the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z).
A string pattern is a prefix match for a string s when pattern is not longer than s and truncating s to pattern's length leaves the two strings as matches of each other.