Written on April 29th, 2007.

A few days ago I wrote the following message on Twitter:

Reverting site to HTML4. HTML5 isn’t that great after all.

Since then, many people have wondered why I considered HTML4 to be better than the much newer HTML5. Here’s why.

The Obvious Stuff

HTML5 is very new, and still actively being developed, so it is no surprise that browser support is lacking. New elements such as section cannot really be used yet; browsers handle these elements incorrectly, which ruins the page layout.

It also lacks usable validators. There is an HTML5 validator, but it’s a work in progress, which means it can behave oddly at times.

Lack of browser support and lack of a stable validator are not really flaws of HTML5, but they are hindering adoption. Of course, this may soon be a moot point, considering Apple, Mozilla and Opera are supporting the HTML5 effort.

New Elements

HTML5 introduces new structural elements such as header, footer, aside, article, nav, dialog and section. I was excited when I first saw these new elements. I thought we finally had a solution to the ugly div soup.

I gradually became less and less enthusiastic, especially when I started marking up a test version of my site using these elements. I wanted more elements:

I could think of a few more structural elements similar to these. However, adding them to HTML5 would be silly: the specification would be filled with dozens of very similar elements. In fact, I would prefer a version of HTML with fewer elements instead: adding more elements will only make authoring HTML documents harder.

These elements all have properties similar to a div, so why not use a div instead? Adding a class attribute would achieve the same result as creating new elements.
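To make the comparison concrete, here is a rough sketch of the same page skeleton written both ways. The content and the class names (header, nav, article, footer) are placeholders I picked for illustration; nothing requires these particular names.

```html
<!-- HTML5 draft: dedicated structural elements -->
<header>
  <h1>Site title</h1>
</header>
<nav>
  <ul>
    <li><a href="/">Home</a></li>
  </ul>
</nav>
<article>
  <p>Article text</p>
</article>
<footer>
  <p>Footer text</p>
</footer>

<!-- HTML4: the same structure using div and class -->
<div class="header">
  <h1>Site title</h1>
</div>
<div class="nav">
  <ul>
    <li><a href="/">Home</a></li>
  </ul>
</div>
<div class="article">
  <p>Article text</p>
</div>
<div class="footer">
  <p>Footer text</p>
</div>
```

The second version renders in browsers today and can be styled with ordinary class selectors such as div.nav.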

One problem with class is that two different classes don’t necessarily have a different meaning. For example, "nav", "navigation", "sitenav", "topnav" all mean the same thing: the element’s role is to be navigational.

I happen to like XHTML2’s role attribute, which describes the role or purpose of an element. Adding such a role attribute to HTML5 would not break backward compatibility (unlike new elements). For example: instead of using an article element, simply use <div role="article">…</div>.

Having both class and role might be confusing, considering they are similar but not quite the same. A role attribute should only have predefined values, while a class attribute can have any value (but without extra semantics). For example, role="nav" and role="header" would be allowed; role="topnav" and role="siteheader" would not, but class="topnav" and class="siteheader" would.
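A small sketch of how the two attributes could sit side by side. Keep in mind that role as used here is only my proposal, borrowed from XHTML2; it is not part of HTML4, and the set of predefined values (nav, header) is assumed for the sake of the example.

```html
<!-- Would be allowed: role takes a predefined value, class stays free-form -->
<div role="nav" class="topnav">
  <a href="/">Home</a>
</div>
<div role="header" class="siteheader">
  <h1>Site title</h1>
</div>

<!-- Would not be allowed: topnav and siteheader are not predefined roles -->
<div role="topnav">
  <a href="/">Home</a>
</div>
<div role="siteheader">
  <h1>Site title</h1>
</div>
```

With this split, role would carry the semantics while class would remain a free-form hook for styling.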

Predefined Class Names

HTML5 defines several predefined class names: class names that should only be used in the way they are defined in the specification. While standardizing the class names may sound like a good idea, it has a few issues I keep struggling with.

These predefined class names are not applicable to all elements. The error class name, for example, can only be used on p, section, span and strong elements—not on em, div or ol. This exception to the rule that any element can have any class is confusing and does not make sense.
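Going by the rule as described above, markup along these lines would be treated differently depending only on the element name. The text content is made up for illustration.

```html
<!-- Allowed by the draft: error on p, section, span or strong -->
<p class="error">The form could not be submitted.</p>
<p>The date is <strong class="error">invalid</strong>.</p>

<!-- Not allowed by the draft: error on em, div or ol -->
<p>The date is <em class="error">invalid</em>.</p>
<div class="error">The form could not be submitted.</div>
<ol class="error">
  <li>The date is invalid.</li>
  <li>The name is missing.</li>
</ol>
```

In HTML4, all five of these class attributes are equally valid.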

In a version of HTML that defines predefined class names, authors are no longer free to use any class name. Even worse: HTML pages may break when the specification changes and predefined class names are added or changed.

As mentioned in the section above, I believe a role attribute with only predefined role names would make a lot more sense, and prevent clashing class names.

Changed Semantics

HTML5 changes the semantics of several elements. The b, i, small, … elements, which had no semantic meaning in HTML4, suddenly receive a rather unsatisfying new meaning:

The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized.

Forgive me if I translate that as “The i element represents a span of text that is italic”. The b element has the same issue.

The small element is now used for “small print” (boring legal stuff). I’ve personally never seen small being used to mark up legalese in HTML4. Oh, and the example of small is awful:

<p><small>&copy; copyright 2038 Example Corp.</small></p>

I was under the impression I had to use the copyright predefined class name for that, but <small class="copyright">…</small> is not valid…

I think b, i, small, font and friends should simply be deprecated (perhaps even removed) instead of given a new meaning.

HTML5 Sucks

The title of this article was merely there to attract attention. HTML5 has its flaws, but doesn’t really suck that much. :)

There are several changes I like a lot. Removing acronym was a good choice. I like the new audio and video elements—object is way too abstract for that. The m element for marking highlighted text is also a useful addition.

Generally speaking, I am happy with the progress the WHATWG is making with HTML5. It has its flaws, but it is still a work in progress and I’m confident that HTML5 will end up being something a lot better than HTML4.

But until that day, I’ll stick with HTML4.