The current situation with ruby annotation in HTML. This has changed quite a bit over the years, so where are we now?
Ruby annotation is a means to annotate runs of text with translations, pronunciation guides, meanings, etc., shown alongside them. Suppose we are showing a piece of Katakana, ハワイ, and want to show how it's pronounced. (This is the inverse of how it normally works, but I don't know enough Japanese to make a plausible example. Sorry.)
The original IE implementation of ruby annotation (just a ruby element with the ruby text in an rt element) still works fine in all browsers.
<ruby>ハワイ<rt>hawai</rt></ruby>
This is supposed to look like
and comes out as
Or the version with an explicit rb (ruby base) element, which makes for easier styling.
<ruby>
<rb>ハワイ</rb>
<rt>hawai</rt></ruby>
This too works fine.
The problems begin when we want to position each of the syllables exactly above their Katakana characters. Like this.
An earlier solution, in XHTML 1.1, was to put each character in its own rb element and put the three elements together in an rbc element, rbc meaning ruby base container. Likewise, each syllable goes in its own rt, and the three rt elements go together in an rtc (ruby text container).
<ruby>
<rbc>
<rb>ハ</rb>
<rb>ワ</rb>
<rb>イ</rb></rbc>
<rtc>
<rt>ha</rt>
<rt>wa</rt>
<rt>i</rt></rtc></ruby>
The idea is that the browser matches each rb to its corresponding rt.
This never worked well in any browser, though. It came out as
If it had worked, it would also have provided for double-sided ruby, which has both the pronunciation and the semantic meaning for annotations:
with the following markup, using two rtc containers:
<ruby>
<rbc>
<rb>ハ</rb>
<rb>ワ</rb>
<rb>イ</rb></rbc>
<rtc>
<rt>ha</rt>
<rt>wa</rt>
<rt>i</rt></rtc>
<rtc>
<rt>Hawaii</rt></rtc></ruby>
But of course, that didn't work either, even though it still validates, at least in XHTML 1.1 documents. (XHTML 1.1 was the only legacy doctype that knew about ruby.)
Incidentally, the MDN page about double-sided ruby shows different markup:
<ruby>
<rbc>
<rb>ハ</rb>
<rt>ha</rt>
<rb>ワ</rb>
<rt>wa</rt>
<rb>イ</rb>
<rt>i</rt></rbc>
<rtc>
<rt>Hawaii</rt></rtc></ruby>
which comes out looking good in Firefox, but unfortunately not any other browser:
And, just as unfortunate, it doesn't validate, so you'll never know if you're doing it right. Shame.
A complication was the fallback element rp, which provides a means for graceful degradation of the display in browsers that don't know about ruby. The spec writers had difficulties combining rp with rbc and rtc, so they decided to make this combination an error.
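For reference, here is roughly how rp is meant to be used in simple ruby. This is only a minimal sketch: the parentheses are my own choice of fallback text, and any other delimiters would do.

```html
<!-- Browsers that understand ruby hide the rp content. -->
<!-- Browsers that don't show everything inline: ハワイ(hawai) -->
<ruby>ハワイ<rp>(</rp><rt>hawai</rt><rp>)</rp></ruby>
```

Note that this example has no rbc or rtc in it, which is the combination that was made an error.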
Another complication was that it turned out you needed to associate multiple rb elements with a single rt element, so they introduced the rbspan attribute. I will not mention this attribute again.
And this whole system of containers didn't line up with how annotation is actually used; even the nomenclature of simple ruby and complex ruby didn't match the types of real world annotation (furigana, bopomofo, jukugo-rubī...).
And I understand there were differences of opinion on the role of the ruby element itself: some wanted it to be a container that could hold plain text and rb/rt pairs, others wanted it to act as if it were itself the ruby base.
To solve all those problems, it was decided to throw many of these elements away and simplify things, basically to start over. The rb, rbc and rtc elements are being dropped in favour of nested ruby elements.
So the latest version of the markup looks like this.
<ruby><ruby>
ハ<rt>ha</rt>
ワ<rt>wa</rt>
イ<rt>i</rt></ruby>
<rt>Hawaii</rt></ruby>
This looks promising; it currently works in more browsers than any of the previous examples.
Now personally I'm not too sure about the decision to do away with the rb element. I know they say it isn't really necessary and that if you need a separate element inside ruby that you need to style, you can just use a span. But that is not nearly as semantic, and it's longer to type; ruby markup is large enough as it is.
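To make the comparison concrete, this is roughly what the span workaround looks like. It is only a sketch: the class name base and the colour are hypothetical choices of mine, not something any spec prescribes.

```html
<style>
  /* "base" is a made-up class name, standing in for what rb used to mark */
  ruby .base { color: darkred; }
</style>

<ruby><span class="base">ハワイ</span><rt>hawai</rt></ruby>
```

With rb, the same styling needed no class at all, just a selector on the element name.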
Anyway, let's hope they'll settle upon a stable standard soon and that the earlier incompatibilities will quickly be forgotten.