This Open eBook edition of The Devil’s Dictionary was begun as a way for me to learn the Open eBook (OEB) structure and how to write clean XHTML that duplicates the original formatting of the typeset edition.
Having hit the limitations of the OEB format and current OEB readers in this attempt, I am posting this early version of my conversion effort as a test document that illustrates the shortcomings of the format and is meant to encourage the developers to address these issues in forthcoming versions of their software and the OEB specification itself.
The most difficult problem I have faced in formatting The Devil’s Dictionary has been poetry. The print copy I own has the poems formatted so that the attribution line is right justified with the end of the longest line of the poem, no stanza is broken across pages, and the whole thing is centered within the margins of the body text. This is a very natural way to format the poetry, yet it is impossible to duplicate this structure with the current eBook readers—most notably, with Microsoft Reader.
First, the only way to create the desired justification and centering with HTML is to place the whole poem inside one table. This works for small poems, but not for larger ones, because MS Reader cuts off all text in a table cell when the end of the page is reached, preventing long poems from being displayed in their entirety. Additionally, if each stanza is placed inside a pair of paragraph tags (as would seem natural), many of the indents must be accomplished by adjusting the left margin of that individual line with a <span> tag. This should work, since both this tag and the left margin property are applied to all elements (block and inline) according to the HTML and CSS specifications. MS Reader, however, ignores this instruction. An example of this formatting is found in the “A” section of the Dictionary.
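The single-table construct described above might look roughly like the following. This is only a sketch under stated assumptions, not the markup actually used in section “A”: the class names (poem, stanza, indent, attribution) are hypothetical, and placeholder text stands in for the poem.

```html
<!-- Sketch only: class names are hypothetical and the text is placeholder. -->
<!-- The style rules would live in the document head or a linked stylesheet. -->
<style type="text/css">
  table.poem    { margin-left: auto; margin-right: auto; } /* center the poem */
  p.stanza      { margin: 0 0 1em 0; }                     /* 1em between stanzas */
  span.indent   { margin-left: 2em; }                      /* indent one line via an inline element */
  p.attribution { margin: 0; text-align: right; }          /* align with the longest line */
</style>

<table class="poem">
  <tr>
    <td>
      <p class="stanza">
        First line of the stanza,<br />
        <span class="indent">an indented second line,</span><br />
        and a third line that happens to be the longest.
      </p>
      <p class="attribution"><i>Name of the Poet</i></p>
    </td>
  </tr>
</table>
```

Because a table with no explicit width shrinks to fit its contents, the right-aligned attribution ends where the longest line of the poem ends. The whole poem sits in a single cell, which is exactly what runs into the page-end cutoff described above.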
An alternate way to format the poems is to enclose each poem in a <blockquote> tag, each line in its own paragraph tag (with different CSS classes to handle the needed indents and close up the line spacing), and each stanza in a <span> tag (with the CSS page-break-after property set to avoid breaking across pages). However, the blockquote’s margins cause many poems to wrap, the poem is not centered, the attribution line (and any right-justified lines of the poem) lands almost at the right margin of the book (sometimes far away from the poem itself), and MS Reader ignores the instructions not to break the stanzas across pages. This method is demonstrated in the “B” section of the Dictionary.
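A rough sketch of this blockquote structure follows. It departs from the description in two labeled ways: each stanza is wrapped in a <div> rather than a <span>, because strict XHTML does not allow paragraphs inside a <span>, and the stanza is kept together with page-break-inside rather than page-break-after. The class names are hypothetical and the text is placeholder.

```html
<!-- Sketch only: class names are hypothetical and the text is placeholder. -->
<!-- Assumption: div replaces the span described in the text (see lead-in). -->
<style type="text/css">
  div.stanza    { margin-bottom: 1em; page-break-inside: avoid; } /* keep a stanza on one page */
  p.line,
  p.indent,
  p.attribution { margin-top: 0; margin-bottom: 0; }              /* close up the line spacing */
  p.indent      { margin-left: 2em; }                             /* indented line */
  p.attribution { text-align: right; }                            /* right edge of the blockquote */
</style>

<blockquote>
  <div class="stanza">
    <p class="line">First line of the first stanza,</p>
    <p class="indent">an indented second line.</p>
  </div>
  <div class="stanza">
    <p class="line">First line of the second stanza,</p>
    <p class="indent">another indented line.</p>
  </div>
  <p class="attribution"><i>Name of the Poet</i></p>
</blockquote>
```

Here the attribution is aligned to the right edge of the blockquote, not to the longest line of the poem, which is why it can land far away from the poem itself.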
As I was writing this, I thought of what should have been an obvious construct for these poems: putting each stanza in a separate table cell. This solves many, but not all, of the problems described above. For poems with short- or medium-length stanzas viewed with the PC version of MS Reader on a large-screen laptop it should work fine. But for a PocketPC, or even for poems with long single stanzas on a PC, the bottom of each long stanza will still be lost. You can see the results of this experiment in the “C” section of the Dictionary.
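One way to sketch this stanza-per-cell idea is shown below, assuming one table row per stanza. The class names are hypothetical, the text is placeholder, and the markup in section “C” may well differ.

```html
<!-- Sketch only: class names are hypothetical and the text is placeholder. -->
<style type="text/css">
  table.poem     { margin-left: auto; margin-right: auto; } /* center the poem */
  td.stanza      { padding-bottom: 1em; }                   /* space between stanzas */
  span.indent    { margin-left: 2em; }                      /* indented line */
  td.attribution { text-align: right; font-style: italic; } /* align with the longest line */
</style>

<table class="poem">
  <tr>
    <td class="stanza">
      First line of the first stanza,<br />
      <span class="indent">an indented second line.</span>
    </td>
  </tr>
  <tr>
    <td class="stanza">
      First line of the second stanza,<br />
      <span class="indent">another indented line.</span>
    </td>
  </tr>
  <tr>
    <td class="attribution">Name of the Poet</td>
  </tr>
</table>
```

Each stanza is now its own cell, so a page break can fall between rows, but a single stanza taller than the page is still one cell and would still be cut off.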
These issues can best be demonstrated by one representative poem in each of the first three sections, when reading the book in the desktop version of MS Reader. Abracadabra should be separated into stanzas with 1em of space between each, but since Reader ignores the <span> tag, it is just one long block. The poem cited under the definition of beg exemplifies the problems with the wide right margin described above. Although not perfect, the poem cited under carmelite is presented almost exactly as it should be. The poem is properly centered, the indents and right justification appear as intended, and the poem is broken across pages only between stanzas. But when viewed on a smaller screen (almost certainly with a Pocket PC) the first stanza alone will likely be cut off.
A major additional problem, not specific to this book, is the inability of any current OEB reader to handle Unicode text, as mandated in the OEB specification. Sections “D” (UTF-8) and “E” (UTF-16) of the Dictionary show how such Unicode documents appear. Notice that the Unicode signature/byte-order mark at the beginning of each of these files causes problems with both the readers and the authoring tools. The MobiPocket Publisher cannot complete the conversion process at all. ReaderWorks handles both files reasonably well, but MS Reader cannot display UTF-8 files correctly: the Unicode signature causes it to ignore all CSS formatting, and UTF-8 characters are displayed as their literal byte sequences, something specifically forbidden by the OEB specification. The whole of section “E” disappears because of the byte-order mark.
Most sections beyond E have not yet been fully formatted, so please do not expect them to look pretty.
Another goal is much broader. I have long known of Project Gutenberg, but have always found its insistence on plain ASCII to be a handicap that limited its appeal and usability. Don’t get me wrong: the effort has provided a tremendous resource, and at the time the project was begun (and until very recently) plain ASCII was clearly the best choice. But you can’t properly format a book with just ASCII characters. Not only must basic things such as *bold* and _italics_ be indicated in a funky manner, it is simply impossible to preserve the accented characters, ligatures, and many other important features. And trying to display such a work legibly on a PDA or eBook reader with a small screen is impossible, given the hard line breaks that are present (which keep the text from flowing properly).
With its footing solidly in HTML and XML and its completely open nature, the Open eBook format is the ideal structure in which to continue the goals of Project Gutenberg on into the 21st century. So this edition of The Devil’s Dictionary is not meant just as a personal learning project, but as an example of the benefits of offering current and future editions as Open eBooks. I don’t dispute the benefits of the current plain ASCII versions, but with the right automation tools, future editions could begin as Open eBooks and then be converted to plain ASCII, making both versions available without duplicated effort. This would be far preferable to starting with plain ASCII versions and converting them to Open eBook. The latter is the method I used for this edition, and I assure you that it is quite tedious and not well-suited as a standard practice.
Peter K. Sheerin