News: EPUB is now supported and is the recommended format. If you use EPUB, you can skip the rest of this article.

dslibris understands books stored as EPUB or XHTML.

XHTML files should be UTF-8 encoded and use numeric character references (such as &#233;) rather than named entities. The files must end with the extension ‘.xhtml’ or ‘.xht’ and be saved to the ‘book’ folder on your media.
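The requirements above can be checked before copying a file to your media. The following is only a sketch, not part of dslibris: the function name check_xhtml is made up for illustration, and it assumes that a file which parses as well-formed XML without a DTD contains no named entities (an undefined entity such as &nbsp; makes the parse fail).

```python
import xml.etree.ElementTree as ET
from pathlib import PurePath

# Extensions named in the article; matching is exact (case-sensitive),
# which is an assumption about how dslibris finds books.
ALLOWED_SUFFIXES = {".xhtml", ".xht"}


def check_xhtml(name, data):
    """Return a list of problems found in one book file.

    name is the file name, data is the raw file content as bytes.
    An empty list means none of the checks below failed.
    """
    problems = []

    if PurePath(name).suffix not in ALLOWED_SUFFIXES:
        problems.append("extension is not .xhtml or .xht")

    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
        return problems  # no point parsing undecodable bytes

    try:
        # Named entities such as &nbsp; are undefined here and raise ParseError.
        ET.fromstring(data)
    except ET.ParseError as err:
        problems.append(f"not well-formed XML: {err}")

    return problems
```

To use it, read a file in binary mode and pass its name and bytes to check_xhtml; any strings it returns describe what to fix.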

Converting from HTML to XHTML

Use HTML Tidy to clean up HTML and convert it to XHTML. An online Tidy service lets you upload an HTML file and get XHTML back.

If you’re using Tidy on the command line, here’s an example:

tidy -asxhtml -utf8 -numeric -o book.xhtml book.html
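If you have a whole folder of HTML files, the same command can be run once per file. This is a rough sketch that assumes tidy is installed and on your PATH; the helper names tidy_command and convert_all are invented for this example.

```python
import subprocess
from pathlib import Path


def tidy_command(src):
    """Build the Tidy command line that converts one HTML file to XHTML."""
    src = Path(src)
    dst = src.with_suffix(".xhtml")  # book.html -> book.xhtml
    return ["tidy", "-asxhtml", "-utf8", "-numeric", "-o", str(dst), str(src)]


def convert_all(folder):
    """Run Tidy on every .html file in folder; return the files that failed."""
    failed = []
    for src in sorted(Path(folder).glob("*.html")):
        result = subprocess.run(tidy_command(src))
        # Tidy exits with 0 on success, 1 on warnings, 2 on errors,
        # so only a status of 2 or more is treated as a failure.
        if result.returncode >= 2:
            failed.append(src)
    return failed
```

Calling convert_all("my_books") would write a .xhtml file next to each .html file and return the ones Tidy could not convert.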

People have also used these programs to save documents as XHTML:

  • Microsoft Word
  • Amaya
  • AbiWord
  • OpenOffice Writer

Converting from PDF to XHTML

This generally doesn’t work. PDF files are laid out for a fixed page size, so they can’t be reliably converted to a form that flows properly on the DS. If you’re willing to copy the text out of a PDF and massage it in a text editor, you can sometimes get a reasonable result.

Converting from TXT to XHTML

As with PDF, plain ASCII text files lack the structural information needed to reformat them reliably. If your source is from Project Gutenberg, the GutenMark project provides programs for generating reasonable HTML from its ASCII text files. That HTML can then go through Tidy as above.

Of course, those who can write HTML could rewrite text files into HTML.
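For simple texts, that rewriting can be partly automated. The sketch below is a deliberately naive illustration, not what GutenMark does: it assumes paragraphs are separated by blank lines, ignores headings and emphasis, and the function name text_to_xhtml is hypothetical.

```python
import html
import re


def text_to_xhtml(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs of plain text in minimal XHTML."""
    # Split on one or more blank lines; drop empty chunks.
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]

    # Rejoin hard-wrapped lines and escape &, < and > in each paragraph.
    body = "\n".join(
        "<p>" + html.escape(" ".join(p.split()), quote=False) + "</p>"
        for p in paragraphs
    )

    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        "<head><title>" + html.escape(title, quote=False) + "</title></head>\n"
        "<body>\n" + body + "\n</body>\n</html>\n"
    )
```

Write the returned string to a file with UTF-8 encoding and a ‘.xhtml’ extension; running the result through Tidy as above is still a sensible final step.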

What a pain! Is there relief in sight?

There are efforts afoot to provide Gutenberg texts in EPUB format, and Feedbooks provides EPUB material. dslibris now supports EPUB, so such books can be read without conversion. Cross-platform tools for generating XHTML and EPUB from other formats are also in the works.

