EPUB converter

Steve Litt slitt at troubleshooters.com
Fri Mar 25 22:09:56 UTC 2022


Pavel Sanda via lyx-users said on Fri, 25 Mar 2022 09:20:15 +0100


>That said, the unfortunate news is that speedy prototyping in external
>python script is probably going to share the fate of other scripts of
>this sort (elyxer being last but not the only one): cool and perhaps
>better than the output we currently have, but in long-term
>uncompatible and abandoned, because it does not internally share the
>codebase with LyX itself.

Or, the LyX project could fulfill their goal of making LyX' native
format an XML dialect, and then any developer could convert a LyX doc
to any other format quite easily.

The trouble is, in the early 2000s the LyX project decided to make the
LyX native format an XML dialect, which greatly increased the
difficulty of parsing a LyX file, but never carried through to making
LyX files well-formed XML. If they had, converting a LyX file to
anything else would be relatively simple. As it stands, it's the worst
of both worlds: difficult to parse because of some XMLisms, yet not
parsable by an XML parser because it's not well-formed XML.

One more thing. In my opinion LyX' HTML export suffers not from
technical deficiencies, but from deficiencies of specification. Please,
don't do us the "favor" of adding appearances to the HTML. Instead,
just pass the styles through as-is, and let people like Ken Kopelson
and me handle conversion of style to appearance, which is done simply
with CSS. When you pass appearances instead of styles into the HTML,
you're doing extra work, and sabotaging us.

And please make the output well-formed XML as well as HTML5. HTML5 can
be, but doesn't have to be, well-formed XML. It's a couple of orders of
magnitude easier to deal with if every opening tag in the exported HTML
has a matching closing tag, and for empty elements that open and close
in one tag (<br/> for instance), be sure to include the trailing slash.
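
A minimal sketch of why the trailing slash matters to a converter,
assuming Python's standard xml.etree.ElementTree as the XML parser. The
helper name is_well_formed and both sample strings are made up for
illustration; they are not LyX output.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples: the same paragraph with and without the slash.
closed = '<html><body><p class="story">One<br/>Two</p></body></html>'
unclosed = '<html><body><p class="story">One<br>Two</p></body></html>'

print(is_well_formed(closed))
print(is_well_formed(unclosed))
```

Both strings are acceptable HTML5, but only the first one can be handed
to an ordinary XML parser.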

Also, please either give the output file an XML DTD/schema that defines
HTML character entities like &nbsp;, or else just output their numeric
equivalents, e.g. &#160;. Ken Kopelson, do you agree with this
paragraph?
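
To illustrate the entity problem: a plain XML parser with no DTD knows
only the five predefined XML entities, so a named HTML entity such as
&nbsp; is a parse error while the numeric form &#160; is fine. The
sketch below assumes Python's standard library; parse_text and
to_numeric are hypothetical helper names, and to_numeric is just one
possible pre-pass, not something LyX provides.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def parse_text(markup):
    """Return the root element's text, or None if the parse fails."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None

def to_numeric(markup):
    """Rewrite named HTML entities as numeric character references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave it untouched
        return f"&#{name2codepoint[name]};"
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, markup)

named = "<p>a&nbsp;b</p>"
print(parse_text(named))               # fails: undefined entity
print(to_numeric(named))               # numeric form of the same text
print(parse_text(to_numeric(named)))   # parses
```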

If you want to go the extra mile in making things easier for people
writing LyXHTML to ePub converters, a nice but by no means necessary
favor you could do us is to output a CSS file listing all the styles in
the document, and perhaps giving some best-guess appearances for each.
Or else make them all big and red, so the self-published author can
easily specify each later on. But please, please, PLEASE, do not throw
in an all-possible-styles CSS file that bloats up our books and is
extremely difficult to deal with. I'd rather personally write an XML
parser that looks at the XHTML5 file and outputs the CSS.
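
As a rough sketch of that last idea, assuming the exported file is
well-formed XHTML5 and that styles arrive as class attributes: collect
every tag/class pair actually used and emit one big-and-red placeholder
rule for each. The function name css_stub, the sample document, and the
placeholder declarations are all invented for illustration.

```python
import xml.etree.ElementTree as ET

def css_stub(xhtml: str) -> str:
    """Emit one placeholder CSS rule per tag.class found in the file."""
    root = ET.fromstring(xhtml)
    selectors = set()
    for element in root.iter():
        tag = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        for cls in element.get("class", "").split():
            selectors.add(f"{tag}.{cls}")
    rule = "{ color: red; font-size: 200%; }"  # obvious placeholder
    return "\n".join(f"{sel} {rule}" for sel in sorted(selectors))

# Hypothetical export containing only the styles the book really uses.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<h1 class="title">T</h1>'
    '<p class="story">a</p><p class="story">b</p>'
    '<p class="standard">c</p>'
    '</body></html>'
)
print(css_stub(sample))
```

Because the rules come from the document itself, the stylesheet holds
only the styles that book uses, never an all-possible-styles list.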

All previous attempts have considered the exported HTML to be the final
file for reading. This is clearly wrong: it's an intermediate file, and
as such it should be very easy to parse (do the slight extra work to
make it well-formed XML) and should carry ONLY styles, no appearances.

And this thing where the standard paragraph environment gets translated
to two different <p/>, one for the first line and one for all the rest
so that the first line isn't indented: please don't. This is easily done
in CSS, and even if it weren't, we converter makers could easily write a
converter program to change it to two different <p/>, AT THE VERY LAST
PASS before outputting the file intended for the reader.
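
A sketch of what such a last pass might look like, assuming well-formed
input without XML namespaces and reading "first" as "a <p/> not directly
preceded by another <p/>". The function name and the class name "first"
are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

def mark_first_paragraphs(parent, first_class="first"):
    """Add first_class to each <p> not directly preceded by a <p>."""
    previous_tag = None
    for child in parent:
        if child.tag == "p" and previous_tag != "p":
            existing = child.get("class", "")
            child.set("class", f"{existing} {first_class}".strip())
        previous_tag = child.tag
        mark_first_paragraphs(child, first_class)  # descend

# Hypothetical intermediate file: styles only, no appearances.
source = (
    '<body><h1>T</h1>'
    '<p class="standard">a</p><p class="standard">b</p>'
    '<h2>S</h2><p class="standard">c</p></body>'
)
root = ET.fromstring(source)
mark_first_paragraphs(root)
print(ET.tostring(root, encoding="unicode"))
```

The intermediate file keeps a single paragraph style throughout, and
the split into two kinds of <p/> happens only in the reader-facing copy.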

And please, don't throw in all sorts of extraneous <div/> elements like
previous attempts have done. If fifteen consecutive paragraphs are in
the, let's say for example, "story" environment, just begin each
paragraph of the output with <p class="story"> instead of putting them
all in a <div class="story"/>. I've seen past LyXHTML go several levels
deep in unnecessary <div/> elements. Life shouldn't be that difficult.
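
For comparison, this is roughly the cleanup a converter writer ends up
doing when the wrappers are present: unwrap every <div/> and push its
class down onto the paragraphs. It is a sketch that assumes well-formed
input without namespaces, and it discards any text sitting directly
inside a <div/>. The name unwrap_divs and the sample are made up.

```python
import xml.etree.ElementTree as ET

def unwrap_divs(parent):
    """Replace each <div> child with its own children, in place."""
    flattened = []
    for child in list(parent):
        unwrap_divs(child)  # flatten nested wrappers first
        if child.tag == "div":
            for grandchild in child:
                # Inherit the wrapper's class unless one is already set.
                if "class" in child.attrib and "class" not in grandchild.attrib:
                    grandchild.set("class", child.get("class"))
                flattened.append(grandchild)
        else:
            flattened.append(child)
    parent[:] = flattened

# Hypothetical export with two levels of wrapper.
nested = (
    '<body><div class="story"><div>'
    '<p>a</p><p class="quote">b</p>'
    '</div></div><p>c</p></body>'
)
root = ET.fromstring(nested)
unwrap_divs(root)
print(ET.tostring(root, encoding="unicode"))
```

None of this is needed if each paragraph simply carries its own class
in the first place.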

Once again, all past attempts at LyXHTML have unnecessarily bitten off
way more than they could chew. Just pass us the styles, and we'll take
care of the style to appearance translation, *at the right time*!

SteveT

Steve Litt 
March 2022 featured book: Making Mental Models: Advanced Edition
http://www.troubleshooters.com/mmm

