Tweaking lib/symbols for XML entities

Richard Kimberly Heck rikiheck at lyx.org
Sun May 10 18:16:09 UTC 2020


On 5/9/20 9:25 PM, Thibaut Cuvelier wrote:
> Dear list,
>
> In order to ensure a valid DocBook entity with math formulae, the
> MathML generator must produce valid XML. Right now, it "only" produces
> valid HTML (which is already quite an achievement!). The difference is
> in the entities: in HTML, you can use many entities, like &sum;. This
> is no longer the case in XML, where you have to define all entities
> (that is, besides &lt;, &gt;, &amp;, &quot;, &apos;). A solution for
> DocBook would be to define the needed entities in the XML document,
> but that would require generating all math formulas, remembering the
> needed entities, and then outputting the mapping at the /beginning/ of
> the XML document.

We do this kind of thing already: The validate() routines collect
various information that needs to be output to the document preamble.
For LaTeX, for example, we need to know whether to load various
packages, so e.g. the various insets tell us what they require. Whether
that's the right way to proceed here is not clear. You'd have, in
effect, to construct the XML and note which entities were used and then
construct it again for actual output. But it certainly could be done.
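
To make the two-pass idea concrete, here is a rough sketch. It is not LyX code: the function names, the small entity table, and the choice of numeric character references as replacement text are all assumptions made for illustration. The first pass scans generated MathML for named entities; the second builds a DOCTYPE internal subset declaring only the ones that were used.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical table: entity name -> replacement text.
// In LyX this information would presumably live in lib/symbols.
static const std::map<std::string, std::string> kEntityTable = {
    {"sum", "&#x2211;"},
    {"int", "&#x222B;"},
    {"alpha", "&#x3B1;"},
};

// Pass 1: record every "&name;" reference that appears in the table.
// The five predefined XML entities are not in the table, so they are skipped.
std::set<std::string> collectEntities(const std::string& xml)
{
    std::set<std::string> used;
    std::string::size_type pos = 0;
    while ((pos = xml.find('&', pos)) != std::string::npos) {
        const std::string::size_type end = xml.find(';', pos);
        if (end == std::string::npos)
            break;
        const std::string name = xml.substr(pos + 1, end - pos - 1);
        if (kEntityTable.count(name))
            used.insert(name);
        ++pos;  // continue scanning just past this '&'
    }
    return used;
}

// Pass 2: emit a DOCTYPE with an internal subset for the collected entities.
std::string makeDoctype(const std::string& root,
                        const std::set<std::string>& used)
{
    std::ostringstream os;
    os << "<!DOCTYPE " << root << " [\n";
    for (const std::string& name : used)
        os << "  <!ENTITY " << name << " \"" << kEntityTable.at(name) << "\">\n";
    os << "]>\n";
    return os.str();
}
```

The declarations would be written before the document body, which is why the entities have to be known before the real output starts.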


> There are mostly two places where these entities are hard-coded in
> LyX: InsetMathDecoration, with only a few entities hard-coded in
> source code;

I should move those to lib/symbols!


> lib/symbols, a much harder thing to change.
>
> Here is what I came up with:
> https://gitlab.com/gadmm/lyx-unstable/-/merge_requests/3/diffs?commit_id=0c0fc7624caad400f22072442f9132291ee3036d#e90e8f11b4a89e64b3c66669958e7af650b2f526.
> It adds a parameter to MathStream to enable outputting XML-valid
> entities. Mappings for InsetMathDecoration are done by slightly
> adapting the data structure. However, for the other entities, I
> hard-coded a mapping in InsetMathSymbol (hundreds of entities…),
> because I could not get my head around lib/symbols. (By the way, in
> this file, are the "x" mappings symbols that are not yet allowed in
> output?)

Yes, the x just means that we don't have anything (at the moment) we can
use for output.


> Would the patch be acceptable as-is?
> Otherwise, could a lib/symbols expert (I've heard that there might be
> one roaming around) help me with this? As I understand it, it would be
> adding a new column in this file to propose an XML entity after the
> HTML one.

It probably would be better to do this in lib/symbols, since otherwise
we have this same kind of information spread out in different places. It
probably wouldn't be that hard to change it. It is read by initSymbols
in MathFactory.cpp. All the 'character' lines would need an extra
column, and this bit of code:

            is >> charid >> fallbackid >> tmp.extra >> tmp.xmlname;

would need to be adapted to read it, with the latexkeys class (in
MathParser.h) picking up an extra member. (That might as well just be a
struct.)
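
A minimal sketch of what reading the extra column might look like. This is a simplified stand-in, not the actual initSymbols code: the struct, the `name` and `font` fields, the new member name `xmlentity`, and the sample line layout are assumptions. Only `charid`, `fallbackid`, `extra` and `xmlname` come from the line quoted above.

```cpp
#include <sstream>
#include <string>

// Simplified stand-in for latexkeys, written as a plain struct.
struct SymbolEntry {
    std::string name;
    std::string font;
    int charid = 0;
    int fallbackid = 0;
    std::string extra;
    std::string xmlname;    // existing column
    std::string xmlentity;  // hypothetical new column for the XML entity
};

// Parse one 'character' line. Returns false if the existing columns are
// missing or malformed.
bool parseSymbolLine(const std::string& line, SymbolEntry& out)
{
    std::istringstream is(line);
    SymbolEntry tmp;
    if (!(is >> tmp.name >> tmp.font >> tmp.charid >> tmp.fallbackid
             >> tmp.extra >> tmp.xmlname))
        return false;
    // Treat the new column as optional so old-format lines still load;
    // "x" follows the existing convention for "nothing usable for output".
    if (!(is >> tmp.xmlentity))
        tmp.xmlentity = "x";
    out = tmp;
    return true;
}
```

Making the new column optional is a design choice for the sketch: it would let the file be converted gradually instead of touching every line at once.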

>
> I also attach two patches for MathStream: the second one is my current
> attempt at implementing XML entities; the first one is about adding
> XML namespace support (and not really related to the question above,
> but the second one relies on it to avoid conflicts when merging).

Am I right that the first patch, as it is, just allows for namespaces
and doesn't actually use them? It's pretty long but seems to be
straightforward, really.

Riki

