Same commands for different unicodes?

Kornel Benko kornel at lyx.org
Sun Feb 20 22:41:11 UTC 2022


On Sun, 20 Feb 2022 23:39:00 +0100
Thibaut Cuvelier <tcuvelier at lyx.org> wrote:

> On Sun, 20 Feb 2022 at 21:53, Thibaut Cuvelier <tcuvelier at lyx.org> wrote:
> 
> > On Sun, 20 Feb 2022 at 17:39, Kornel Benko <kornel at lyx.org> wrote:
> >
> >> On Sun, 20 Feb 2022 17:04:54 +0100
> >> Thibaut Cuvelier <tcuvelier at lyx.org> wrote:
> >>
> >> > On Sun, 20 Feb 2022 at 13:12, Kornel Benko <kornel at lyx.org> wrote:
> >> >
> >> > > In unicodesymbols we find
> >> > >
> >> > > 0x025b "\\textepsilon"            "tipa" ...
> >> > > 0x03b5 "\\textepsilon"      "textgreek" ...
> >> > >
> >> >
> >> > 0x03b5 is a true epsilon
> >> > (https://unicodemap.org/details/0x03B5/index.html), i.e. a letter in the
> >> > Greek alphabet, while 0x025b is only something that looks like an epsilon
> >> > (https://unicodemap.org/details/0x025b/index.html): an IPA symbol, the
> >> > "open-mid front unrounded vowel" (according to
> >> > https://upload.wikimedia.org/wikipedia/commons/8/8f/IPA_chart_2020.svg).
> >> > However, the TIPA package uses \textepsilon to enter this character
> >> > (https://mirror.lyrahosting.com/CTAN/fonts/tipa/tipaman.pdf, page 33), so
> >> > I'm not sure there's anything to correct.
> >> >
> >> >
> >> > > 0x204e "\\textasteriskcentered"   "textcomp" ...
> >> > > 0x2217 "\\textasteriskcentered"   "textcomp" ...
> >> > >
> >> >
> >> > According to Wikipedia (https://en.wikipedia.org/wiki/Asterisk), 0x204e is
> >> > a "low asterisk" and 0x2217 is the "asterisk operator". It looks like
> >> > \textasteriskcentered should output 0x2217 (based on my understanding of
> >> > http://hevea.inria.fr/examples/test/sym.html) and \textasterisklow should
> >> > output 0x204e (https://www.johndcook.com/unicode_latex.html; it's
> >> > recognised by MathGL,
> >> > http://mathgl.sourceforge.net/docs_v1/mathgl_en_10.html, and STIX,
> >> > http://www.ams.org/STIX/bnb/stix-tbl-2006-10-18.asc). I'd say this is a
> >> > mistake in unicodesymbols.
> >> >
> >> > In math mode, these two symbols are found as \ast. I have no idea about
> >> > the semantic difference from the character * (0x002a): probably more the
> >> > operator, because it's usually used as "times" on calculators…
> >>
> >> My problem is more how to handle such cases (there are 44 conflicts in
> >> unicodesymbols).
> >>
> >> Say we search for '⁎' (== 0x204e):
> >> LyX outputs \textasteriskcentered,
> >> and lyxfind.cpp uses '∗' (== 0x2217).
> >>
> >> This means we cannot find this char.
> >>
> >> I am not interested in the meaning of these Unicode chars. The problem
> >> for findadv is that there are LaTeX commands which create different
> >> Unicode chars depending on the phase of the moon.
> >>
> >
> > Based on my understanding of this issue, there will always be some
> > discrepancy, as the mapping depends on the context (text, math, or TIPA,
> > mostly, as I could see). I believe it's hard to mistake the math mapping
> > with the two others, but I don't see a similar way to tell TIPA characters
> > from the others, as it looks like they are entered like normal letters
> > (i.e. not separated like the math mode): the TIPA mapping is surely the
> > best one within an IPA inset, but what about outside? I don't know
> > enough about phonetics (especially typesetting it with LyX) :/.
> >
> > Would you have a script that finds all these occurrences or a list? Maybe
> > quite a few could be resolved like the asterisk.
> >
> 
> Would it be helpful if some duplicate characters were marked as deprecated?
> For \\'\\textalpha, for instance (I guess it's the same for all Greek
> vowels with tonos/oxia), 0x1F71 is disallowed (see line idna2008 in
> https://util.unicode.org/UnicodeJsps/character.jsp?a=1F71), unlike 0x3AC.

That would help. In fact, my script already uses this info, but only very few
codes are marked as such.

	Kornel
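
The conflict listing discussed in this thread can be sketched roughly as
follows. This is not the script mentioned above, which is not shown in the
thread; it is a minimal Python sketch that assumes each unicodesymbols entry
starts with a hexadecimal code point followed by a double-quoted text command,
as in the quoted excerpts. The names ENTRY, find_conflicts and nfc_replacement
are made up for illustration. Using NFC normalization to spot "deprecated"
duplicates is also an assumption, a stand-in for the idna2008 property
mentioned for 0x1F71.

```python
import re
import unicodedata
from collections import defaultdict

# Assumed entry shape, taken from the excerpts quoted in this thread:
#   0x025b "\\textepsilon"   "tipa" ...
ENTRY = re.compile(r'^(0x[0-9a-fA-F]+)\s+"((?:[^"\\]|\\.)*)"')


def find_conflicts(lines):
    """Return {text command: [code points]} for commands used more than once."""
    by_command = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line.strip())
        if match is None or not match.group(2):
            continue  # comment, blank line, or entry without a text command
        by_command[match.group(2)].append(int(match.group(1), 16))
    return {cmd: cps for cmd, cps in by_command.items() if len(cps) > 1}


def nfc_replacement(code_point):
    """Return the single code point NFC maps to, or None if NFC keeps it."""
    normalized = unicodedata.normalize("NFC", chr(code_point))
    if len(normalized) == 1 and ord(normalized) != code_point:
        return ord(normalized)
    return None


# The four entries quoted in the thread; the trailing "..." is left elided.
sample = [
    r'0x025b "\\textepsilon"            "tipa" ...',
    r'0x03b5 "\\textepsilon"      "textgreek" ...',
    r'0x204e "\\textasteriskcentered"   "textcomp" ...',
    r'0x2217 "\\textasteriskcentered"   "textcomp" ...',
]

conflicts = find_conflicts(sample)
for command, code_points in conflicts.items():
    print(command, [hex(cp) for cp in code_points])
```

With this approach, a code point such as 0x1F71 can be flagged as a duplicate
candidate because nfc_replacement(0x1F71) yields 0x3AC, while pairs such as
0x204e/0x2217 are not related by normalization and still need a manual
decision.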