A fast mostly collision free hash ?
Jean-Marc Lasgouttes
lasgouttes at lyx.org
Wed Nov 9 11:32:30 UTC 2022
On 08/11/2022 at 23:06, Thibaut Cuvelier wrote:
> Probably, the less fancy option is to use std::hash, available since
> C++11. I have no idea about the quality of the produced hashes, and it
> seems it might really depend on the compiler too.
This seems hazardous.
> I can totally understand the need for very unique hashes in this case
> (unlike hash tables). If you want something with unicity guarantees, the
> only meaningful choice is cryptographic hashes: other functions do not
> have the right properties (easy to compute, but no real guarantee with
> small hashes). Something like BLAKE2/3 should be good, performance-wise,
> and it's available in Qt since Qt 6
> (https://doc.qt.io/qt-6/qcryptographichash.html#Algorithm-enum);
> otherwise, Keccak since 5.9; or SHA-3 before.
Would SHA1 work? I do not care that it has been shown to be fragile,
since I do not have an opponent here. At least it also exists in Qt5
(along with md4 and md5, which are not usable, if I understand correctly).
> Otherwise, you might consider using several hashes (concatenating them):
> I think it should provide enough entropy to have a very low probability
> of collision, but the hash algorithms must be really different for this
> to be worthwhile. (Just starting with std::hash and the string length,
> then see if more is required?)
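A minimal sketch of the "std::hash plus string length" idea quoted above, using only the standard library. The names CompositeKey and composite_key are hypothetical, not taken from the LyX sources, and the quality of std::hash itself remains implementation-dependent.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical key type: the std::hash value paired with the string length.
// Two strings get the same key only if BOTH components match.
using CompositeKey = std::pair<std::size_t, std::size_t>;

// Hypothetical helper: build the composite key for a string.
CompositeKey composite_key(std::string const & s)
{
    return {std::hash<std::string>{}(s), s.size()};
}
```

The length adds very little entropy on its own: it only rules out collisions between strings of different sizes.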
Yes, but how do I see that more is required? By getting a bug report
saying that some particular document is not rendered correctly? I'd
rather avoid that.
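One way to estimate the risk up front, instead of waiting for a bug report, is the usual birthday approximation. The sketch below assumes an ideal hash whose values are uniformly distributed, which std::hash does not guarantee. The function name collision_probability is hypothetical.

```cpp
#include <cmath>

// Approximate probability that at least two of n distinct inputs share a
// hash value, for an ideal uniform hash of `bits` bits:
//   p ~= 1 - exp(-n(n-1) / (2 * 2^bits))
// expm1 keeps precision when the probability is tiny.
double collision_probability(double n, int bits)
{
    double const pairs = n * (n - 1.0) / 2.0;
    return -std::expm1(-pairs / std::ldexp(1.0, bits));
}
```

For example, collision_probability(1e6, 64) gives the estimate for a million items with a 64-bit hash, and collision_probability(1e6, 160) the one for a SHA1-sized hash.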
JMarc