A fast mostly collision free hash ?

Jean-Marc Lasgouttes lasgouttes at lyx.org
Wed Nov 9 11:32:30 UTC 2022


On 08/11/2022 at 23:06, Thibaut Cuvelier wrote:
> Probably, the less fancy option is to use std::hash, available since 
> C++11. I have no idea about the quality of the produced hashes, and it 
> seems it might really depend on the compiler too.

This seems hazardous.

> I can totally understand the need for very unique hashes in this case 
> (unlike hash tables). If you want something with unicity guarantees, the 
> only meaningful choice is cryptographic hashes: other functions do not 
> have the right properties (easy to compute, but no real guarantee with 
> small hashes). Something like BLAKE2/3 should be good, performance-wise, 
> and it's available in Qt since Qt 6 
> (https://doc.qt.io/qt-6/qcryptographichash.html#Algorithm-enum); 
> otherwise, Keccak since 5.9; or SHA-3 before.

Would SHA1 work? I do not care that it has been proven fragile, since I 
do not have an opponent here. At least it also exists in Qt5 (along with 
MD4 and MD5, which are not usable, if I understand correctly).
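
Without an opponent, only accidental collisions matter, and for an ideal 
hash those follow the birthday bound. Here is a rough sketch of that 
estimate, assuming uniformly distributed hash values (real functions only 
approximate this). The function name collisionProbability and the figure 
of one million strings used below are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Approximate probability that at least two of n distinct inputs get the
// same value from an ideal, uniformly distributed hash of `bits` bits.
// Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).
// Hypothetical helper, not part of LyX or Qt.
double collisionProbability(double n, int bits)
{
    assert(n >= 1.0 && bits > 0);
    // Number of unordered pairs of inputs.
    double const pairs = n * (n - 1.0) / 2.0;
    // std::ldexp(1.0, bits) is 2^bits. -expm1(-x) evaluates 1 - exp(-x)
    // without losing precision when x is tiny.
    return -std::expm1(-pairs / std::ldexp(1.0, bits));
}
```

With a million distinct strings, collisionProbability(1e6, 32) is 
essentially 1, collisionProbability(1e6, 64) is a few times 1e-8, and at 
the 160 bits of SHA1 the value is far below anything that could be 
observed in practice.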

> Otherwise, you might consider using several hashes (concatenating them): 
> I think it should provide enough entropy to have a very low probability 
> of collision, but the hash algorithms must be really different for this 
> to be worthwhile. (Just starting with std::hash and the string length, 
> then see if more is required?)

Yes, but how do I see that more is required? Because I get a bug 
report that some particular document is not rendered correctly? I'd 
rather avoid that.
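
For reference, the concatenation idea quoted above could look roughly like 
the sketch below. It is only an illustration: CompositeKey, makeKey and 
fnv1a64 are hypothetical names, and FNV-1a is my own pick for a second, 
unrelated hash. Note that the value of std::hash is implementation-defined 
and is only required to be consistent within one run of the program, so 
such a key should not be stored on disk. Combining hashes lowers the 
probability of a collision but gives no guarantee.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// 64-bit FNV-1a: a simple non-cryptographic hash with a fully specified
// output, unlike std::hash.
std::uint64_t fnv1a64(std::string const & s)
{
    std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL; // FNV prime
    }
    return h;
}

// Hypothetical composite key: two unrelated hashes plus the string length.
struct CompositeKey {
    std::uint64_t std_hash;
    std::uint64_t fnv_hash;
    std::uint64_t length;

    bool operator==(CompositeKey const & o) const
    {
        return std_hash == o.std_hash && fnv_hash == o.fnv_hash
            && length == o.length;
    }
};

// Build the key for a string. Two strings collide only if all three
// components agree.
CompositeKey makeKey(std::string const & s)
{
    return CompositeKey{
        static_cast<std::uint64_t>(std::hash<std::string>{}(s)),
        fnv1a64(s),
        static_cast<std::uint64_t>(s.size())
    };
}
```

Two strings would then be treated as identical when makeKey(a) == makeKey(b). 
The only way to be certain is still to compare the strings themselves 
whenever the keys match.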

JMarc


