Given a set of 100 different strings of equal length, how can you quantify the probability that a SHA1 digest collision for the strings is unlikely... ?

clarify, how can you have 'different but equal length' strings?– KevinDTimmDec 8 '09 at 14:13

5@kevindtimm "a", "b", "c" are equal length but different strings– Joe PhillipsDec 8 '09 at 14:16

4@anthony, why is that obvious? I don't know if that's true– Joe PhillipsDec 8 '09 at 14:19

3doh, upon rereading, it's perfectly clear.– KevinDTimmDec 8 '09 at 15:17

1This is relevant to this question: sites.google.com/site/itstheshappening At least one collision has been found.– Christophe De TroyerOct 8 '15 at 21:44
Are the 160 bit hash values generated by SHA1 large enough to ensure the fingerprint of every block is unique? Assuming random hash values with a uniform distribution, a collection of n different data blocks and a hash function that generates b bits, the probability p that there will be one or more collisions is bounded by the number of pairs of blocks multiplied by the probability that a given pair will collide.
(source : http://bitcache.org/faq/hashcollisionprobabilities)

13In conclusion, the likelihood of a collision is in the order of 10^45. That is very, very unlikely. Dec 8 '09 at 14:41

4But SHA1 is not uniform distribution, it could be bigger than this upper bound. I doubt that this equation is not correct. AS least the EQUAL.– KamelApr 11 '15 at 2:12

1This answer doesn't take into account the chinese discovery in 2005 where they are able to produce collisions in 2^69 iterations rather than the 2^80 projected by brute force schneier.com/blog/archives/2005/02/sha1_broken.html– DjaridApr 18 '16 at 14:09

5@Djarid It's important not to confound accidental hash collision and adversarial collision hunting. The former is the probability that the hash of two items will collide, and follows the formula above (although, as noted by Kamel, the distribution is not perfectly uniform and thus the probability is likely higher). The latter is used to intentionally try to find collisions, and relies on discovering and exploiting weaknesses in the hash. Cryptographic hashes attempt to be robust against such attacks, but often they are overkill for simpler hashing applications (not transmitting secrets).– Pierre DFeb 19 '17 at 18:01

that formula is accurate when 2^b >> n^2 (and when 2^b very big). I know it is the mayority (if not all) the cases for sha1... but for the record!– MrIoDec 8 '20 at 22:21
Well, the probability of a collision would be:
1  ((2^160  1) / 2^160) * ((2^160  2) / 2^160) * ... * ((2^160  99) / 2^160)
Think of the probability of a collision of 2 items in a space of 10. The first item is unique with probability 100%. The second is unique with probability 9/10. So the probability of both being unique is 100% * 90%
, and the probability of a collision is:
1  (100% * 90%), or 1  ((10  0) / 10) * ((10  1) / 10), or 1  ((10  1) / 10)
It's pretty unlikely. You'd have to have many more strings for it to be a remote possibility.
Take a look at the table on this page on Wikipedia; just interpolate between the rows for 128 bits and 256 bits.
That's Birthday Problem  the article provides nice approximations that make it quite easy to estimate the probability. Actual probability will be very very very low  see this question for an example.