An MD5 hash value is a 128-bit value that the MD5 algorithm computes from the contents of a file. Identical contents always produce the same hash value, and it is very unlikely that two different documents would produce the same 128-bit value by chance. By comparing the hash values of a set of files against each other, you can therefore identify duplicate documents, emails, etc.
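The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used by any particular tool: the function names `md5_of_bytes` and `find_duplicates` and the sample document names are hypothetical, and the documents are assumed to be already loaded into memory as bytes. Only `hashlib.md5` from the Python standard library is relied on.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def md5_of_bytes(data: bytes) -> str:
    """Return the MD5 hash of `data` as 32 hex characters (128 bits)."""
    return hashlib.md5(data).hexdigest()


def find_duplicates(documents: Dict[str, bytes]) -> List[List[str]]:
    """Group document names whose contents have the same MD5 hash.

    `documents` maps a (hypothetical) document name to its raw contents.
    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name, content in documents.items():
        groups[md5_of_bytes(content)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    # Hypothetical sample data: two identical emails and one different memo.
    sample = {
        "email_001.msg": b"Meeting moved to 3pm.",
        "email_002.msg": b"Meeting moved to 3pm.",
        "memo.txt": b"Quarterly results attached.",
    }
    for group in find_duplicates(sample):
        print("Duplicates:", ", ".join(group))
```

Note that the hash is computed from the contents only, so two files with different names but identical bytes are reported as duplicates, while files that differ by even a single byte are not.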