A fingerprint is a signature of a document: a representative subset of the hash values computed from that document. For more detail, see "Winnowing: Local Algorithms for Document Fingerprinting" (specifically Figure 2).
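To make "representative subset of hash values" concrete, here is a minimal sketch of the window-minimum selection step described in the winnowing paper. It is an illustration only, not this module's implementation; the function name `winnow` and the sample hash sequence are assumptions made for the example.

```python
def winnow(hashes, window_len):
    """Pick the minimum hash from each window (rightmost on ties).

    Hypothetical helper for illustration; not part of the fingerprint module.
    Returns a list of (position, hash) pairs.
    """
    selected = []
    for start in range(len(hashes) - window_len + 1):
        window = hashes[start:start + window_len]
        smallest = min(window)
        # Offset of the rightmost occurrence of the minimum in this window.
        offset = window_len - 1 - window[::-1].index(smallest)
        pick = (start + offset, smallest)
        # Record a hash only once, even if it stays the minimum
        # across several consecutive windows.
        if not selected or selected[-1] != pick:
            selected.append(pick)
    return selected


# Illustrative sequence of k-gram hashes.
sample = [77, 74, 42, 17, 98, 50, 17, 98, 8, 88, 67, 39, 77, 74, 42, 17, 98]
print(winnow(sample, 4))
# [(3, 17), (6, 17), (8, 8), (11, 39), (15, 17)]
```

Only 5 of the 17 hashes are kept, yet every window of 4 consecutive hashes contributes at least one of them, which is what makes the subset representative.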
The recommended way to install the fingerprint module is to use pip:
$ pip install fingerprint
Fingerprint officially supports Python >= 3.0.
>>> import fingerprint
>>> fprint = fingerprint.Fingerprint(kgram_len=4, window_len=5, base=10, modulo=1000)
>>> fprint.generate(str="adorunrunrunadorunrun")
>>> fprint.generate(fpath="../CHANGES.txt")
The default values for the parameters are:
kgram_len = 50
window_len = 100
base = 101
modulo = sys.maxint (Python 3 removed sys.maxint; sys.maxsize is the usual replacement)
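As a rough guide to what these parameters control, the sketch below hashes every k-gram of a string with a polynomial hash, where `kgram_len` sets the substring length, `base` is the multiplier, and `modulo` bounds the hash values. The exact formula used by the module is not stated here, so treat `kgram_hashes` as a hypothetical helper written under that assumption.

```python
def kgram_hashes(text, kgram_len, base, modulo):
    """Polynomial hash of every kgram_len-character substring of text.

    Hypothetical helper; the module's own hash function may differ.
    """
    hashes = []
    for i in range(len(text) - kgram_len + 1):
        h = 0
        for ch in text[i:i + kgram_len]:
            # Horner's rule, reduced modulo `modulo` at every step.
            h = (h * base + ord(ch)) % modulo
        hashes.append(h)
    return hashes


hashes = kgram_hashes("adorunrunrunadorunrun", kgram_len=4, base=10, modulo=1000)
print(len(hashes))  # 18 k-grams in a 21-character string
```

Identical k-grams always produce identical hashes (for example, both occurrences of "ador" above), which is what lets fingerprints from two documents be compared. A larger `modulo` lowers the chance that different k-grams collide.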