Find duplicate PDF files by hash

I'd like duplicate PDF files to be detected by their hash (e.g. MD5). Also, if hashes were computed for all references and stored in a hash:DOI database, a citation could be identified from the hash of its PDF file alone.
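To make the request concrete, here is a rough sketch in Python of how this might work. It is only an illustration, not a proposal for the actual implementation: the function names (`file_hash`, `find_duplicate_pdfs`, `identify_by_hash`) and the plain dict used as the hash:DOI database are assumptions of mine, and it uses only the standard library.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_hash(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_pdfs(root):
    """Group the PDF files under `root` by content hash; keep groups of 2+."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*.pdf")):
        if path.is_file():
            by_hash[file_hash(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}


def identify_by_hash(path, hash_to_doi):
    """Look up a file's DOI in a hash -> DOI mapping (hypothetical database).

    Returns None when the hash is not in the mapping.
    """
    return hash_to_doi.get(file_hash(path))
```

Note that this only catches byte-identical files. Two copies of the same paper whose bytes differ in any way will hash differently, so the hash:DOI lookup would only help for files that exactly match an entry already in the database.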

