I need a hash function in MATLAB that lets me specify the length of the generated hash. For example, MD5 always generates a 128-bit hash. However, I need to define various hash functions with chosen output lengths such as 10, 16, 20, ...
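One common way to get an arbitrary output length is to take a hash with extendable output and keep only the leading bits. The sketch below is in Python rather than MATLAB and only illustrates the idea; it assumes the lengths are meant in bits, and the function name `hash_bits` is made up for this example. Note that very short hashes (10 to 20 bits) collide often by design.

```python
import hashlib


def hash_bits(data: bytes, n_bits: int) -> int:
    """Return an n_bits-wide hash of data as a non-negative integer.

    Hypothetical helper: uses SHAKE-128, an extendable-output function
    that can emit any number of bytes, then trims to the exact bit count.
    """
    if n_bits <= 0:
        raise ValueError("n_bits must be positive")
    n_bytes = (n_bits + 7) // 8  # round up to whole bytes
    digest = hashlib.shake_128(data).digest(n_bytes)
    value = int.from_bytes(digest, "big")
    # Drop the surplus low bits when n_bits is not a multiple of 8.
    return value >> (n_bytes * 8 - n_bits)


if __name__ == "__main__":
    for width in (10, 16, 20):
        print(width, hash_bits(b"hello world", width))
```

Truncating the output of a fixed-length hash such as MD5 or SHA-256 to the first `n_bits` bits would follow the same pattern.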
I've heard MD5 hashes can be used to compare contents of, e.g., a file. The MimePart interface in the JavaMail library also contains a setContentMD5() method, but I couldn't find an example of how to use it. Can I use it to compare email content using the hashes (and verify there was no loss of data during the download)? Of which part should I then generate the MD5 hash?
Also, the getContentMD5() method doesn't work when I use IMAP (although the header is actually present).
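For reference, the Content-MD5 header defined in RFC 1864 carries the Base64 encoding of the 128-bit MD5 digest of the content in its canonical form, before any content transfer encoding is applied. The following is a minimal Python sketch of that computation, not JavaMail code; the function names are hypothetical, and it assumes `body` already holds the decoded, canonical bytes of the part.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64-encoded MD5 digest, the format used by a Content-MD5 header.

    Assumption: body is the decoded content in canonical form.
    """
    digest = hashlib.md5(body).digest()  # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")


def matches_header(body: bytes, header_value: str) -> bool:
    """Compare a locally computed digest with a received header value."""
    return content_md5(body) == header_value.strip()


if __name__ == "__main__":
    body = b"Hello, world!\r\n"
    header = content_md5(body)
    print(header, matches_header(body, header))
```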
My understanding is that SSL combines an encryption algorithm (like AES, DES, etc.) with a key exchange method (like Diffie-Hellman) to provide secure encryption and identification services between two endpoints on an insecure network (like the Internet).
My understanding is that SASL is an MD5/Kerberos protocol that pretty much does the same thing.
So my question: what are the pros/cons of each, and what scenarios make one preferable to the other? Basically, I'm looking for some guidelines to follow when deciding whether to use SSL or to go with SASL instead. Thanks in advance!
I'm developing a back-end application for a search system. The search system copies files to a temporary directory and gives them random names. Then it passes the temporary files' names to my application. My application must process each file within a limited period of time, otherwise it is shut down; that's a watchdog-like security measure. Processing a file is likely to take a long time, so I need to design the application to handle this scenario. If my application gets shut down, then the next time the search system wants to index the same file it will likely give it a different temporary name.
The obvious solution is to provide an intermediate layer between the search system and the backend. It will queue the request to the backend and wait for the result to arrive. If the request times out in the intermediate layer - no problem, the backend will continue working, only the intermediate layer is restarted and it can retrieve the result from the backend when the request is later repeated by the search system.
The problem is how to identify the files. Their names change randomly. I intend to use a hash function like MD5 to hash the file contents. I'm well aware of the birthday paradox and used an estimation from the linked article to compute the probability. If I assume I have no more than 100,000 files, the probability of two files having the same MD5 (128-bit) is about 1.47 × 10^-29.
Should I care about such a collision probability, or just assume that equal hash values mean equal file contents?
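The estimate above can be reproduced with the usual birthday approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n items and a b-bit hash. The sketch below computes it and also shows one way to hash a file's contents so that the temporary name does not matter. The function names are hypothetical, and the estimate assumes hash values are uniformly distributed (that is, nobody is deliberately crafting colliding files).

```python
import hashlib
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one collision among n_items."""
    pairs = n_items * (n_items - 1) / 2
    # -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 of a file's contents, read in chunks to bound memory use."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


if __name__ == "__main__":
    print(collision_probability(100_000, 128))
```

For 100,000 files and 128 bits this gives a value of roughly 1.47e-29, matching the figure quoted in the question.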
Do you know of any library that allows me to compare two files based on their forensic signature?
I don't just want to know whether two files are unequal, as MD5 would tell me; instead I want to know how similar two files are in a forensic sense.
I want to detect similar areas inside the file, even if they are at different locations in both.
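To make the requirement concrete, here is a deliberately simple Python sketch that scores similarity by comparing the sets of k-byte windows ("shingles") of two byte strings, so shared regions count even when they sit at different offsets. This is not a forensic fuzzy hash and not any particular library's algorithm; the names and the window size `k = 8` are assumptions for illustration, and holding every window in memory only suits small inputs.

```python
def shingles(data: bytes, k: int = 8) -> set:
    """Set of all k-byte windows in data (empty if data is shorter than k)."""
    return {data[i:i + k] for i in range(len(data) - k + 1)}


def similarity(a: bytes, b: bytes, k: int = 8) -> float:
    """Jaccard similarity of the two shingle sets, in the range 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        # Inputs too short to form a window: fall back to exact equality.
        return 1.0 if a == b else 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    block = bytes(range(256))
    # The same block appears at different offsets in the two inputs.
    print(similarity(b"HEADER" + block, block + b"TRAILER"))
```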