Searchable Encryption through Dispersion
Institute of Electrical and Electronics Engineers (IEEE)
IEEE Latin America Transactions
Cryptography is the universal tool for protecting the privacy of data. Today's cryptography still requires encrypted data to be decrypted before it can be searched. We propose an alternative way of protecting the privacy of data: dispersing a compressed version of the original data that can be searched without recovering the original. Our scheme compresses the original data and then generates several chunks that are stored at different nodes, each in the form of an index. To search for a string, we convert it into chunks with the same scheme and have each site consult its index to obtain a list of all possible positions where the search string might be found. These local results are sent to the user, who intersects them to find all likely positions of the search string in the original text. The user can then decrypt only those parts or records to obtain all parts or records where the search string occurs. Our scheme has no false negatives: all occurrences of the search string are found. Using a corpus of English-language texts, we show that precision approaches 100% for longer search strings. We also show that the chunks are somewhat, but not quite, similar to random bit streams, and that each individual stream has less information content than a typical English-language stream of the same length.
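The search flow summarized above (per-site index lookup followed by intersection at the user) can be illustrated with a minimal sketch. It is not the scheme of the paper, whose compression and chunking are not specified in this abstract. As stand-in assumptions, each character is reduced to the low 6 bits of its code (a lossy "compression") and split into three 2-bit chunks, one per site. The names `chunk`, `build_index`, `site_search`, and `search` are hypothetical.

```python
from collections import defaultdict

NUM_SITES = 3        # assumption: number of storage nodes
BITS_PER_CHUNK = 2   # assumption: each site keeps 2 bits per character


def chunk(ch, site):
    """Return the 2-bit piece of the character code assigned to this site."""
    return (ord(ch) >> (BITS_PER_CHUNK * site)) & ((1 << BITS_PER_CHUNK) - 1)


def build_index(text, site):
    """Map each chunk value to the set of text positions where it occurs."""
    index = defaultdict(set)
    for pos, ch in enumerate(text):
        index[chunk(ch, site)].add(pos)
    return index


def site_search(index, query, site):
    """Return start positions where every chunk of the query matches locally."""
    candidates = None
    for offset, ch in enumerate(query):
        positions = index.get(chunk(ch, site), set())
        starts = {pos - offset for pos in positions if pos >= offset}
        candidates = starts if candidates is None else candidates & starts
    return candidates if candidates is not None else set()


def search(indexes, query):
    """Intersect the candidate lists returned by all sites (done by the user)."""
    results = [site_search(idx, query, site) for site, idx in enumerate(indexes)]
    return sorted(set.intersection(*results))


if __name__ == "__main__":
    text = "the cat sat on the mat"
    indexes = [build_index(text, site) for site in range(NUM_SITES)]
    print(search(indexes, "at"))  # candidate start positions of "at"
```

In this toy version a true occurrence matches at every site, so it always survives the intersection (no false negatives). Because the top bits of each character code are discarded, characters such as "a" and "!" become indistinguishable, so a single-character query can return false positives; longer queries are less likely to collide at every offset, which is consistent with the precision trend reported above.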