The book "Probabilistic Data Structures and Algorithms in Big Data Applications" is now available at Amazon and from local bookstores. More details at https://pdsa.gakhov.com
In this presentation, I described popular algorithms that employ Locality-Sensitive Hashing (LSH) to solve similarity-related problems. I started with LSH in general and then moved on to MinHash (LSH for Jaccard similarity) and SimHash (LSH for cosine similarity). Each approach came with the math behind it and simple examples to clarify the theory.
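
To make the two ideas concrete, here is a minimal Python sketch of both, written for illustration and not taken from the slides. The helper names (`_hash64`, `minhash_signature`, `estimate_jaccard`, `simhash`, `hamming_distance`), the seeded BLAKE2b hash family, and the parameter defaults are all assumptions of this sketch. SimHash is shown in its common unweighted token-fingerprint form.

```python
import hashlib


def _hash64(token: str, seed: int) -> int:
    """Hypothetical helper: a seeded 64-bit hash built on BLAKE2b."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """MinHash: for each seeded hash function keep the minimum hash value.

    `items` must be a non-empty collection of strings.
    """
    return [min(_hash64(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def simhash(tokens, bits=64):
    """SimHash: every token votes +1/-1 on each bit; the sign gives the bit."""
    counts = [0] * bits
    for token in tokens:
        h = _hash64(token, 0)
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a smaller distance suggests a smaller angle."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    a = {"locality", "sensitive", "hashing", "minhash"}
    b = {"locality", "sensitive", "hashing", "simhash"}
    est = estimate_jaccard(minhash_signature(a), minhash_signature(b))
    print("estimated Jaccard:", est, "exact:", len(a & b) / len(a | b))
    print("SimHash Hamming distance:", hamming_distance(simhash(a), simhash(b)))
```

In both cases the full objects are replaced by short sketches: comparing MinHash signatures position by position approximates Jaccard similarity, and the Hamming distance between SimHash fingerprints serves as a proxy for the angle between the underlying vectors. More hash functions (or bits) give a tighter estimate at the cost of a larger sketch.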