Grokking Algorithms An illustrated guide for programmers and other curious people, by Aditya Y. Bhargava
Chapter 11: Where to go next
If you read this chapter, please go ahead and discuss what you thought in the comments!
📝 This is the last chapter in the book, and is a summary of 10 algorithms for further optional reading. You can read my notes below.
Where to go Next
key topics covered: suggestions
, binary search tree
, The Fourier Transform
, Parallel algorithms
, MapReduce
, Bloom Filters
, HyperLogLog
, SHA
, Diffie-Hellman
, key exchange
, locality-sensitive
, hashing
, linear programming
Binary Search Tree
Good for learning about databases
and data structures
A binary search tree is faster for insertions and deletions than a sorted array in average, because a binary search tree allows you to find or insert in a more granular but sorted way. They don't allow random access, and the good performance times are based on a balanced tree (too many nodes on one side will be imbalanced, and performance will be affected - red-black trees are self-balancing so that might be a good option).
Inverted Indexes
Good for learning about search
This data structur is a hash that maps words to places where they appear, and is commonly used in search engines.
The Fourier Transform
Good for compression
and format
s like MP3 or JPG, and music tech
This algorithm can take a digital music signal & break it into individual frequencies so that the file can be reduced (like in MP3) or frequecies can be boosted or reduced in music production. It also works for images in the same way, and can be used in applications like earthquake prediction and DNA analysis, with those digital signals.
Parallel algorithms
Good for speed optimisation
These allow you to spread work out in parallel so that it can be completed sooner. But they're hard to design, test, and predict the cost benefit of as the time gains aren't linear.
MapReduce
Good for speed and load balancing
This is a popular distributed algorithm (a subtype of parralel algorithm). Distributed algos are useful when you have a lot of work to do as they will cut out a lot of time to do it by spreading work across multiple machines. an example is Apache Hadoop.
Bloom filters
Good for quick approximations
These are probabilistic data structures (they give you a likely correct answer but it could be a false positive). This is an alternative to a hash
(which will give you the correct answer) but for a large hash index might not be practical, and a bloom filter
approximation might be good enough.
HyperLogLog
Good for quick approximations
A similar probabilistic algorithm that approximates the number of unique elements in a set.
SHA algorithms
Good for unbiased comparison
This is a secure hash algorithm (a SHA
is a hash function, which generates a string in this context called a hash
). Unlike has functions which goes from (e.g.) string to array index, SHA goes from string to string. You can use the SHA to tell if objects, files or strings (like passwords) are the same without revealing the original content. [note: this is not the gold-standard for password-hashing]
Localilty-sensitive hashing
Good for biased comparison
SHA is locality insensitive - similar starting strings will not have similar hashes. A locality-sensitive hash such as Simhash
will generate a string that is similarly represented so that you can compare two hashes for similarity of content - e.g. for plagiarism checks.
Diffie-Hellman key exchange
Good for cryptography
The Diffie-Hellman algorithm is used for message encryption. It has a public key for sharing which is used to encrypt a message, and a private key which can decode the message and is 'owned' by one person. It's hard to decode without the private key and the cipher does not need to be communicated so it is still used in practise alongside RSA.
Linear programming
Good for optimization
Graph algorithms in the book are a subset of linear programming, which uses the complex Simplex algorithm.
Discussion for this chapter
- Are there any algorithms you're going to investigate that aren't recommended in this list or in the book?
- Have you used any of these 10 algorithms? Or, did you consider any of these but opted for a different algorithm instead?
Discussion for the entire book
Please visit the discussion questions post to reflect on the book, and add your answers to the comments along wit any of your own questions or reflections you have.