Often when working with collections of data, you may want to find the smallest or largest item. It's easy enough to write a function that iterates through the items and returns the smallest or largest one, or use the builtin min()
, max()
, or sorted()
functions. Another interesting way may be implementing a heap (priority) queue.
Python provides a pretty convenient module called heapq
that does that for you. heapq
comes with a cool set of inbuilt functions that you can read more about in the docs. For this short post, I just want to show how you can easily find the smallest and largest items in a collection
Finding the smallest items
Find 3 of the smallest items using the nsmallest
function:
import heapq
numbers = [9, 45, 21, 4, 63, 3, 109]
print(heapq.nsmallest(3, numbers)) # [3, 4, 9]
Also, converting a list into a heap using the heapify
function automatically sets the smallest item as the first:
import heapq
numbers = [9, 45, 21, 4, 63, 3, 109]
heapq.heapify(numbers)
print(numbers)
# Output: [3, 4, 9, 45, 63, 21, 109]
You can then pop from the heap with heappop()
:
heapq.heappop(numbers) # 3
print(numbers)
# Output: [4, 45, 9, 109, 63, 21]
heapq.heappop(numbers) # 4
print(numbers)
# Output: [9, 45, 21, 109, 63]
Finding the largest items
Find 3 of the largest items using the nlargest
function:
import heapq
numbers = [9, 45, 21, 4, 63, 3, 109]
print(heapq.nlargest(3, numbers)) # [109, 63, 45]
A more practical example
import heapq
people = [
{'fname': 'John', 'lname': 'Doe', 'age': 30},
{'fname': 'Jane', 'lname': 'Doe', 'age': 25},
{'fname': 'Janie', 'lname': 'Doe', 'age': 10},
{'fname': 'Jane', 'lname': 'Roe', 'age': 22},
{'fname': 'Johnny', 'lname': 'Doe', 'age': 12},
{'fname': 'John', 'lname': 'Roe', 'age': 45}
]
oldest = heapq.nlargest(2, people, key=lambda s: s['age'])
print(oldest)
# Output: [{'fname': 'John', 'lname': 'Roe', 'age': 45}, {'fname': 'John', 'lname': 'Doe', 'age': 30}]
youngest = heapq.nsmallest(2, people, key=lambda s: s['age'])
print(youngest)
# Output: [{'fname': 'Janie', 'lname': 'Doe', 'age': 10}, {'fname': 'Johnny', 'lname': 'Doe', 'age': 12}]
It should be noted though that the nsmallest(n, iterable)
and nlargest(n, iterable)
functions perform best for smaller values of n
. For larger values, it is more efficient to use the sorted()
function. Also, when n==1
, it is more efficient to use the builtin min()
and max()
functions.