Questions
What is the difference between a `dict` and a `defaultdict` in Python?
The Scenario
You are a backend engineer at a social media company. You are writing a new service that needs to count the number of times each word appears in a large text file.
You are considering using either a dict or a defaultdict to store the word counts.
The Challenge
Explain the difference between a dict and a defaultdict in Python. What are the pros and cons of each approach, and which one would you choose for this use case?
A junior engineer might just use a `dict` and write a lot of boilerplate code to check if a key exists before incrementing its value. They might not be aware of the `defaultdict` class, which is a much more elegant solution.
A senior engineer would recognize `defaultdict` as a natural fit for this job. They could explain how it differs from a `dict`, including the trade-offs, and write a concise solution with it.
Step 1: Understand the Key Differences
| Feature | dict | defaultdict |
|---|---|---|
| Missing Keys | `d[key]` raises a `KeyError` if the key does not exist. | `d[key]` calls `default_factory()`, stores the result under that key, and returns it. `.get()` and `in` do not trigger this. |
| Syntax | `{}` or `dict()` | `defaultdict(default_factory)`, imported from `collections` |
| Use Cases | When you want to handle missing keys explicitly. | When every missing key should start from the same default value (for example `0` or an empty list). |
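The missing-key behavior in the table can be checked directly. The following is a minimal sketch; the variable names and the sample keys are made up for illustration.

```python
from collections import defaultdict

# A plain dict raises KeyError when a missing key is read with [].
plain = {}
try:
    plain["missing"]
except KeyError:
    plain_raised = True
else:
    plain_raised = False

# A defaultdict calls its factory instead: int() returns 0.
counts = defaultdict(int)
value = counts["missing"]  # stores 0 under "missing" and returns it

# .get() and `in` do not call the factory and do not insert anything.
looked_up = counts.get("other")
has_other = "other" in counts

print(plain_raised)          # True
print(value)                 # 0
print(dict(counts))          # {'missing': 0}
print(looked_up, has_other)  # None False
```

Note that the read of `counts["missing"]` left an entry behind, while the `.get()` and `in` checks did not.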
Step 2: Choose the Right Tool for the Job
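For our use case, a `defaultdict` is the best choice. Passing `int` as the default factory gives every missing key a starting value of 0 (because `int()` returns 0), so the count for a new word can be incremented without first checking whether the word is present. The trade-off is that merely reading a missing key with `word_counts[word]` also inserts it, so use `.get()` or `in` when you only want to check for a word.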
Step 3: Code Examples
Here are some code examples that show the difference between the two approaches:
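`dict`:

    # words is assumed to be an iterable of strings, e.g. text.split()
    word_counts = {}
    for word in words:
        if word in word_counts:
            word_counts[word] += 1
        else:
            # First occurrence: the key must be created explicitly.
            word_counts[word] = 1

`defaultdict`:

    from collections import defaultdict

    # int() returns 0, so every new word starts at 0 before the increment.
    word_counts = defaultdict(int)
    for word in words:
        word_counts[word] += 1

As you can see, the `defaultdict` solution drops the membership check and the `else` branch, leaving a one-line loop body.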
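To try the `defaultdict` loop end to end, here is a small self-contained sketch. The function name `count_words`, the lowercasing, and the whitespace-based splitting are assumptions made for illustration; a real service would likely read the file line by line and apply its own tokenization rules.

```python
from collections import defaultdict


def count_words(text):
    """Count whitespace-separated words in text, ignoring case."""
    word_counts = defaultdict(int)
    for word in text.lower().split():
        word_counts[word] += 1
    # Return a plain dict so that later lookups of unknown words
    # raise KeyError instead of silently inserting zero-count entries.
    return dict(word_counts)


sample = "the cat and the hat and the bat"
counts = count_words(sample)
print(counts)  # {'the': 3, 'cat': 1, 'and': 2, 'hat': 1, 'bat': 1}
```

Converting to a plain `dict` before returning is a design choice, not a requirement: it keeps the convenient default behavior inside the counting loop while giving callers the stricter missing-key behavior.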
Practice Question
You want to create a dictionary that groups a list of words by their first letter. Which of the following would be the most elegant way to do this?