Questions
What are bitmaps and hyperloglogs and what are they used for in Redis?
The Scenario
You are a backend engineer at a social media company. You are building a new service that needs to track a large number of events, such as which users have liked a post or which users have seen an ad.
You need to find a way to store this data in a way that is efficient and that does not use a lot of memory.
The Challenge
Explain what bitmaps and hyperloglogs are in Redis and how you would use them to solve this problem. What are the key benefits of using these data structures?
A junior engineer might try to solve this problem by using a set. This would work, but it would use a lot of memory for a large number of events. They might not be aware of bitmaps and hyperloglogs, which are much more memory-efficient.
A senior engineer would know that bitmaps and hyperloglogs are the perfect tools for this job. They would be able to explain what these data structures are and how to use them to track a large number of events in a memory-efficient way.
Step 1: Understand What Bitmaps and HyperLogLogs Are
| Data Structure | Description | Use Cases |
|---|---|---|
| Bitmaps | A bitmap is a data structure that allows you to store a boolean value for each element in a set. | Tracking user activity, such as which users have liked a post or which users have seen an ad. |
| HyperLogLogs | A HyperLogLog is a probabilistic data structure that is used to estimate the cardinality of a set. | Counting the number of unique visitors to a website or the number of unique searches for a keyword. |
Step 2: Choose the Right Tool for the Job
- If you need to know whether a specific user has performed an action, you should use a bitmap.
- If you only need to know the number of unique users who have performed an action, you should use a HyperLogLog.
Step 3: Code Examples
Here are some code examples that show how to use these data structures:
Bitmaps:
# Mark user 123 as having seen the ad
SETBIT ad:1:seen 123 1
# Check if user 123 has seen the ad
GETBIT ad:1:seen 123
# Count the number of users who have seen the ad
BITCOUNT ad:1:seenHyperLogLogs:
# Add a user to the set of unique visitors
PFADD unique_visitors "user:123"
# Get the number of unique visitors
PFCOUNT unique_visitors Practice Question
You need to count the number of unique visitors to your website. Which of the following would be the most appropriate?