Questions
What is the aggregation framework and how do you use it in MongoDB?
The Scenario
You are a data analyst at an e-commerce company. You are trying to write a query to find the total sales for each category.
You could do this by fetching all the data from the database and then processing it in your application, but you know that this would be inefficient.
The Challenge
Explain what the aggregation framework is in MongoDB and how you would use it to solve this problem. What are the key stages of an aggregation pipeline?
A junior engineer might not be aware of the aggregation framework. They might try to solve this problem by fetching all the data from the database and then processing it in their application, which would be inefficient.
A senior engineer would know that the aggregation framework is the perfect tool for this job. They would be able to explain what the aggregation framework is and how to use it to write concise and efficient queries for a variety of different use cases.
Step 1: Understand What the Aggregation Framework Is
The aggregation framework is a tool for performing data analysis on a collection of documents. It works by processing the documents through a pipeline of stages.
Step 2: The Key Stages of an Aggregation Pipeline
| Stage | Description |
|---|---|
$match | Filters the documents to only include the ones that match a given criteria. |
$group | Groups the documents by a given key and performs an aggregation function on each group. |
$sort | Sorts the documents by a given key. |
$project | Reshapes the documents by adding, removing, or renaming fields. |
$limit | Limits the number of documents that are passed to the next stage. |
$skip | Skips a specified number of documents. |
Step 3: Solve the Problem
Here’s how we can use the aggregation framework to find the total sales for each category:
db.products.aggregate([
{
$group: {
_id: "$category",
total_sales: { $sum: "$sales" }
}
},
{
$sort: {
total_sales: -1
}
}
])In this example, we use the $group stage to group the products by category and to calculate the total sales for each category. We then use the $sort stage to sort the results by total sales in descending order.
Practice Question
You want to find the average price of all the products in each category. Which of the following would you use?