Questions
What are window functions and how do you use them in PostgreSQL?
The Scenario
You are a data analyst at an e-commerce company. You are trying to write a query to find the top 10 products by sales in each category.
You could do this with a series of subqueries, but you know that this would be inefficient and difficult to read.
The Challenge
Explain what window functions are in PostgreSQL and how you would use them to solve this problem. What are the key benefits of using window functions?
A junior engineer might not be aware of window functions. They might try to solve this problem with a series of subqueries, which would be inefficient and difficult to read.
A senior engineer would know that window functions are the perfect tool for this job. They would be able to explain what window functions are and how to use them to write concise and efficient queries for a variety of different use cases.
Step 1: Understand What Window Functions Are
A window function is a type of function that performs a calculation across a set of table rows that are somehow related to the current row.
Step 2: The Key Window Functions
| Function | Description |
|---|---|
ROW_NUMBER() | Assigns a unique number to each row in the partition. |
RANK() | Assigns a rank to each row in the partition, with gaps for ties. |
DENSE_RANK() | Assigns a rank to each row in the partition, without gaps for ties. |
LEAD() | Returns the value of a column from the next row in the partition. |
LAG() | Returns the value of a column from the previous row in the partition. |
Step 3: Solve the Problem
Here’s how we can use a window function to find the top 10 products by sales in each category:
SELECT product_name, category_name, sales, rank
FROM (
SELECT
product_name,
category_name,
sales,
RANK() OVER (PARTITION BY category_name ORDER BY sales DESC) as rank
FROM products
) as ranked_products
WHERE rank <= 10;In this example, we use the RANK() window function to assign a rank to each product based on its sales, within each category. We then filter the results to only include the products with a rank of 10 or less.
Practice Question
You want to find the second highest salary in each department. Which of the following would you use?