QUALIFY in SQL
Last year, while starting to use Snowflake, I discovered the QUALIFY
keyword in SQL.
QUALIFY
filters results based on a window function. It’s not part of the SQL standard and so won’t be supported in all SQL engines, but I think it has good coverage in some of the newer ones.
Let’s say we have an employees
table, courtesy of ChatGPT:
┌────┬─────────┬─────────────┬────────┐id ┆ name ┆ department ┆ salary │
│
╞════╪═════════╪═════════════╪════════╡1 ┆ Alice ┆ Engineering ┆ 70000 │
│ 2 ┆ Bob ┆ Engineering ┆ 72000 │
│ 3 ┆ Charlie ┆ HR ┆ 65000 │
│ 4 ┆ Diana ┆ HR ┆ 68000 │
│ 5 ┆ Eve ┆ Engineering ┆ 71000 │
│ └────┴─────────┴─────────────┴────────┘
We want to find the top earner for each department. With QUALIFY
, the query becomes concise:
SELECT *
FROM employees
RANK() OVER (PARTITION BY department ORDER BY salary DESC) = 1; QUALIFY
Here, RANK()
ranks employees within each department by salary, and QUALIFY
filters the results to show only the top earner in each department.
┌────┬───────┬─────────────┬────────┐id ┆ name ┆ department ┆ salary │
│
╞════╪═══════╪═════════════╪════════╡2 ┆ Bob ┆ Engineering ┆ 72000 │
│ 4 ┆ Diana ┆ HR ┆ 68000 │
│ └────┴───────┴─────────────┴────────┘
Previously when doing something like this, I would use a common table expression (CTE):
WITH ranked_employees AS (
SELECT
*,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS employee_rank
FROM employees
)
SELECT *
FROM ranked_employees
WHERE employee_rank = 1;
As a small query, I still think using a CTE is fine, but when using multiple CTEs in a larger or more complex query, the brevity of QUALIFY
is really nice.
Want to try it out? Check out the DuckDB web shell.