This blog post comes from the book Snowflake Architecture and SQL by Tom Coffing and David Cook.
Snowflake is one of the only databases with the QUALIFY statement for its analytic/window functions. QUALIFY is to analytics what HAVING is to aggregates. In other words, the brilliance behind QUALIFY is it is essentially a WHERE clause once the calculations finish.
I will show a few examples before showing the QUALIFY command, which will give you the context for its power and beauty.
In our example below, we have an ordered analytic called RANK. Notice there is nothing in the parenthesis after the word RANK, so how do you know what column we rank? These functions are called ordered analytics because they order the data before the analytic calculates. Therefore, since we ORDER BY DAILY_SALES DESC, we rank DAILY_SALES with the highest daily_sales value receiving a rank of one.
Also, notice that we have our rank amongst PRODUCT_ID 1000 and 2000, which we will change in our second example.
Our next example below adds a PARTITION BY PRODUCT_ID statement to our RANK analytic, which tells Snowflake to rank within the PRODUCT_ID. In other words, rank the product_id 1000 DAILY_SALES and then rank the product_id 2000 DAILY_SALES within their groups (partitions).
Notice that the rank calculations reset with each PRODUCT_ID break. Also, notice there is no comma between the PARTITION BY PRODUCT_ID statement and the ORDER BY DAILY_SALES DESC, and both are within the parentheses.
It is now time to introduce the QUALIFY statement. Notice in our example below we have a WHERE statement, which commands Snowflake to only calculate rows for PRODUCT_ID 1000 and 2000. A WHERE clause brings back rows based on a value (PRODUCT_ID in this case) within the existing table.
You can't use a WHERE clause to eliminate the RANK1 value because they don't exist within the table and are calculations. All WHERE clauses eliminate rows before any calculations.
The QUALIFY statement eliminates rows after the calculations finish. The QUALIFY statement is for ordered analytics and window functions, while the HAVING statement is for aggregates. Both QUALIFY and HAVING are essentially WHERE clauses after calculations happen.
Our example below uses QUALIFY to bring back the three DAILY_SALES with the highest DAILY_SALES value in their ranks within their PRODUCT_ID.
If you like how Tom Coffing presents and teaches, you will be amazed at his Nexus Server software for migrations to Snowflake. Check out the best way to automate the movement from all databases to Snowflake here.
Your company can hire Tom Coffing to teach classes on Snowflake. All classes are customized, but Tom is currently doing a 5-day Snowflake boot camp that teaches the Snowflake architecture, tricks and tips, and every SQL command. Check out the agenda here. https://www.nexusdataserver.com/snowflakeclass.
The Snowflake query browser is light years behind the 18 years of development of the Nexus Query Chameleon. And the best part is that Nexus can query all systems in your enterprise and has a Super Join Builder that builds the SQL automatically as the user's point-and-click. And the most amazing part is that Nexus can easily join Snowflake tables with Excel, Access, and tables and views from all other database platforms. Check out just some of the amazing features of Nexus here.
Comments