Snowflake Ranks at Top of Rank Function
All information from this blog comes from the Snowflake Architecture and SQL book by Tom Coffing and David Cook.
Yes, the Snowflake Architecture and SQL book is on its way to your favorite book stores, such as Skillsoft, where you can read it online, or Amazon, where you can buy it. It is another masterpiece of information from Tom Coffing and David Cook, two of the most famous authors in the history of computers. You can purchase your PDF version of the book, which is in color, has hundreds of wonderful examples, covers every single SQL command, and will be your forever guide to mastering Snowflake. The book is only $199.00 and comes with your name watermarked on each page.
This blog discusses the RANK command. In each example, you will see an ORDER BY statement, but it will not come at the end of the query. The ORDER BY keywords is always within the RANK calculation. It is the ORDER BY statement that determines what we are ranking.
In its most simple explanation, a RANK will sort the data first via the ORDER BY statement and then give the first row rank of 1. It will rank the second row with a two unless the values of rows one and two are equal. Check out the example below.
The query above uses the Nexus Query Chameleon from Coffing Data Warehousing.
In the picture above, notice that an open and close parenthesis immediately follows the keyword RANK. The open and close parenthesis insinuates a function. There is never anything inside the parenthesis, but it is required. The keyword OVER follows, representing that this analytic is an ordered analytic, which is interchangeable with the term window function. The term ordered analytic means the data set will be put in a specific order before the calculation begins. We sort the data using the ORDER BY statement, and in this example, we are sorting by the Daily_Sales column. Since the default for an ORDER BY statement is ascending, we are ranking the data by Daily_Sales ASC.
When you first see a RANK command with nothing in the parenthesis, you might have trouble at first and think, “What are we ranking?” Check out the column in the ORDER BY statement, which is what you rank. We have an ORDER BY Daily_Sales, so we are ranking by Daily_Sales.
The first two rows have a value of 32,800.50, so they both get a rank of 1. Rows one and two are tied, but notice that row three gets a rank of 3. There are no more ties after that, so each row gets the next sequential ranking.
A user often wants to give the highest value, the number one rank. Check out the example below because we will use the ORDER BY to rank the Daily_Sales column in DESC order.
The phrase below sounds like a protest chant:
What are we ranking? Daily_Sales!
How are we ranking it? Descending order!
The query above uses the Nexus Query Chameleon in blue mode from Coffing Data Warehousing.
Each RANK example will have an ORDER BY statement, but sometimes you will also have a PARTITION statement. In the example below, you see the keywords PARTITION BY, meaning the RANK function will reset and start over. Our ORDER BY statement is ordering the data by the column Daily_Sales DESC, so Daily_Sales is what we are ranking, but we will reset the rank calculation and start over with each Product_ID break because the column Product_ID is in the PARTITION BY statement. Check out the next example below.
The query above uses the Nexus Query Chameleon in dark mode from Coffing Data Warehousing.
Get ready for something you have never seen before, which is the QUALIFY statement. QUALIFY acts like a filter but differs from a WHERE clause filter. The QUALIFY statement waits until all of the analytic calculations finish. When the report finishes and is ready to return, the QUALIFY steps in and filters the rows further.
In the example below, we are attempting to find the top three Daily_Sales per Product_ID. It is the QUALIFY statement that dictates the top three Daily_Sales.
Let’s review. We rank by Daily_Sales DESC because of the ORDER BY Daily_Sales DESC statement. We reset the calculation on each Product_ID break because of the PARTITION BY Product_ID statement. And finally, after the entire ranking occurs, we further filter to get only the top three Daily_Sales per Product_ID because of the Qualify statement. Notice we have given our RANK analytic an alias name for the report, which we call RANK1. So, we QUALIFY RANK1 < 4, giving me the first three rows.
The query above uses the Nexus Query Chameleon from Coffing Data Warehousing with bookend answer set mode.
Snowflake and Teradata are the only two systems I have ever seen that use the QUALIFY statements. If you are working with a system that does not support QUALIFY, you can place your SQL statement in a derived table and then use a WHERE clause.
In the example below, all of the SQL using the colors blue and red is part of the derived table. The results are the same as the previous example, but it is done with a derived table and not a QUALIFY statement.
Get Your Analytics for Free
Snowflake cleverly only charges you per query, which can save company money. However, the best compliment for Analytics with Snowflake is Analytics on Nexus. Imagine if you need to get dozens of analytic reports on a specific data set. You might have to run different analytic queries on Snowflake to get your reports, but one of the best concepts of Nexus is the Garden of Analytics.
The Nexus saves every answer set you run during the day into the Garden of Analytics, which uses your PC's memory. Using drag and drop templates, anyone can get over 80 different analytic reports, graphs, charts, pivots, or spreadsheets, and Nexus does all of the calculations for free inside your PC. If you need 80 analytic reports, then Nexus saves you 80 queries you would have to pay Snowflake.
Check out the Nexus Garden of Analytics in this YouTube video and save lots of money.
Why Write Snowflake SQL When Nexus Writes It For You?
Nexus users can use the Super Join Builder, which shows you tables and views, connects you to which tables join, and writes the SQL automatically. If you want to see the greatest automated join builder the world has ever seen, check out this YouTube Video. And yes, the Nexus Super Join Builder joins data from Snowflake with Excel, Microsoft Access, Teradata, Oracle, SQL Server, DB2, Postgres, MySQL, Amazon Redshift, and Azure Synapse, Google BigQuery, or other Snowflake warehouses.