• tomcoffing

Advanced UNION Techniques Using Yellowbrick


This blog comes from the Yellowbrick Architecture and SQL book by Tom Coffing.


Are you ready to learn Yellowbrick SQL from the best technical trainer the world has ever seen? Tom Coffing, AKA Tera-Tom, makes learning fun, exciting, and easy. Learning SQL is one of the best moves you can make in your career because you can query every database with SQL.


Tom became famous as a teacher because he memorized every student's name when he greeted them. Tom has taught over 1,000 classes and never missed a single name. Ask anyone who attended a class taught by Tom Coffing.


Tom has also led a team of developers for almost 20 years to create the Nexus, which queries, migrates, and joins data across all systems. Amazingly, Nexus migrates between all the databases below with the click of the mouse.


• Yellowbrick

• Excel

• Microsoft Access

• Snowflake

• Redshift

• Synapse

• BigQuery

• Teradata

• Oracle

• SQL Server

• DB2

• Hadoop

• MapR

• Greenplum

• Postgres

• Vertica

• Netezza

• SAP HANA

• MySQL

• SQLite


Advanced UNION for Amazing Totals


Few people know about or have ever seen this brilliant technique using UNION, but prepare to be amazed.


Set operators such as UNION, UNION ALL, INTERCEPT, and EXCEPT have rules. You can combine multiple queries with a set operator in between, but each query must have the same number of columns and be the same data type. The top query is used for aliasing, and the bottom query for sorting using the ORDER BY statement.


Most SQL developers deal a lot with aggregation and understand that you must use a GROUP BY when involving aggregation with normal columns. However, in the query below, we combine UNION with nulls to produce three different reports in one.


In the query below, we have three different totals:

  • PRODUCT_ID

  • MTH

  • YEAR

Check out the answer set and focus on what I will tell you. First, notice that we made $862,404.35 in the year 2000. Now, focus on that we made $418,769.36 in September and $443,634.99 in October, but if you total them up, it equals $862,404.35, which is our yearly total.


Finally, focus on what we made per PRODUCT_ID and understand that if you total PRODUCT_ID 1000, 2000, and 3000, it equals $862,404.35, which is our yearly total.


Combining UNION with nulls gives you a single report with multiple ways to view the amounts. Since we have three groupings, we have three queries separated by a UNION, but the trick is to SUM(DAILY_SALES) but has only one column value per query, with the rest given a null value. Notice that PRODUCT_ID is the only non-null value in the top query, but then EXTRACT(YEAR FROM SALE_DATE) is the only non-null value in the second query, and EXTRACT(MONTH FROM SALE_DATE) is the only non-null value in the third query.



In the next query below, we use five unions to produce these totals:

  • YEAR

  • PRODUCT

  • SALES_PERSON

  • REGION

  • QUARTER

First, notice line one in the answer set, which tells us that in 2020, our sum is $25,875,689.91 for the year. Now, focus on that our salespeople sell four products: Boots, Jackets, Jeans, and T-Shirts, and what we made for each product combined in 2020 totals $25,875,689.91.


Next, focus on our four salespeople: Gary Lewis, Helen Smith, Mary Jones, and Will Davis, and what they sold in 2020. Combine their total sales, and the total is $25,875,689.91.


Next, we see what the Regions: North and South sold during 2020, totaling $25,875,689.91.


Finally, we see what we sold in 2020 per quarter, totaling $25,875,689.91.


This example is from Nexus in dark mode, where the user has chosen to put the answer set to the right of the query window (book mode).


It is never too early to teach your children about data. Do what this family does and read them a story of the greatest data scientist ever!






10 views0 comments