Snowflake Unstructured Data Part 2
This blog post comes from the Snowflake Architecture and SQL book written by Tom Coffing and David Cook.
Here is a Snowflake JSON technique to CREATE a table and INSERT data. The example below represents JSON data. Notice the key pair values of the column name VIN and the value of CYZWBRA0EG121253. Also, notice the key pair of Mileage and its value of 75899. The column Time has a value, but there are two additional columns of TPMS and EPS under the column CHASSIS with their corresponding values, which is why we have the curly brackets, which show multiple column values with a column value. For example, there are five columns under the column ELECTRICAL and two columns under the column SAFETY.
Below is an example of two techniques to query Snowflake JSON tables. We are using the column name and the $1 technique. The $1 represents column one in our table. Notice in our query result that the CAR:CHASSIS column has three dots. The three dots guide you to hover to get the information in the box below. The reason is that CHASSIS has multiple components, including EPS and TPMS.
Below is an example of a JSON query that formats the data for better readability.
Snowflake offers built-in functions and SQL extensions for traversing, flattening, and nesting semi-structured data, supporting popular formats such as JSON and Avro. In addition, automatic schema discovery and columnar storage make operations on schema-less, semi-structured data nearly as fast as over plain relational data, without any user effort.
Below, we use a WHERE clause to retrieve specific data.
You can parse an array using the FLATTEN function. FLATTEN is a table function that produces a lateral view of a VARIANT, OBJECT, or ARRAY column. FLATTEN returns a row for each object, and the LATERAL modifier joins the data with any information outside the object. For example, below, we are flattening and bringing back the names and addresses of all customers. Notice that we cast the VARIANT output to string values.
Below, we add a second FLATTEN clause to flatten the extras array within the flattened vehicle array and retrieve the “extras” purchased for each car sold.
You can extract a value from a VARIANT column using the GET_PATH function. The function is a variation of GET, used to extract a value using a pathname.
Part three of our Snowflake unstructured data will come tomorrow. Understanding unstructured data gives you a big advantage because you are witnessing the future of data. At first, it is difficult to understand, but soon you will have it down easily, and your co-workers will be amazed at your skill.
If you like how Tom Coffing presents and teaches, you will be amazed at his Nexus Server software for migrations to Snowflake. Check out the best way to automate the movement from all databases to Snowflake here.
Tom Coffing (wrestler on the left) won two Olympic Regional Trials in 1980.
Your company can hire Tom Coffing to teach classes on Snowflake. All classes are customized, but Tom is currently doing a 5-day Snowflake boot camp that teaches the Snowflake architecture, tricks and tips, and every SQL command. Check out the agenda here. https://www.nexusdataserver.com/snowflakeclass.
The Snowflake query browser is light years behind the 18 years of development of the Nexus Query Chameleon. And the best part is that Nexus can query all systems in your enterprise and has a Super Join Builder that builds the SQL automatically as the user's point-and-click. And the most amazing part is that Nexus can easily join Snowflake tables with Excel, Access, and tables and views from all other database platforms. Check out just some of the amazing features of Nexus here.