• tomcoffing

I am not the World's Greatest Data Scientist



For the record, I am not the world's greatest data scientist.


I am 63 years young, and I will celebrate 50 years of working in the computer industry in three years. People continue to tell me I am the greatest data scientist the world has ever known, but it is not true.


I only appear to be the world's greatest data scientist because I wrote the most books, taught the most classes, and used the world's greatest software, Nexus, which I invented with the help of some of the world's greatest scientists.


Most people know me because I have written 85 books on 20 database platforms. I write about the database architecture and the SQL associated with it. Each book usually explains how things work inside the database engine so users can create tables and indexes and tune the performance of their SQL. My latest book on Snowflake is one of my best.


Others know me because I trained them. I have taught over 1,000 classes to the largest companies in China, Africa, India, Malaysia, Mexico, Canada, Europe, and the United States. I have hundreds of videos on YouTube and my website.


People love my books because I make learning difficult subjects easy. For example, the first system I became an expert on was Teradata, and soon people began to call me Tera-Tom. Teradata was so difficult to learn that, once I had ten years of experience, I vowed to explain technology on all systems so everyone could appreciate and understand it.


I love to write books, do videos, and teach classes because I know how incredibly lucky I am to work in data warehousing. I didn't think I would make it because data warehouse technology was so hard for me to learn, but giving people the gift of knowledge to ensure they make it is my joy in life.


My joy in life over the past 18 years has been to help everyone working with data dominate through software. I started in 1994 with the idea of simply building a great query tool that would work on all systems, so I decided to call it Nexus, the intersection between all databases.


But soon, I realized that some of the most important people in a company who needed to analyze data to make business decisions were not SQL savvy, so we built the Nexus Super Join Builder. Users can see tables and views visually, see how they join, and merely point and click on the columns and analytics they need and let Nexus build the SQL automatically.


Then I realized that migrating data was the most difficult challenge for any savvy data scientist. Systems refuse to talk to other systems because the table structures, data types, and dates differ. In addition, each system uses a different load scripting language to import and export data.
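
To make the date problem concrete, here is a minimal sketch of normalizing date strings exported from different systems into one representation before loading them into a target. It is only an illustration, not how Nexus does it: the format list and the `normalize_date` helper are hypothetical, and real systems and session settings vary.

```python
from datetime import date, datetime

# Hypothetical export formats, tried in order; real systems differ.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def normalize_date(text: str) -> date:
    """Return the first successful parse of `text` against the known formats."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"unrecognized date format: {text!r}")


if __name__ == "__main__":
    for raw in ("2023-01-15", "01/15/2023", "20230115"):
        print(raw, "->", normalize_date(raw).isoformat())
```

Note that a slash-separated value such as 01/02/2023 is ambiguous between month-first and day-first conventions; this sketch assumes month-first, which is exactly the kind of per-system decision a migration has to make explicitly.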


So we built the Nexus migration piece so anyone and everyone could point and click on the source and target systems and move thousands of tables automatically. And to speed things up for tables with hundreds of millions to billions of rows, we added a Nexus Server to handle the big data.


But in my heart, I always knew that there would be so much data coming from so many places that, eventually, companies would need dozens, if not hundreds, of database platforms to store and analyze their data.


So my team of data scientists put together the Nexus Super Join Builder with the Nexus migration application so users could automatically join data across platforms.


Every day, we break another world record with Nexus. For example, we migrated data between 400 different systems in a single day last month.


In another example, I did a 20-table join across 20 different systems in a single query yesterday and processed the join across 22 systems, including my PC and the Nexus Server. Yes, Nexus allows the user to decide where to process the join, which we call the hub. So, for example, if you want the join to process on Snowflake, then Nexus moves the other 19 tables from their respective systems to Snowflake. If you change the hub to SQL Server, Nexus automatically changes the SQL and load scripts and moves the other 19 tables to SQL Server. And finally, if you choose to make the hub your PC, Nexus moves all 20 tables to your PC and, in the background, joins the data using your PC's memory and CPU. You can change the hub to any system, and Nexus provides graphs and charts to help you decide the best hub based on table sizes and columns needed.
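
The hub idea can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Nexus's actual logic: the `TableRef` type, the function names, and the table sizes are all hypothetical, and the only cost considered is the number of bytes that have to move.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    system: str      # platform where the table currently lives
    name: str        # table name (hypothetical)
    size_bytes: int  # estimated size of the columns the query needs


def bytes_moved(tables, hub):
    """Total bytes that must be copied to `hub` so the join can run there."""
    return sum(t.size_bytes for t in tables if t.system != hub)


def plan_moves(tables, hub):
    """List (table, source, target) for every table not already on the hub."""
    return [(t.name, t.system, hub) for t in tables if t.system != hub]


def cheapest_hub(tables, candidates=None):
    """Pick the candidate hub that requires the least data movement."""
    if candidates is None:
        candidates = sorted({t.system for t in tables})
    return min(candidates, key=lambda hub: bytes_moved(tables, hub))


# Made-up sizes for illustration only.
tables = [
    TableRef("Snowflake", "orders", 5_000_000_000),
    TableRef("SQL Server", "customers", 200_000_000),
    TableRef("Teradata", "products", 50_000_000),
]

if __name__ == "__main__":
    hub = cheapest_hub(tables)
    print("hub:", hub, "bytes moved:", bytes_moved(tables, hub))
    print("moves if the hub is the PC:", plan_moves(tables, "PC"))
```

With these made-up sizes, the largest table stays where it is and the two smaller tables move to it. Choosing a hub that holds none of the tables, such as the PC, means every table has to move, which matches the PC case described above.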


My team has spent 18 years building Nexus so that everyone can do everything in about 18 seconds.


The systems included in our migration and federated query record include the following:


  • Yellowbrick

  • Teradata

  • Oracle

  • SQL Server

  • DB2

  • Greenplum

  • Netezza

  • MySQL

  • Postgres

  • SAP HANA

  • Amazon Redshift

  • Snowflake

  • Azure Synapse

  • Athena

  • Microsoft Access

  • Excel

  • SQLite

  • Hadoop

  • Google BigQuery

  • Vertica


Here are two videos you will want to see. The first shows a 20-table join across 20 systems, and the second shows a migration of 19 systems to Snowflake.


I am not the world's greatest data scientist. However, Nexus is the world's greatest software.


Thank you to the thousands of customers using Nexus for all their help, feedback, and ideas.


Here is the video of a 20-table join across 20 different systems in a single query built by Nexus.



Here is the video of 20 different systems you can migrate to Snowflake. Some customers are migrating thousands of tables to Snowflake in a single job. You can also migrate 20 systems to Vertica, Yellowbrick, Postgres, BigQuery, etc.








