How to Perform Union In Snowflake (Disparate Tables with Diverse Structures)
- Ndz Anthony
- May 17, 2023
There may be times when you need to merge data from different tables with diverse structures or schemas. For instance, two tables might store customer information, but one has more columns than the other, or their columns have varying names or data types. How can you union in snowflake tables to obtain a consistent outcome?
This article will dive into some super data transformation techniques to help you union tables with unique structures. And if that wasn’t cool enough, we’re also going to introduce you to Datameer — a super powerful data transformation tool that makes unioning disparate tables a walk in the park. We’ll be using snowflake as our database platform, but these concepts and syntax apply to other SQL dialects too.
What is a Union?
A union in SQL combines rows from two or more tables or queries into a single result set, eliminating duplicate rows. The syntax for a union is:
SELECT column_list FROM table1 UNION
SELECT column_list FROM table2
Both tables or queries must have the same number and order of columns in their column_list, and the corresponding columns’ data types must be compatible or convertible.
Imagine two tables storing customer information: customers and clients. The customer’s table has five columns: id, name, email, phone, and country. The clients table has five columns: id, name, address, country, and customer_id.
We can union these tables on Country with the following query:
SELECT country FROM customers
The result set will look something like this:
Pretty basic stuff yeah?
How to Union Snowflake Tables (Disparate Tables with Diverse Structures)?
The previous example illustrated how to union two tables with some common columns and some different columns. But what if the tables have entirely different structures or schemas?
For example, what if we have another table called orders that stores order information for customers and clients? The orders table has five columns: order_id, customer_id, product_id, quantity, and price. How can we union this table with the customers and clients tables?
—Complex Union Puzzle time 🤖—
Problem Statement : Suppose you have three tables in a database — “customers,” “clients,” and “orders” — each with different column structures. The “customers” table has columns “id”, “name”, “email”, and “phone”. The “clients” table has columns “id”, “name”, and “address”. The “orders” table has columns “order_id”, “customer_id”, “product_id”, “quantity”, and “price”.
Using SQL, how would you combine the data from these tables into a single result set that includes all the columns in a uniform structure? Additionally, how would you use common table expressions (CTEs) and union queries in the SQL query to achieve this?
— High-level Approach — 💨
One approach is to use a common table expression (CTE) to create a temporary table with a consistent structure for all tables we want to union. A CTE is a named subquery that can be referenced within another query. The syntax for a CTE is:
WITH cte_name (column_list) AS ( subquery)
SELECT * FROM cte_name;
The column_list specifies the names and order of the columns for the CTE, while the subquery defines the data source for the CTE.
Let’s create a temporary table called customer_info with four columns: id, name, info_type, and info_value using a CTE. The info_type column indicates the kind of information stored in the info_value column (e.g., ’email’, ‘phone’, ‘address’, or ‘order’).
The info_value column contains the actual value of the information.
We can populate this temporary table with data from the customers, clients, and orders tables using union queries.
WITH customer_info (id, name, info_type, info_value) AS (
— Select email and phone information from customers table
SELECT id, name, ‘email’ as info_type, email as info_value FROM customers
SELECT id, name, ‘phone’ as info_type, phone as info_value FROM customers
— Select address information from clients table
SELECT id, name, ‘address’ as info_type, address as info_value FROM clients
— Select order information from orders table
SELECT c.id, c.name, ‘order’ as info_type, CONCAT(o.order_id, ‘:’, o.product_id, ‘:’, o.quantity, ‘:’, o.price) as info_value FROM orders o JOIN (SELECT id, name FROM customers UNION SELECT id, name FROM clients) c ON o.customer_id = c.id
— Select all data from customer_info table
SELECT * FROM customer_info;
Notice that we have to use a subquery to join the orders table with the union of the customers and clients tables. This is because the orders table has a foreign key to both tables, and we need to get the name of the customer or client for each order.
We also have to use the CONCAT function to combine the order information into a single string value.
Using a CTE and union queries, we can create a temporary table that has a uniform structure for all the tables we want to union.
This way, we can easily query or manipulate the data in the temporary table without worrying about the differences in the original tables.
The Easiest Way to Union Disparate Tables with Diverse Structures
Okay, so you’ve got some SQL tricks up your sleeve, but what if you want an even simpler way to union tables with diverse structures? That’s where Datameer shines. Datameer is a user-friendly, code-free platform that helps you transform data in Snowflake. It’s perfect for handling situations like this one.
Let’s break down how Datameer can make the process of unioning disparate tables with diverse structures a breeze:
- Hook up Datameer to your Snowflake instance and import those tables you want to union.
- Use the snazzy visual interface to map columns between the tables, taking care of any differences in column names, data types, and structures.
- Decide how you want to handle missing or additional columns, using Datameer’s awesome data transformation functions.
- Check out the results to make sure the union is spot-on and tweak anything that needs adjusting.
- Send the unioned data back to Snowflake or use Datameer’s built-in visualization tools to analyze the results.
Watch and learn more about unioning in our Datameer demo.
Datameer’s easy-to-use interface and powerful data transformation features let you union disparate tables with diverse structures in no time, without the headache of writing complex SQL queries.