Because data formats and volumes keep growing, it is hard, if not almost impossible, to think of a data-driven use case where data comes from a single source. Blending data from multiple sources is necessary to enhance the meaning and value that the data provides to the enterprise.
In this tutorial, we will show you how to use Spotlight’s semantic layer capabilities to blend and enrich data from multiple sources, whether in the cloud, on-premises, or even in local files. You are more than welcome to follow along with this tutorial using the Spotlight virtual lab. We also created a video overview if you would rather sit back and watch.
Previewing the Datasets
While in Spotlight, you can work with any of the multiple sources supported by Datameer. In this tutorial, we will focus on Product data that we have stored in Amazon S3 and Customer data that we have stored in Snowflake.
Creating a Workspace
First, we need to create a workspace that we will use to model the data. Click on + Add New and then select Workspace.
Give the workspace a name and then click OK.
The newly created workspace will show up in the list of available assets. Click on it to access it.
Click on the plus sign to start adding datasets to it.
From the Customer connection, select the Marketo dataset and then click on Add to Workspace.
Next, click on + Add… in the top left corner of the workspace and then select Data.
Then, from the Product Dataset 2021 connection, select the Sales opportunities dataset and click on Add to Workspace.
Now with both datasets on the Workspace, we will click on Open Workbench.
Modeling the Dataset
Spotlight’s Workbench allows you to view data from the references and datasets in your Workspace, then use that data to create new datasets. New datasets can be edited with operations, used elsewhere in Spotlight, or opened in external tools like Tableau or Jupyter for further analysis.
Once we are in the Workbench, we are going to create a new dataset and then use it to blend both datasets (Marketo and Sales opportunities).
Click on the plus sign on the selected dataset.
Now we need to add an operation. Click Add Operation.
Then select Blend.
From the blending settings, select one of the data sources in the Workbench, and then select the blend mode that fits your business needs.
Click on Use Suggested Columns.
Then create the blend. The resulting blend should look like this:
You can repeat this process as many times as needed with other data assets in your Spotlight environment.
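For readers who think in code, the blend step can be pictured as a join between two tables. The sketch below is only an illustration in plain Python, not Spotlight's implementation: the sample rows, the column names (`email`, `campaign`, `product`, `amount`), the helper names `suggest_columns` and `blend`, and the two modes `"inner"` and `"left"` are all assumptions made for the example. Spotlight's actual blend modes and column suggestions may behave differently.

```python
# Hypothetical sample rows; the column names are assumptions for illustration,
# not the real schema of the Marketo or Sales opportunities datasets.
marketo = [
    {"email": "ana@example.com", "campaign": "Spring"},
    {"email": "bo@example.com", "campaign": "Summer"},
    {"email": "cy@example.com", "campaign": "Fall"},
]
sales_opportunities = [
    {"email": "ana@example.com", "product": "Widget", "amount": 1200},
    {"email": "bo@example.com", "product": "Gadget", "amount": 800},
]


def suggest_columns(left, right):
    """Return column names present in both datasets as candidate blend keys."""
    left_cols = set().union(*(row.keys() for row in left))
    right_cols = set().union(*(row.keys() for row in right))
    return sorted(left_cols & right_cols)


def blend(left, right, keys, mode="inner"):
    """Join two lists of dict rows on the given key columns.

    mode="inner" keeps only rows with a match in both datasets.
    mode="left" keeps every left row and fills unmatched right columns with None.
    """
    if mode not in ("inner", "left"):
        raise ValueError(f"unsupported mode: {mode}")

    # Index the right dataset by its key values for fast lookup.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    # Columns that exist only on the right side (used to pad unmatched rows).
    right_only = [c for c in (right[0].keys() if right else []) if c not in keys]

    result = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys), [])
        if matches:
            for match in matches:
                result.append({**row, **match})
        elif mode == "left":
            result.append({**row, **{c: None for c in right_only}})
    return result


keys = suggest_columns(marketo, sales_opportunities)
inner_blend = blend(marketo, sales_opportunities, keys, mode="inner")
left_blend = blend(marketo, sales_opportunities, keys, mode="left")

for row in left_blend:
    print(row)
```

In this sketch, the inner mode drops the Marketo contact that has no sales opportunity, while the left mode keeps it with empty product and amount values. That difference is the kind of trade-off to weigh when choosing a blend mode for your business needs.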
Visualizing the Data
At this point, you are ready to visualize the dataset using any of the Spotlight-supported BI or data science tools. To do so, select any of the tools available in the Workbench, create the connection between Spotlight and your tool, and start visualizing the dataset you just created.
The steps that we took in this tutorial allowed us to connect and blend data from Snowflake and Amazon S3. You can use the same process to work with any data regardless of its location. Using the multiple operations available in Spotlight, you can model your data to fit the business needs of your use case.
To learn more about the modeling operations included in Spotlight, please check out our documentation. As always, we look forward to your feedback. Please get in touch if you have any questions, comments, or other ideas.