How can I query my staged data in Snowflake?

  • How-Tos FAQs
  • October 26, 2022
snowflake faq

In this article, we are going to look at how to query staged data in Snowflake and what ways we can use Snowflake to tackle this as well as answer the following:

  • How can I view the data in my stage?
  • How do I query the data in my stage without first copying it into my target table?
  • I want to view the data in my stage before loading; how can I achieve this?

If you are a Snowflake user, you have most likely experienced working with Snowflake stages.

A stage serves as a holding area or locator for your source data.

The data file(s) moves from Source data → External /Internal Stage → Target Table 

A familiar hiccup users encounter is the inability to view the contents of the Snowflake stage before loading

To answer this, let’s provide some context.

But first – Datameer: Your Snowflake SQL Visualization Tool

Have you ever written a lengthy SQL script on your worksheets?

Or

Had to create multiple worksheets to manage your various SQL transformations.

Or

Or timetravelled through different query profiles to find out who did what, when, and how?

Sure you have.

Datameer , the world’s first multi-persona, Snowflake-integrated SQL visualization tool, is here to help.

Datameer, in collaboration with Snowflake, is an analytics stack built for Snowflake.

Powered by Snowflake, Datameer helps transform your Snowflake data quickly and efficiently. Datameer speeds up your modeling process, helps deliver more analytics, and increases your Snowflake ROI.

Here are some salient features of Datameer for Snowflake:

  •  A SQL data visualization tool for Snowflake : Datameer is a SQL data visualization tool. With Datameer, drag-and-drop features can simulate SQL modeling. 

As a technical data analyst, you don’t have to be bogged down about always having to recall what type of join, function, or SQL syntax to use. Datameer handles that for you.

  • Catalog-like Snowflake Data Documentation : Datameer automatically documents system-level metadata and properties, allowing teams to discover and share knowledge about their Snowflake data models easily.
  • Tight-Knit Integration With Snowflake: Datameer offers a way to build models directly into Snowflake without worrying about data reconciliation or loss. 

Due to Datameer’s tight integration with Snowflake, all transformations and modeling leverage the Snowflake engine and scalability, resulting in high data security and easy governance.

  • Ad-hoc analysis, easy scheduling, and sharing of Snowflake reports :  Datameer provides top-notch collaboration features that help to increase productivity and avoid coding rework within data teams.

With Datameer, one can get from raw Snowflake data to shareable insights within minutes.

That’s just a few tips from the world of snowflaking with Datameer.

Sounds too good be good to be true?

Plug in Datameer and make the most of your Snowflake modeling experience today!

What are Snowflake Stages?

Simply put, a Snowflake stage can be defined as an intermediary space for uploading/unloading source files.

The idea is to use the COPY command to load data from this transitional layer (stage) into a target (table).

Snowflake stages are of two types; Internal (table, named & user) stages AND external stages.

In this example, we will focus on external stages.

What is Snowflake Metadata – Why is it important?

Before we discuss more about Snowflake metadata, let’s revisit the concept of metadata in data warehousing.

The generally accepted definition of Metadata is “data about data.” Some liken metadata to an index of a book.

Metadata details the following:

  • Location, structure, and description of created warehouse objects
  • Information about the integration and transformation rules used to populate the data
  • Context around certain assets in your data warehouse

Querying Staged Files Using Metadata In Snowflake

In Snowflake, metadata is automatically generated for files in internal (Snowflake) or external (Amazon S3, Google Cloud Storage, or Microsoft Azure) stages.

This metadata is stored in virtual columns and can be queried using standard SQL.

By referencing metadata columns in a staged file, your staged file/query can return additional information, such as filename, row numbers, etc., about the file.

How To Query Staged Files Using Metadata In Snowflake

Querying Metadata From An External Stage

To show how this is done, we will cite a simple example.

Step 1 :- Examine the CSV to be loaded into our stage – Create a file format and a stage.

Step 2 :- Load Into our external stage

Step 3 :- Run a list command.

Step 4 :- — Query the filename, row number metadata columns, and other regular data columns in the staged file.

Querying Metadata From An Internal Stage

Step 1 :- Examine the CSV to be loaded into our stage – Create a file format and a stage.

Step 2 :- Load Into our internal stage using the PUT command and specified file format.

Step 3 :- Run a list command.

Step 4 :- — Query the filename, row number metadata columns, and other regular data columns in the staged file.

 

 Additional Considerations ⚠

  • Staged file data querying is primarily used to preview and inspect the contents of your file before loading or unloading.
  •  “ Stage querying” does not offer any write or aggregation capabilities; only simple read SQL statements and a few on-the-fly functions such as Ascii are supported.
  • Snowflake stores Metadata in the METADATA$FILENAME and METADATA$FILE_ROW_NUMBER columns.
    • METADATA$FILENAME: Name of the staged data file the current row belongs to, including the path to the data file in the stage.
    • METADATA$FILE_ROW_NUMBER: Row number for each record in the container staged data file.
  • Metadata columns can only be queried by name; as such, they are not included in the output of any of the following statements:
    • SELECT *
    • SHOW <objects>
    • DESCRIBE <object>
    • Queries on INFORMATION_SCHEMA views

Related Posts

Top 5 Snowflake tools for Analysts- talend

Top 5 Snowflake Tools for Analysts

  • Ndz Anthony
  • February 26, 2024